cdm_reader_mapper.DataBundle

class cdm_reader_mapper.DataBundle(*args, **kwargs)

Class for manipulating the MDF data and mapping it to the CDM.

Parameters:

data (pd.DataFrame or Iterable[pd.DataFrame], optional) – MDF DataFrame.
columns (pd.Index, pd.MultiIndex or list, optional) – Column labels of data.
dtypes (pd.Series or dict, optional) – Data types of data.
parse_dates (list or bool, optional) – Information on how to parse dates in data.
mask (pd.DataFrame, optional) – MDF validation mask.
imodel (str, optional) – Name of the MDF/CDM data model.
mode (str) – Data mode (“data” or “tables”). Default: “data”.

Examples

Getting a DataBundle while reading data from disk.

>>> from cdm_reader_mapper import read_mdf
>>> db = read_mdf(source="file_on_disk", imodel="custom_model_name")

Constructing a DataBundle from already read MDF data.

>>> from cdm_reader_mapper import DataBundle, read_mdf
>>> read = read_mdf(source="file_on_disk", imodel="custom_model_name")
>>> data_ = read.data
>>> mask_ = read.mask
>>> db = DataBundle(data=data_, mask=mask_)

Constructing a DataBundle from already read CDM data.

>>> from cdm_reader_mapper import DataBundle, read_tables
>>> tables = read_tables("path_to_files").data
>>> db = DataBundle(data=tables, mode="tables")

__init__(*args, **kwargs)

Methods

`__init__`(*args, **kwargs)
`add`(addition[, inplace])	Adding information to a `DataBundle`.
`copy`()	Make a deep copy of a `DataBundle`.
`correct_datetime`([imodel, inplace])	Correct datetime information in `data`.
`correct_pt`([imodel, inplace])	Correct platform type information in `data`.
`duplicate_check`([inplace])	Duplicate check in `data`.
`flag_duplicates`([inplace])	Flag detected duplicates in `data`.
`get_duplicates`(**kwargs)	Get duplicate matches in `data`.
`map_model`([imodel, inplace])	Map `data` to the Common Data Model.
`remove_duplicates`([inplace])	Remove detected duplicates in `data`.
`replace_columns`(df_corr[, subset, inplace])	Replace columns in `data`.
`select_where_all_false`([inplace, do_mask])	Select rows from `data` where all column entries in `mask` are False.
`select_where_all_true`([inplace, do_mask])	Select rows from `data` where all column entries in `mask` are True.
`select_where_entry_isin`(selection[, ...])	Select rows from `data` where column entries are in a specific value list.
`select_where_index_isin`(index[, inplace, ...])	Select rows from `data` where indexes are within a specific index list.
`split_by_boolean_false`([do_mask])	Split `data` by rows where all column entries in `mask` are False.
`split_by_boolean_true`([do_mask])	Split `data` by rows where all column entries in `mask` are True.
`split_by_column_entries`(selection[, do_mask])	Split `data` by rows where column entries are in a specific value list.
`split_by_index`(index[, do_mask])	Split `data` by rows within a specific index list.
`stack_h`(other[, datasets, inplace])	Stack multiple `DataBundle` objects horizontally.
`stack_v`(other[, datasets, inplace])	Stack multiple `DataBundle` objects vertically.
`unique`(**kwargs)	Get unique values of `data`.
`validate_datetime`([imodel])	Validate datetime information in `data`.
`validate_id`([imodel])	Validate station id information in `data`.
`write`([dtypes, parse_dates, encoding, mode])	Write `data` to disk.
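
The mask-based selection and split methods listed above all rely on a row-wise test of `mask`. The following is a minimal sketch of that idea in plain Python, not the library's implementation: rows are modelled as dictionaries instead of pandas.DataFrame rows, and the helper names `rows_where_all` and `split_by_boolean` are hypothetical. It also assumes that a split yields the selected rows together with the remainder.

```python
def rows_where_all(data, mask, value):
    """Return the rows of `data` whose mask row contains only `value`."""
    return [
        row
        for row, flags in zip(data, mask)
        if all(flag is value for flag in flags.values())
    ]


def split_by_boolean(data, mask, value):
    """Split `data` into (selected rows, remaining rows) using `mask`."""
    selected, remainder = [], []
    for row, flags in zip(data, mask):
        matches = all(flag is value for flag in flags.values())
        (selected if matches else remainder).append(row)
    return selected, remainder


# Hypothetical illustration data: one dict per row, mask aligned row by row.
data = [
    {"id": "A1", "temp": 12.5},
    {"id": "B2", "temp": 99.9},
    {"id": "C3", "temp": 14.0},
]
mask = [
    {"id": True, "temp": True},    # every entry passed validation
    {"id": True, "temp": False},   # mixed result
    {"id": False, "temp": False},  # every entry failed validation
]

all_true = rows_where_all(data, mask, True)    # analogue of select_where_all_true
all_false = rows_where_all(data, mask, False)  # analogue of select_where_all_false
valid, remainder = split_by_boolean(data, mask, True)  # analogue of split_by_boolean_true
```

Note that a row with mixed mask entries (the second row here) is selected by neither the all-true nor the all-false test; in this sketch it only appears in the remainder of a split.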

Attributes

`columns`	Column labels of `data`.
`data`	MDF pandas.DataFrame data.
`dtypes`	Dictionary of data types on `data`.
`encoding`	A string representing the encoding to use in the `data`.
`imodel`	Name of the MDF/CDM input model.
`mask`	MDF pandas.DataFrame validation mask.
`mode`	Data mode.
`parse_dates`	Information on how to parse dates in `data`.
