cdm_reader_mapper.core package

Common Data Model core package.

Submodules

cdm_reader_mapper.core._utilities module

Common Data Model (CDM) DataBundle class.

class cdm_reader_mapper.core._utilities.SubscriptableMethod(func)[source]

Bases: object

Allows both method calls and subscript access.

Parameters:

func (Any) – Underlying callable or subscriptable object.
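
As a rough illustration of what "both method calls and subscript access" can mean, the following sketch wraps a callable so that obj(...) and obj[...] both reach it. The class SubscriptableSketch and the helper select are hypothetical and are not taken from the package; the real SubscriptableMethod may treat a subscriptable func differently.

```python
class SubscriptableSketch:
    """Hypothetical stand-in: forwards call and subscript syntax to ``func``."""

    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        # obj(...) forwards the call unchanged.
        return self.func(*args, **kwargs)

    def __getitem__(self, item):
        # obj[...] passes the subscript as a single positional argument
        # (an assumption made for this sketch).
        return self.func(item)


def select(key):
    # Toy accessor used only for this illustration.
    return {"a": 1, "b": 2}[key]


wrapped = SubscriptableSketch(select)
by_call = wrapped("a")
by_subscript = wrapped["b"]
```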

cdm_reader_mapper.core._utilities.combine_attribute_values(first_value, iterator, attr)[source]

Collect values of an attribute across all chunks and combine them.

Parameters:
  • first_value (Any) – The value from the first chunk (already read).

  • iterator (Iterator/ParquetStreamReader) – The stream positioned at the second chunk.

  • attr (str) – The attribute name to fetch from remaining chunks.

Return type:

Any

Returns:

Any – Combined attribute values of iterator.

cdm_reader_mapper.core._utilities.method(attr_func, *args, **kwargs)[source]

Handle both method calls and subscriptable attributes.

Parameters:
  • attr_func (Any) – A callable object (e.g., function or method) or a subscriptable object (e.g., list, tuple, dict, or array-like).

  • *args (Any) – Positional arguments passed to attr_func, or used as the index/key when attr_func is subscriptable.

  • **kwargs (Any) – Keyword arguments passed to attr_func. Ignored if attr_func is not callable.

Return type:

Any

Returns:

Any – The result of calling attr_func(*args, **kwargs) if it is callable, or the result of attr_func[args] if it is subscriptable.

Raises:

ValueError – If attr_func is neither callable nor subscriptable, or if indexing fails due to an invalid key or index.
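
The dispatch described above can be sketched in plain Python. The function name dispatch is hypothetical, and unwrapping a single positional argument into a scalar key is an assumption of this sketch; the text above only states that attr_func[args] is evaluated.

```python
def dispatch(attr_func, *args, **kwargs):
    """Call ``attr_func`` if it is callable, otherwise index into it."""
    if callable(attr_func):
        return attr_func(*args, **kwargs)
    if not hasattr(attr_func, "__getitem__"):
        raise ValueError("attr_func is neither callable nor subscriptable")
    # Assumption: one argument is used as a scalar key, several as a tuple key.
    key = args[0] if len(args) == 1 else args
    try:
        return attr_func[key]
    except (KeyError, IndexError, TypeError) as err:
        raise ValueError(f"invalid key or index: {key!r}") from err


called = dispatch(len, [1, 2, 3])        # callable branch
indexed = dispatch([10, 20, 30], 1)      # subscript branch, list index
looked_up = dispatch({"a": 1}, "a")      # subscript branch, dict key
```

Keyword arguments only reach the callable branch, which matches the statement that they are ignored for non-callable objects.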

cdm_reader_mapper.core._utilities.reader_method(data, attr, *args, process_kwargs=None, **kwargs)[source]

Handle operations on chunked data (ParquetStreamReader).

Uses process_disk_backed to stream processing without loading into RAM.

Parameters:
  • data (pd.DataFrame or ParquetStreamReader) – Input data to operate on.

  • attr (str) – Name of the attribute or method to apply.

  • *args (Any) – Positional arguments passed to the attribute or method.

  • process_kwargs (dict, optional) – Additional keyword arguments passed to the streaming processor.

  • **kwargs (Any) – Keyword arguments passed to the attribute or method. Supports inplace to update db instead of returning a result.

Return type:

ParquetStreamReader | None

Returns:

ParquetStreamReader or None – A new stream with the applied operation.
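
The idea of applying an operation chunk by chunk, without holding the whole dataset in memory, can be sketched with a generator. This is only a conceptual illustration: apply_per_chunk is a hypothetical name, plain strings stand in for DataFrame chunks, and the real function relies on process_disk_backed and Parquet files rather than on a Python generator.

```python
def apply_per_chunk(chunks, attr, *args, **kwargs):
    """Lazily apply the method named ``attr`` to every chunk."""
    for chunk in chunks:
        # Only the current chunk is processed; nothing is accumulated here.
        yield getattr(chunk, attr)(*args, **kwargs)


chunks = iter(["ab ", " cd", "ef"])  # stand-ins for DataFrame chunks
stream = apply_per_chunk(chunks, "strip")
stripped = list(stream)  # consuming the generator triggers the work
```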

cdm_reader_mapper.core.databundle module

Common Data Model (CDM) DataBundle class.

class cdm_reader_mapper.core.databundle.DataBundle(data=None, columns=None, dtypes=None, parse_dates=None, encoding=None, mask=None, imodel=None, mode='data')[source]

Bases: object

Container for tabular data and associated metadata.

This class wraps either an in-memory pd.DataFrame or a ParquetStreamReader for chunked, disk-backed processing. It provides a unified interface for accessing DataFrame-like attributes and methods, transparently handling streaming data where required.

Parameters:
  • data (pandas.DataFrame or Iterable[pandas.DataFrame] or ParquetStreamReader, optional) – Input data. If an iterable is provided, it is converted into a ParquetStreamReader for streaming.

  • columns (pandas.Index or pandas.MultiIndex or list, optional) – Column labels used when initializing empty data.

  • dtypes (pandas.Series or dict, optional) – Data types for columns.

  • parse_dates (list or bool, optional) – Instructions for parsing dates.

  • encoding (str, optional) – Encoding associated with the data.

  • mask (pandas.DataFrame or Iterable[pandas.DataFrame] or ParquetStreamReader, optional) – Boolean mask aligned with data. If not provided, an empty mask is created.

  • imodel (str, optional) – Name of the input data model.

  • mode ({"data", "tables"}, default "data") – Data representation mode.

Examples

Getting a DataBundle while reading data from disk.

>>> from cdm_reader_mapper import read_mdf
>>> db = read_mdf(source="file_on_disk", imodel="custom_model_name")

Constructing a DataBundle from already read MDF data.

>>> from cdm_reader_mapper import DataBundle
>>> read = read_mdf(source="file_on_disk", imodel="custom_model_name")
>>> data_ = read.data
>>> mask_ = read.mask
>>> db = DataBundle(data=data_, mask=mask_)

Constructing a DataBundle from already read CDM data.

>>> from cdm_reader_mapper import read_tables
>>> tables = read_tables("path_to_files").data
>>> db = DataBundle(data=tables, mode="tables")
add(addition, inplace=False)[source]

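Add information to a DataBundle.

Parameters:
  • addition (dict) – Mapping of attribute names to the values to add, for example {"data": tables}.

  • inplace (bool, default: False) – If True update the DataBundle in place else return a copy of DataBundle with the added information.
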
Return type:

DataBundle | None

Returns:

DataBundle or None – DataBundle with added information or None if “inplace=True”.

Examples

>>> tables = read_tables("path_to_files")
>>> db = db.add({"data": tables})
property columns: pandas.core.indexes.base.Index | pandas.core.indexes.multi.MultiIndex

Column labels of data.

Returns:

pd.Index or pd.MultiIndex – Column labels of the underlying MDF data.

copy()[source]

Make a deep copy of a DataBundle.

Return type:

DataBundle

Returns:

DataBundle – Copy of a DataBundle.

Examples

>>> db2 = db.copy()
correct_datetime(imodel=None, inplace=False, **kwargs)[source]

Correct datetime information in data.

Parameters:
  • imodel (str, optional) – Name of the MDF/CDM data model.

  • inplace (bool, default: False) – If True overwrite data in DataBundle else return a copy of DataBundle with datetime-corrected values in data.

  • **kwargs (Any) – Additional keyword-arguments for correcting datetime.

Return type:

DataBundle | None

Returns:

DataBundle or None – DataBundle with corrected datetime information or None if “inplace=True”.

See also

DataBundle.correct_pt

Correct platform type information in data.

DataBundle.validate_datetime

Validate datetime information in data.

DataBundle.validate_id

Validate station id information in data.

Notes

For more information see correct_datetime()

Examples

>>> df_dt = db.correct_datetime()
correct_pt(imodel=None, inplace=False, **kwargs)[source]

Correct platform type information in data.

Parameters:
  • imodel (str, optional) – Name of the MDF/CDM data model.

  • inplace (bool, default: False) – If True overwrite data in DataBundle else return a copy of DataBundle with platform-corrected values in data.

  • **kwargs (Any) – Additional keyword-arguments for correcting platform type.

Return type:

DataBundle | None

Returns:

DataBundle or None – DataBundle with corrected platform type information or None if “inplace=True”.

See also

DataBundle.correct_datetime

Correct datetime information in data.

DataBundle.validate_id

Validate station id information in data.

DataBundle.validate_datetime

Validate datetime information in data.

Notes

For more information see correct_pt()

Examples

>>> df_pt = db.correct_pt()
property data: pandas.core.frame.DataFrame | ParquetStreamReader

Underlying MDF data.

Returns:

pd.DataFrame or ParquetStreamReader – Underlying MDF data.

property dtypes: pandas.core.series.Series | dict[str, Any] | None

Dictionary of data types on data.

Returns:

pd.Series or dict or None – Data types of underlying MDF data.

duplicate_check(inplace=False, **kwargs)[source]

Duplicate check in data.

Parameters:
  • inplace (bool, default: False) – If True update the DataBundle in place else return a copy of DataBundle containing the new DupDetect class.

  • **kwargs (Any) – Additional keyword-arguments for duplicate check.

Return type:

DataBundle | None

Returns:

DataBundle or None – DataBundle containing new DupDetect class for further duplicate check methods or None if “inplace=True”.

See also

DataBundle.get_duplicates

Get duplicate matches in data.

DataBundle.flag_duplicates

Flag detected duplicates in data.

DataBundle.remove_duplicates

Remove detected duplicates in data.

Notes

The following columns have to be provided:

  • longitude

  • latitude

  • primary_station_id

  • report_timestamp

  • station_course

  • station_speed

This adds a new class DupDetect to DataBundle. This class is necessary for further duplicate check methods.

For more information see duplicate_check()

Examples

>>> db.duplicate_check()
property encoding: str | None

A string representing the encoding to use in the data.

Returns:

str or None – String representing the encoding to use in the underlying MDF data.

See also

pd.DataFrame.to_csv()

Write data with encoding to CSV file.

flag_duplicates(inplace=False, **kwargs)[source]

Flag detected duplicates in data.

Parameters:
  • inplace (bool, default: False) – If True overwrite data in DataBundle else return a copy of DataBundle with data containing flagged duplicates.

  • **kwargs (Any) – Additional keyword-arguments for flagging duplicates.

Return type:

DataBundle | None

Returns:

DataBundle or None – DataBundle containing duplicate flags in data or None if “inplace=True”.

Raises:

RuntimeError – Before flagging duplicates, a duplicate check has to be done with DataBundle.duplicate_check().

See also

DataBundle.remove_duplicates

Remove detected duplicates in data.

DataBundle.get_duplicates

Get duplicate matches in data.

DataBundle.duplicate_check

Duplicate check in data.

Notes

For more information see DupDetect.flag_duplicates()

Examples

Flag duplicates without overwriting data.

>>> flagged_tables = db.flag_duplicates()

Flag duplicates with overwriting data.

>>> db.flag_duplicates(inplace=True)
>>> flagged_tables = db.data
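
The required call order (duplicate_check() first, then flag_duplicates()) can be illustrated with a toy class. Everything here is hypothetical: DuplicateWorkflowSketch is not part of the package, and exact tuple equality stands in for the real duplicate detection, which compares the columns listed under duplicate_check().

```python
class DuplicateWorkflowSketch:
    """Hypothetical illustration of the required call order."""

    def __init__(self, rows):
        self.rows = rows
        self._detected = None  # filled by duplicate_check()

    def duplicate_check(self):
        # Toy detection: a row is a duplicate if an equal row came earlier.
        seen, flags = set(), []
        for row in self.rows:
            flags.append(row in seen)
            seen.add(row)
        self._detected = flags

    def flag_duplicates(self):
        if self._detected is None:
            raise RuntimeError("Run duplicate_check() before flag_duplicates().")
        return list(zip(self.rows, self._detected))


wf = DuplicateWorkflowSketch([("A", 1), ("B", 2), ("A", 1)])
try:
    wf.flag_duplicates()
    raised = False
except RuntimeError:
    raised = True  # no check has been run yet

wf.duplicate_check()
flagged = wf.flag_duplicates()
```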
get_duplicates(**kwargs)[source]

Get duplicate matches in data.

Parameters:

**kwargs (Any) – Additional keyword-arguments used for getting duplicates.

Return type:

DataFrame

Returns:

pd.DataFrame – DataFrame containing duplicate matches.

Raises:

RuntimeError – Before getting duplicates, a duplicate check has to be done with DataBundle.duplicate_check().

See also

DataBundle.remove_duplicates

Remove detected duplicates in data.

DataBundle.flag_duplicates

Flag detected duplicates in data.

DataBundle.duplicate_check

Duplicate check in data.

Notes

For more information see DupDetect.get_duplicates()

Examples

>>> matches = db.get_duplicates()
property imodel: str | None

Name of the MDF/CDM input model.

Returns:

str or None – Name of the MDF/CDM input model if available.

map_model(imodel=None, inplace=False, **kwargs)[source]

Map data to the Common Data Model.

Parameters:
  • imodel (str, optional) – Name of the MDF/CDM data model.

  • inplace (bool, default: False) – If True overwrite data in DataBundle else return a copy of DataBundle with data as CDM tables.

  • **kwargs (Any) – Additional keyword-arguments for mapping to CDM.

Return type:

DataBundle | None

Returns:

DataBundle or None – DataBundle containing data mapped to the CDM or None if inplace=True.

Notes

For more information see map_model()

Examples

>>> cdm_tables = db.map_model()
property mask: pandas.core.frame.DataFrame | ParquetStreamReader

MDF validation mask.

Returns:

pd.DataFrame or ParquetStreamReader – Validation mask of the underlying MDF data.

property mode: str

Data mode.

Returns:

str – Current data mode.

Raises:

TypeError – If mode of the underlying data is not a string.

property parse_dates: list[Any] | bool | None

Information of how to parse dates in data.

Returns:

list or bool or None – Information of how to parse dates in underlying MDF data.

See also

pd.read_csv()

Read CSV file using pandas.

remove_duplicates(inplace=False, **kwargs)[source]

Remove detected duplicates in data.

Parameters:
  • inplace (bool, default: False) – If True overwrite data in DataBundle else return a copy of DataBundle with data containing no duplicates.

  • **kwargs (Any) – Additional keyword-arguments used to remove duplicates.

Return type:

DataBundle | None

Returns:

DataBundle or None – DataBundle without duplicated rows or None if “inplace=True”.

Raises:

RuntimeError – Before removing duplicates, a duplicate check has to be done with DataBundle.duplicate_check().

See also

DataBundle.flag_duplicates

Flag detected duplicates in data.

DataBundle.get_duplicates

Get duplicate matches in data.

DataBundle.duplicate_check

Duplicate check in data.

Notes

For more information see DupDetect.remove_duplicates()

Examples

Remove duplicates without overwriting data.

>>> removed_tables = db.remove_duplicates()

Remove duplicates with overwriting data.

>>> db.remove_duplicates(inplace=True)
>>> removed_tables = db.data
replace_columns(df_corr, subset=None, inplace=False, **kwargs)[source]

Replace columns in data.

Parameters:
  • df_corr (pd.DataFrame) – Data containing the columns to insert into data.

  • subset (str or list of str, optional) – Select subset by columns. This option is useful for multi-indexed data.

  • inplace (bool, default: False) – If True overwrite data in DataBundle else return a copy of DataBundle with replaced column names in data.

  • **kwargs (Any) – Additional keyword-arguments for replacing columns.

Return type:

DataBundle | None

Returns:

DataBundle or None – DataBundle with replaced column names or None if “inplace=True”.

Notes

For more information see replace_columns()

Examples

>>> import pandas as pd
>>> df_corr = pd.read_csv("correction_file_on_disk")
>>> df_repl = db.replace_columns(df_corr)
select_where_all_false(inplace=False, do_mask=True, **kwargs)[source]

Select rows from data where all column entries in mask are False.

Parameters:
  • inplace (bool, default: False) – If True overwrite data in DataBundle else return a copy of DataBundle with invalid values only in data.

  • do_mask (bool, default: True) – If True also do selection on mask.

  • **kwargs (Any) – Additional keyword-arguments for splitting data where all entries are False.

Return type:

DataBundle | None

Returns:

DataBundle or None – DataBundle containing rows where all column entries in mask are False or None if inplace=True.

See also

DataBundle.select_where_all_true

Select rows from data where all entries in mask are True.

DataBundle.select_where_entry_isin

Select rows from data where column entries are in a specific value list.

DataBundle.select_where_index_isin

Select rows from data within specific index list.

Notes

For more information see split_by_boolean_false()

Examples

Select without overwriting the old data.

>>> db_selected = db.select_where_all_false()

Select with overwriting the old data.

>>> db.select_where_all_false(inplace=True)
>>> df_selected = db.data
select_where_all_true(inplace=False, do_mask=True, **kwargs)[source]

Select rows from data where all column entries in mask are True.

Parameters:
  • inplace (bool, default: False) – If True overwrite data in DataBundle else return a copy of DataBundle with valid values only in data.

  • do_mask (bool, default: True) – If True also do selection on mask.

  • **kwargs (Any) – Additional keyword-arguments for splitting data where all entries are True.

Return type:

DataBundle | None

Returns:

DataBundle or None – DataBundle containing rows where all column entries in mask are True or None if inplace=True.

See also

DataBundle.select_where_all_false

Select rows from data where all entries in mask are False.

DataBundle.select_where_entry_isin

Select rows from data where column entries are in a specific value list.

DataBundle.select_where_index_isin

Select rows from data within specific index list.

Notes

For more information see split_by_boolean_true()

Examples

Select without overwriting the old data.

>>> db_selected = db.select_where_all_true()

Select with overwriting the old data.

>>> db.select_where_all_true(inplace=True)
>>> df_selected = db.data
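
The two calling styles shown above (returning a new object versus inplace=True) follow a common convention that can be sketched without pandas. BundleSketch is a hypothetical, heavily simplified stand-in: data is a list of row dictionaries, mask is an aligned list of boolean flags, and the do_mask option is left out.

```python
import copy


class BundleSketch:
    """Hypothetical, heavily simplified stand-in for DataBundle."""

    def __init__(self, data, mask):
        self.data = data  # list of row dicts
        self.mask = mask  # list of dicts of booleans, aligned with data

    def select_where_all_true(self, inplace=False):
        # A row is kept only if every flag in its mask row is True.
        keep = [all(flags.values()) for flags in self.mask]
        target = self if inplace else copy.deepcopy(self)
        target.data = [row for row, k in zip(target.data, keep) if k]
        target.mask = [flags for flags, k in zip(target.mask, keep) if k]
        return None if inplace else target


db = BundleSketch(
    data=[{"t": 1.0}, {"t": 99.9}],
    mask=[{"t": True}, {"t": False}],
)
selected = db.select_where_all_true()             # db is left untouched
rows_before = len(db.data)
result = db.select_where_all_true(inplace=True)   # db is modified
```

Returning None for inplace=True mirrors the return type DataBundle | None given above.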
select_where_entry_isin(selection, inplace=False, do_mask=True, **kwargs)[source]

Select rows from data where column entries are in a specific value list.

Parameters:
  • selection (dict) – Keys: Column names in data. Values: Specific value list.

  • inplace (bool, default: False) – If True overwrite data in DataBundle else return a copy of DataBundle with selected rows only in data.

  • do_mask (bool, default: True) – If True also do selection on mask.

  • **kwargs (Any) – Additional keyword-arguments for splitting data where entries are within a specific value list.

Return type:

DataBundle | None

Returns:

DataBundle or None – DataBundle containing rows where column entries are in a specific value list or None if inplace=True.

See also

DataBundle.select_where_index_isin

Select rows from data within specific index list.

DataBundle.select_where_all_true

Select rows from data where all entries in mask are True.

DataBundle.select_where_all_false

Select rows from data where all entries in mask are False.

Notes

For more information see split_by_column_entries()

Examples

Select without overwriting the old data.

>>> db_selected = db.select_where_entry_isin(
...     selection={("c1", "B1"): [26, 41]},
... )

Select with overwriting the old data.

>>> db.select_where_entry_isin(selection={("c1", "B1"): [26, 41]}, inplace=True)
>>> df_selected = db.data
select_where_index_isin(index, inplace=False, do_mask=True, **kwargs)[source]

Select rows from data where indexes are within a specific index list.

Parameters:
  • index (list of int) – Specific index list.

  • inplace (bool, default: False) – If True overwrite data in DataBundle else return a copy of DataBundle with selected rows only in data.

  • do_mask (bool, default: True) – If True also do selection on mask.

  • **kwargs (Any) – Additional keyword-arguments for splitting data where indexes are within a specific index list.

Return type:

DataBundle | None

Returns:

DataBundle or None – DataBundle containing rows where indexes are within a specific index list or None if inplace=True.

See also

DataBundle.select_where_entry_isin

Select rows from data where column entries are in a specific value list.

DataBundle.select_where_all_true

Select rows from data where all entries in mask are True.

DataBundle.select_where_all_false

Select rows from data where all entries in mask are False.

Notes

For more information see split_by_index()

Examples

Select without overwriting the old data.

>>> db_selected = db.select_where_index_isin([0, 2, 4])

Select with overwriting the old data.

>>> db.select_where_index_isin(index=[0, 2, 4], inplace=True)
>>> df_selected = db.data
split_by_boolean_false(do_mask=True, **kwargs)[source]

Split data by rows where all column entries in mask are False.

Parameters:
  • do_mask (bool, default: True) – If True also do selection on mask.

  • **kwargs (Any) – Additional keyword-arguments for splitting data where mask is False.

Return type:

tuple[DataBundle, DataBundle]

Returns:

tuple – First DataBundle including rows where all column entries in mask are False. Second DataBundle including rows where all column entries in mask are True.

See also

DataBundle.split_by_boolean_true

Split data by rows where all entries in mask are True.

DataBundle.split_by_column_entries

Split data by rows where column entries are in a specific value list.

DataBundle.split_by_index

Split data by rows within specific index list.

Notes

For more information see split_by_boolean_false()

Examples

Split DataBundle.

>>> db_false, db_true = db.split_by_boolean_false()
split_by_boolean_true(do_mask=True, **kwargs)[source]

Split data by rows where all column entries in mask are True.

Parameters:
  • do_mask (bool, default: True) – If True also do selection on mask.

  • **kwargs (Any) – Additional keyword-arguments for splitting data where mask is True.

Return type:

tuple[DataBundle, DataBundle]

Returns:

tuple – First DataBundle including rows where all column entries in mask are True. Second DataBundle including rows where all column entries in mask are False.

See also

DataBundle.split_by_boolean_false

Split data by rows where all entries in mask are False.

DataBundle.split_by_column_entries

Split data by rows where column entries are in a specific value list.

DataBundle.split_by_index

Split data by rows within specific index list.

Notes

For more information see split_by_boolean_true()

Examples

Split DataBundle.

>>> db_true, db_false = db.split_by_boolean_true()
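
A mask-based split can be sketched on plain Python rows. The function split_rows_by_mask is hypothetical, and the sketch assumes that the second part is simply the complement of the first, so rows with mixed flags end up there; the description above does not spell out how such rows are handled.

```python
def split_rows_by_mask(data, mask):
    """Return (rows whose mask flags are all True, all remaining rows)."""
    all_true, rest = [], []
    for row, flags in zip(data, mask):
        (all_true if all(flags.values()) else rest).append(row)
    return all_true, rest


data = [{"id": 1}, {"id": 2}, {"id": 3}]
mask = [
    {"a": True, "b": True},    # fully valid
    {"a": True, "b": False},   # mixed
    {"a": False, "b": False},  # fully invalid
]
rows_true, rows_rest = split_rows_by_mask(data, mask)
```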
split_by_column_entries(selection, do_mask=True, **kwargs)[source]

Split data by rows where column entries are in a specific value list.

Parameters:
  • selection (dict) – Keys: Column names in data. Values: Specific value list.

  • do_mask (bool, default: True) – If True also do selection on mask.

  • **kwargs (Any) – Additional keyword-arguments for splitting data by column entries.

Return type:

tuple[DataBundle, DataBundle]

Returns:

tuple – First DataBundle including rows where column entries are in a specific value list. Second DataBundle including rows where column entries are not in a specific value list.

See also

DataBundle.split_by_index

Split data by rows within specific index list.

DataBundle.split_by_boolean_true

Split data by rows where all entries in mask are True.

DataBundle.split_by_boolean_false

Split data by rows where all entries in mask are False.

Notes

For more information see split_by_column_entries()

Examples

Split DataBundle.

>>> db_isin, db_isnotin = db.split_by_column_entries(
...     selection={("c1", "B1"): [26, 41]},
... )
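
The selection dictionary maps column labels to lists of accepted values. A minimal sketch on plain rows follows; split_rows_by_entries is a hypothetical name, and combining several keys with a logical AND is an assumption of the sketch rather than documented behaviour.

```python
def split_rows_by_entries(data, selection):
    """Split rows on whether every selected column holds an accepted value."""
    isin, isnotin = [], []
    for row in data:
        match = all(row.get(col) in values for col, values in selection.items())
        (isin if match else isnotin).append(row)
    return isin, isnotin


col = ("c1", "B1")  # tuple label, as used for multi-indexed columns
data = [{col: 26}, {col: 30}, {col: 41}]
rows_isin, rows_isnotin = split_rows_by_entries(data, {col: [26, 41]})
```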
split_by_index(index, do_mask=True, **kwargs)[source]

Split data by rows within specific index list.

Parameters:
  • index (list of int) – Specific index list.

  • do_mask (bool, default: True) – If True also do selection on mask.

  • **kwargs (Any) – Additional keyword-arguments for splitting data by index.

Return type:

tuple[DataBundle, DataBundle]

Returns:

tuple – First DataBundle including rows within specific index list. Second DataBundle including rows outside specific index list.

See also

DataBundle.split_by_column_entries

Split data by rows where column entries are in a specific value list.

DataBundle.split_by_boolean_true

Split data by rows where all entries in mask are True.

DataBundle.split_by_boolean_false

Split data by rows where all entries in mask are False.

Notes

For more information see split_by_index()

Examples

Split DataBundle.

>>> db_isin, db_isnotin = db.split_by_index([0, 2, 4])
stack_h(other, datasets=('data', 'mask'), inplace=False, **kwargs)[source]

Stack multiple DataBundles horizontally.

Parameters:
  • other (DataBundle or Sequence of DataBundle) – Other DataBundle or list of DataBundles to stack horizontally.

  • datasets (str or Sequence of str, default: ("data", "mask")) – Datasets to be stacked.

  • inplace (bool, default: False) – If True overwrite datasets in DataBundle else return a copy of DataBundle with stacked datasets.

  • **kwargs (Any) – Additional keyword-arguments for stacking DataFrames horizontally.

Return type:

DataBundle | None

Returns:

DataBundle or None – Horizontally stacked DataBundle or None if inplace=True.

See also

DataBundle.stack_v

Stack multiple DataBundles vertically.

Notes

  • This only works with pd.DataFrames, not with iterables of pd.DataFrames.

  • The DataFrames in the DataBundle may have different data columns!

Examples

>>> db = db1.stack_h(db2, datasets=["data", "mask"])
stack_v(other, datasets=('data', 'mask'), inplace=False, **kwargs)[source]

Stack multiple DataBundles vertically.

Parameters:
  • other (DataBundle or Sequence of DataBundle) – Other DataBundle or list of DataBundles to stack vertically.

  • datasets (str or Sequence of str, default: ("data", "mask")) – Datasets to be stacked.

  • inplace (bool, default: False) – If True overwrite datasets in DataBundle else return a copy of DataBundle with stacked datasets.

  • **kwargs (Any) – Additional keyword-arguments for stacking DataFrames vertically.

Return type:

DataBundle | None

Returns:

DataBundle or None – Vertically stacked DataBundle or None if “inplace=True”.

See also

DataBundle.stack_h

Stack multiple DataBundles horizontally.

Notes

  • This only works with pd.DataFrames, not with iterables of pd.DataFrames.

  • The DataFrames in the DataBundle have to have the same data columns!

Examples

>>> db = db1.stack_v(db2, datasets=["data", "mask"])
unique(**kwargs)[source]

Get unique values of data.

Parameters:

**kwargs (Any) – Additional keyword-arguments for getting unique values.

Return type:

dict[str | tuple[str, str], dict[Any, int]]

Returns:

dict – Dictionary with unique values.

Notes

For more information see unique()

Examples

>>> db.unique(columns=("c1", "B1"))
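
The return annotation dict[str | tuple[str, str], dict[Any, int]] suggests a mapping from column label to value counts. The sketch below builds such a structure from plain rows; count_unique is a hypothetical helper, and treating a tuple passed as columns as one multi-index label is an assumption based on the example above.

```python
from collections import Counter


def count_unique(data, columns=None):
    """Map each requested column to a ``{value: count}`` dictionary."""
    if columns is None:
        columns = list(data[0]) if data else []
    elif isinstance(columns, (str, tuple)):
        columns = [columns]  # assumption: a single column label
    return {col: dict(Counter(row[col] for row in data)) for col in columns}


col = ("c1", "B1")
data = [{col: 26}, {col: 41}, {col: 26}]
counts = count_unique(data, columns=col)
```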
validate_datetime(imodel=None, **kwargs)[source]

Validate datetime information in data.

Parameters:
  • imodel (str, optional) – Name of the MDF/CDM data model.

  • **kwargs (Any) – Additional keyword-arguments for validating datetime.

Return type:

DataFrame

Returns:

pd.DataFrame – DataFrame containing True and False values for each index in data. True: All datetime information in the data row is valid. False: At least one piece of datetime information in the data row is invalid.

See also

DataBundle.validate_id

Validate station id information in data.

DataBundle.correct_datetime

Correct datetime information in data.

DataBundle.correct_pt

Correct platform type information in data.

Notes

For more information see validate_datetime()

Examples

>>> val_dt = db.validate_datetime()
validate_id(imodel=None, **kwargs)[source]

Validate station id information in data.

Parameters:
  • imodel (str, optional) – Name of the MDF/CDM data model.

  • **kwargs (Any) – Additional keyword-arguments for validating station id.

Return type:

DataFrame

Returns:

pd.DataFrame – DataFrame containing True and False values for each index in data. True: All station ID information in the data row is valid. False: At least one piece of station ID information in the data row is invalid.

See also

DataBundle.validate_datetime

Validate datetime information in data.

DataBundle.correct_pt

Correct platform type information in data.

DataBundle.correct_datetime

Correct datetime information in data.

Notes

For more information see validate_id()

Examples

>>> val_id = db.validate_id()
write(dtypes=None, parse_dates=None, encoding=None, mode=None, **kwargs)[source]

Write data to disk.

Parameters:
  • dtypes (dict, optional) – Data types of data.

  • parse_dates (list or bool, optional) – Information on how to parse dates in data.

  • encoding (str, optional) – The encoding of the input file. Overrides the value in the imodel schema file.

  • mode ({data, tables}, optional) – Data mode.

  • **kwargs (Any) – Additional keyword-arguments for writing data to disk.

See also

write_data

Write MDF data and validation mask to disk.

write_tables

Write CDM tables to disk.

read

Read original marine-meteorological data as well as MDF data or CDM tables from disk.

read_data

Read MDF data and validation mask from disk.

read_mdf

Read original marine-meteorological data from disk.

read_tables

Read CDM tables from disk.

Return type:

None

Notes

If mode is “data” write data using write_data(). If mode is “tables” write data using write_tables().

Examples

>>> db.write()

cdm_reader_mapper.core.reader module

Common Data Model (CDM) DataBundle class.

cdm_reader_mapper.core.reader.read(source, mode='mdf', **kwargs)[source]

Read either original marine-meteorological data or MDF data or CDM tables from disk.

Parameters:
  • source (str) – Source of the input data.

  • mode (str, {mdf, data, tables}, default: mdf) –

    Read data mode:

    • “mdf” to read original marine-meteorological data from disk and convert them to MDF data

    • “data” to read MDF data from disk

    • “tables” to read CDM tables from disk. Map MDF data to CDM tables with DataBundle.map_model().

  • **kwargs (Any) – Additional keyword-arguments passed to reader function.

Return type:

DataBundle

Returns:

DataBundle – Containing read data as pd.DataFrame or Iterable of pd.DataFrames.

See also

read_mdf

Read original marine-meteorological data from disk.

read_data

Read MDF data and validation mask from disk.

read_tables

Read CDM tables from disk.

write

Write either MDF data or CDM tables to disk.

write_data

Write MDF data and validation mask to disk.

write_tables

Write CDM tables to disk.

Notes

kwargs are the keyword arguments for the specific mode reader.
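
Forwarding kwargs to a mode-specific reader can be sketched with a dispatch table. The reader functions below are placeholders that only echo their arguments, READERS and read_sketch are hypothetical names, and raising ValueError for an unknown mode is an assumption of this sketch.

```python
def _read_mdf(source, **kwargs):
    return ("mdf", source, kwargs)      # placeholder for the MDF reader


def _read_data(source, **kwargs):
    return ("data", source, kwargs)     # placeholder for the data reader


def _read_tables(source, **kwargs):
    return ("tables", source, kwargs)   # placeholder for the tables reader


READERS = {"mdf": _read_mdf, "data": _read_data, "tables": _read_tables}


def read_sketch(source, mode="mdf", **kwargs):
    """Pick the reader for ``mode`` and pass all keyword arguments through."""
    try:
        reader = READERS[mode]
    except KeyError:
        raise ValueError(
            f"Unknown mode {mode!r}; expected one of {sorted(READERS)}"
        ) from None
    return reader(source, **kwargs)


default_result = read_sketch("file_on_disk", imodel="custom_model_name")
tables_result = read_sketch("path_to_files", mode="tables")
```

Because the keyword arguments are passed through unchanged, which ones are valid depends entirely on the selected mode.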

cdm_reader_mapper.core.writer module

Common Data Model (CDM) DataBundle class.

cdm_reader_mapper.core.writer.write(data, mode='data', **kwargs)[source]

Write either MDF data or CDM tables to disk.

Parameters:
  • data (pandas.DataFrame or Iterable[pd.DataFrame]) – Data to export.

  • mode (str, {data, tables}, default: data) –

    Write data mode:

    • “data” to write MDF data to disk

    • “tables” to write CDM tables to disk. Map MDF data to CDM tables with DataBundle.map_model().

  • **kwargs (Any) – Additional keyword-arguments used to write data to disk.

See also

write_data

Write MDF data and validation mask to disk.

write_tables

Write CDM tables to disk.

read

Read either original marine-meteorological data or MDF data or CDM tables from disk.

read_mdf

Read original marine-meteorological data from disk.

read_data

Read MDF data and validation mask from disk.

read_tables

Read CDM tables from disk.

Return type:

None

Notes

kwargs are the keyword arguments for the specific mode writer.