cdm_reader_mapper.map_model

cdm_reader_mapper.map_model(data, imodel, cdm_subset=None, codes_subset=None, cdm_complete=True, drop_missing_obs=True, drop_duplicates=True, log_level='INFO')

Map a pandas DataFrame to the CDM header and observational tables.

Parameters:
  • data (pandas.DataFrame or Iterable[pd.DataFrame]) – input data to map.

  • imodel (str) – Name of a specific mapping from a generic data model to the CDM, for example the mapping of a SID-DCK from IMMA1’s core and attachments to the CDM (e.g. icoads_r300_d704).

  • cdm_subset (str or list, optional) – subset of CDM model tables to map. Defaults to the full set of CDM tables defined for the imodel.

  • codes_subset (str or list, optional) – subset of code mapping tables to map. Defaults to the full set of code mapping tables defined for the imodel.

  • cdm_complete (bool) – If True, map the entire list of CDM tables. Default: True

  • drop_missing_obs (bool) – If True, drop observations without a valid observation value (e.g. no air_temperature value). Default: True

  • drop_duplicates (bool) – If True, drop duplicated rows. Default: True

  • log_level (str) – level of logging information to save. Default: INFO.
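
The effect of the drop_missing_obs and drop_duplicates flags can be pictured with a small pure-Python sketch. This is only a conceptual illustration, not the library's implementation: the helper drop_missing_and_duplicates, the value_key argument and the sample rows are all hypothetical, and the real function operates on pandas DataFrames rather than lists of dicts.

```python
def drop_missing_and_duplicates(rows, value_key="observation_value",
                                drop_missing_obs=True, drop_duplicates=True):
    """Hypothetical helper mimicking the two drop flags on a list of dicts."""
    result = []
    seen = set()
    for row in rows:
        # drop_missing_obs: skip rows that carry no observation value.
        if drop_missing_obs and row.get(value_key) is None:
            continue
        # drop_duplicates: skip rows identical to one already kept.
        key = tuple(sorted(row.items()))
        if drop_duplicates and key in seen:
            continue
        seen.add(key)
        result.append(row)
    return result


# Illustrative input: one duplicated row and one row without a value.
rows = [
    {"report_id": "R1", "observation_value": 285.2},
    {"report_id": "R1", "observation_value": 285.2},
    {"report_id": "R2", "observation_value": None},
]

cleaned = drop_missing_and_duplicates(rows)
untouched = drop_missing_and_duplicates(
    rows, drop_missing_obs=False, drop_duplicates=False
)
```

With both flags left at their default of True, only the first R1 row survives; with both set to False, all three rows are kept.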

Return type:

DataFrame | ParquetStreamReader

Returns:

cdm_tables (pandas.DataFrame) – DataFrame with MultiIndex columns (cdm_table, column_name).
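
As a rough sketch of how a result with MultiIndex columns (cdm_table, column_name) can be handled, the example below builds a small stand-in DataFrame by hand and selects one table from it. The table names, column names and values are illustrative assumptions, not output produced by map_model; only standard pandas calls are used.

```python
import pandas as pd

# Hypothetical stand-in for a map_model result: two CDM tables with a few
# illustrative columns each, labelled on two column levels.
columns = pd.MultiIndex.from_tuples(
    [
        ("header", "report_id"),
        ("header", "latitude"),
        ("observations-at", "report_id"),
        ("observations-at", "observation_value"),
    ],
    names=["cdm_table", "column_name"],
)
cdm_tables = pd.DataFrame(
    [
        ["R1", 51.5, "R1", 285.2],
        ["R2", 48.9, "R2", 283.7],
    ],
    columns=columns,
)

# Selecting on the first level yields one table with single-level columns.
header = cdm_tables["header"]

# Names of the tables present, in column order.
table_names = list(cdm_tables.columns.get_level_values("cdm_table").unique())
```

Selecting by the first column level keeps each CDM table addressable by name while all tables share the same row index.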