Changelog

2.5.0 (unreleased)

Contributor to this version: Ludwig Lierhammer (@ludwiglierhammer)

Announcements

  • cdm_reader_mapper now drops support for Python 3.10 (#419)

  • cdm_reader_mapper now uses cruft <https://cruft.github.io/cruft/> and the Ouranosinc cookiecutter template <https://github.com/Ouranosinc/cookiecutter-pypackage> (#369, #419)

New features and enhancements

  • The documentation now uses the furo theme for Sphinx (#419)

Breaking changes

  • Development dependencies (“dev”, “docs”) are now installed via the new dependency-groups conventions (PEP 735) (#419)

  • prek is now the suggested pre-commit runner (installed by default via pip install --group dev) (#419)

Internal changes

  • Updated the project template and boilerplate code to address configuration issues and benefit from new workflows/conventions (#419):

    • A new workflow has been added to automatically accept minor/patch updates to GitHub Actions and Python deps coming from Dependabot.

    • tox.ini has migrated to tox.toml (new standard).

    • pyproject.toml and tox.toml now use [dependency-groups] to manage non-end-user dependency lists.

    • The Makefile recipes are much cleaner and now manage some dependency installation calls.

    • Various dependency updates.

  • The numpydoc linting tool has been added to the linting checks and the pre-commit configurations (#419)

  • The mypy type checking has been added to the pre-commit configurations (#368, #419)

  • Documentation is now built without any warning messages (#419)

  • readthedocs.yaml: set fail_on_warnings to “true” (#419)

  • all modules are moved from “cdm_reader_mapper” to “src/cdm_reader_mapper” (#419)

  • CHANGES.rst has been renamed to CHANGELOG.rst, following the suggestion in the keepachangelog v1.1.0 specification (#419)

  • rename class cdm_mapper.utils.mapping_functions.mapping_functions to cdm_mapper.utils.mapping_functions.MappingFunctions (#419)

  • remove helper class core._utilities._DataBundle and integrate it in core.databundle.DataBundle (#419)

  • make use of pathlib.Path instead of os.path (#419)

  • consistently use parameter “imodel” instead of “data_model” and “correction_method” instead of “fix_method” in metmetpy modules (#419)
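
The move from os.path to pathlib.Path noted above can be pictured with the following minimal sketch. The file location is hypothetical and the snippet does not reproduce any code from the package; it only contrasts the two standard-library styles.

```python
import os.path
from pathlib import Path

# Hypothetical file location, for illustration only.
old_style = os.path.join("data", "icoads", "header.psv")
new_style = Path("data") / "icoads" / "header.psv"

# Both spellings describe the same path ...
assert Path(old_style) == new_style

# ... but Path exposes the pieces as attributes instead of helper functions.
print(new_style.name)
print(new_style.suffix)
print(new_style.with_suffix(".pq").name)
```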

2.4.1 (2026-04-16)

Contributor to this version: Ludwig Lierhammer (@ludwiglierhammer)

New features and enhancements

  • mdf_mapper / cdm_mapper: add new project CMEMS for drifting iridium buoy data (#405)

  • mdf_mapper / cdm_mapper: new parameter “separator” to define filename separator while reading and writing files (#414)

Breaking changes

  • cdm_mapper: update element names in MAROB CDM mapping tables (#393)

  • cdm_mapper.utils.mapping_functions: change default MAROB datetime string format to “%Y-%m-%dT%H:%M:%S” (#393)

  • cdm_mapper: keep pd.NA values and do not convert them to strings (#414)

  • test_data: load parquet files instead of csv files (#410, #414)

  • mdf_mapper / cdm_mapper: default file name extension is “pq” while reading and writing files (#414)
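
To make the new default MAROB datetime format concrete, here is a small sketch using only the standard library. The timestamp is invented and the snippet does not call any cdm_reader_mapper function; only the format string is taken from the entry above.

```python
from datetime import datetime

# Format string quoted in the changelog entry above.
MAROB_FORMAT = "%Y-%m-%dT%H:%M:%S"

# Hypothetical timestamp, not taken from real MAROB data.
raw = "2024-01-02T03:04:05"

stamp = datetime.strptime(raw, MAROB_FORMAT)
print(stamp.year, stamp.month, stamp.day, stamp.hour)

# Formatting with the same pattern reproduces the input string.
round_trip = stamp.strftime(MAROB_FORMAT)
print(round_trip)
```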

Bug fixes

  • duplicates: do not change data types when updating quality flags and history description (#408)

  • mdf_reader: decode data to “utf-8” to avoid misleading file encoding (#414)

Internal changes

  • cdm_mapper.utils.mapping_functions: delete function convert_to_decimal (#393)

2.4.0 (2026-04-01)

Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Jan Marius Willruth (@JanWillruth)

New features and enhancements

  • cdm_mapper.utils.mapping_functions: Add function gdac_pressure in anticipation of moving conversion steps to the mapper in the future. (#350)

  • mdf_reader/cdm_mapper: optionally, convert data types to strings when reading and writing data from/to disk (#401)

  • mdf_reader/cdm_mapper: optionally, convert data types from strings when reading and writing data from/to disk (#401)

Breaking changes

  • mdf_reader: Update and rename GDAC variable names (schemas/gdac) and code tables (codes/gdac) to align with current standards. (#341, #350)

  • cdm_mapper: (#350)

    • Update and rename GDAC variable names in tables/gdac.

    • Fix gdac_latitude and gdac_longitude (utils/mapping_functions.py) not being used in observations.json

  • mdf_reader/cdm_mapper: use parquet as default instead of csv when reading and writing data from/to disk (#401)

  • cdm_mapper: do not convert data types to strings while mapping to the CDM (#398, #401)

  • cdm_mapper: set default decimal_places from 0 to 1 for location_accuracy, report_time_accuracy, station_speed and station_course (#401)

Bug fixes

  • cdm_mapper.mapper.map_model: write data columns to df._attrs instead of df.attrs to avoid crashing class methods (#390, #391)

  • cdm_mapper.utils.mapping_functions: Change method_b in mapping_functions.py to work with both str and int. (#350)

  • cdm_mapper.map_models: write columns directly as an attribute to result to avoid crashing further DataFrame methods (#394, #397)

2.3.0 (2026-03-12)

Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Jan Marius Willruth (@JanWillruth)

New features and enhancements

  • mdf_reader.read_data now supports chunking (#360)

  • read and write both parquet and feather files including new parameter data_format (#353, #363):

    • mdf_reader.read_data,

    • mdf_reader.write_data

    • cdm_mapper.read_tables

    • cdm_mapper.write_tables

  • introduce ParquetStreamReader to replace pd.io.parsers.TextFileReader (#8, #348)

  • cdm_reader.map_model now supports both pd.DataFrame and ParquetStreamReader as output (#348)

  • common.replace_columns now supports both pd.DataFrame and ParquetStreamReader as output (#348)

  • cdm_mapper.utils.mapping_functions: new mapping function convert_to_decimal (#370)

  • test_data: add MAROB test data (#370)

  • mdf_reader.read_data: new parameter “delimiter” (#370)

  • cdm_mapper.map_model’s output now has attribute “attrs” where columns are stored (#379)

  • ParquetStreamReader now supports item assignment (#383)

  • ParquetStreamReader now works with both list and tuple as input data (#383)

Breaking changes

  • DataBundle.stack_v and DataBundle.stack_h only support pd.DataFrames as input and raise a ValueError otherwise (#360)

  • set default for extension from psv to specified data_format (#363):

    • cdm_mapper.read_tables

    • cdm_mapper.write_tables

  • set default for extension from csv to specified data_format in mdf_reader.write_data (#363)

  • mdf_reader.read_data: save dtypes in return DataBundle as pd.Series not dict (#363)

  • remove common.pandas_TextParser_hdlr (#8, #348)

  • cdm_reader_mapper now raises errors instead of logging them (#348)

  • DataBundle now converts all iterables of pd.DataFrame/pd.Series to ParquetStreamReader when initialized (#348)

  • all main functions in common.select now return a tuple of 4 (selected values, rejected values, original indexes of selected values, original indexes of rejected values) (#348)

  • move ParquetStreamReader and all corresponding methods to common.iterables to handle chunking outside of mdf_reader/cdm_mapper/core/metmetpy (#349, #348)

  • cdm_mapper.read_tables: if “suffix” is None no suffix is selected instead of the wildcard “*” (#379)

  • ParquetStreamReader.empty now is a property not a class method (#379)

  • cdm_mapper.utils.mapping_functions.string_add no longer has parameters zfill_col and zfill (#383)
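
The four-element return value described above for the common.select functions can be pictured with the sketch below. It is not the library's implementation: split_rows is a hypothetical helper working on plain dictionaries, and only the order of the returned tuple (selected values, rejected values, original indexes of selected values, original indexes of rejected values) is taken from the entry.

```python
from typing import Any, Callable

Row = dict[str, Any]


def split_rows(
    rows: list[Row], keep: Callable[[Row], bool]
) -> tuple[list[Row], list[Row], list[int], list[int]]:
    """Hypothetical helper: split rows and remember their original positions."""
    selected: list[Row] = []
    rejected: list[Row] = []
    selected_idx: list[int] = []
    rejected_idx: list[int] = []
    for idx, row in enumerate(rows):
        if keep(row):
            selected.append(row)
            selected_idx.append(idx)
        else:
            rejected.append(row)
            rejected_idx.append(idx)
    return selected, rejected, selected_idx, rejected_idx


# Invented example records.
rows = [
    {"id": "a", "value": 1},
    {"id": "b", "value": None},
    {"id": "c", "value": 3},
]
selected, rejected, selected_idx, rejected_idx = split_rows(
    rows, lambda row: row["value"] is not None
)
print(selected_idx, rejected_idx)
```

Returning the original indexes alongside the data lets a caller map either half back to the input without resetting or recomputing positions.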

Bug fixes

  • replace “ICOADS-30-” with “ICOADS-300-” in icoads_r300 mapping tables (#385, #386)

Internal changes

  • re-work internal structure for more readability and better performance (#360)

  • use pre-defined Literal constants in cdm_reader_mapper.properties (#363)

  • mdf_reader.utils.utilities.read_csv: rename parameter columns to column_names (#363)

  • introduce post-processing decorator that handles both pd.DataFrame and ParquetStreamReader (#348)

  • cdm_mapper.mapper._map_data_model now returns a tuple of DataFrame and columns (#379)

  • delete unused function cdm_mapper.utils.mapping_functions.marob_location_quality (#383)

  • delete unreachable code snippets (#383)

  • mainly increase test coverage (#365, #383)

2.2.1 (2026-01-23)

Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer)

Bug fixes

  • cdm_reader_mapper.cdm_mapper: set indexes to input data indexes when setting default values (#356)

2.2.0 (2026-01-23)

Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Joseph Siddons (@jtsiddons)

Announcements

This release adds support for Python 3.14 (#339)

New features and enhancements

  • new parameter in function map_model (#327)

    • drop_duplicates: If True remove duplicated rows (default: True)

    • drop_missing_obs: If True remove observation rows without a valid observation_value (default: True)

  • new Pub47 testdata (test_data[“test_pub47”]) (#327)

Breaking changes

  • cdm_reader_mapper.cdm_mapper: rename map_and_convert to helper function _map_and_convert (#343)

  • replace logging.error with raise error statements (#345)

Internal changes

  • implement map_model test for Pub47 data (#310, #327)

  • rename test data class from test_data to TestData (#327)

  • update .gitignore (#324)

  • update and add docstrings for multiple functions (#324)

  • cdm_reader_mapper.cdm_mapper: update mapping functions for more readability (#324)

  • cdm_reader_mapper.cdm_mapper: introduce some helper functions (#324)

  • add more unit tests (#311, #324)

  • cdm_reader_mapper.cdm_mapper: split map_and_convert into multiple helper functions (#333, #343)

  • exclude python files in tests from pre-commit codespell hook (#345)

  • replace many os functions with pathlib.Path (#345)

  • re-work mdf_reader (#334, #345)

    • remove reader.MDFFileReader class

    • remove utils.configurator module

    • remove both utils.decoder and mdf_reader.utils.converter modules

    • introduce utils.parser module: bunch of functions to parse input data into MDF data

    • introduce utils.convert_and_decode: make converter and decoder functions more modular

    • make utils.validator module more modular

    • utils.filereader.FileReader uses utils.parser function for parsing

    • move many helper function to utils.utilities

    • serialize schemas.schemas module

  • add type hints and docstrings to mdf_reader (#345)

  • add unit tests for mdf_reader module to testing suite (#345)

Bug fixes

  • add Pub47 mapping code tables (observing_frequency and vessel_type) (#308, #327)

  • observation tables are not empty anymore after mapping Pub47 raw data to the CDM (#309, #327)

2.1.1 (2025-10-21)

Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer), Joseph Siddons (@jtsiddons) and Jan Marius Willruth (@JanWillruth)

New features and enhancements

  • add encoding optional argument to cdm_reader_mapper.read_mdf and cdm_reader_mapper.read_data which overrides default value set by model schema if set (#268, #273).

  • cdm_reader_mapper.mdf_reader: Added preprocessing function to convert air pressure (PPPP) in IMMT format (#287)

  • cdm_reader_mapper.cdm_mapper: Added mapping functions for IMMT datetime, latitude, and longitude conversions (#287)

  • cdm_reader_mapper.cdm_mapper: New mapping function datetime_imma_d701 for icoads_r300_d701 (#288, #295)

  • cdm_reader_mapper.cdm_mapper: New mapping function datetime_imma1_to_utc for mapping local midday to UTC (#288, #295)

Breaking changes

  • cdm_reader_mapper: Replace “gcc” with “gdac” (#287)

  • cdm_reader_mapper: Update gdac schemas to adhere to IMMT-5 documentation (#287)

  • cdm_reader_mapper: combine icoads_r300_d701_type1 and icoads_r300_d701_type1 test and result data to icoads_r300_d701 (#288, #295)

  • cdm_reader_mapper.cdm_mapper: combine icoads_r300_d701_type1 and icoads_r300_d701_type1 mapping tables to icoads_r300_d701 (#288, #295)

  • cdm_reader_mapper.read: Allow strings as input for cdm_subset (#281)

  • cdm_reader_mapper.cdm_mapper: Remove timestamps and/or previous history information in column history (#281)

  • cdm_reader_mapper.DataBundle: Set empty pd.DataFrames as defaults for both data and mask (#281)

  • cdm_reader_mapper.mdf_reader: read drifter numbers as strings not as integers with C-RAID (#281)

Internal changes

  • tests: create test data result hidden directory (#291)

  • cdm_reader_mapper.mdf_reader: update and tidy-up ICOADS mapping tables (#281)

  • timezonefinder is pinned below v7.0.0 (#281)

Bug fixes

  • cdm_reader_mapper.write_data: fix doubling of output file name (#273)

  • cdm_reader_mapper.cdm_mapper.mapping_functions: datetime conversion now ignores unformattable dates (#277, #278)

  • README: fixing hyperlink (#279, #280)

  • tests: raise OSError on checksum mismatch (#291)

2.1.0 (2025-04-08)

Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Joseph Siddons (@jtsiddons)

New features and enhancements

  • implement both wrapper functions read and write that call the appropriate function based on mode argument (#238):

    • mode == “mdf”; calls cdm_reader_mapper.read_mdf

    • mode == “data”; calls cdm_reader_mapper.read_data or cdm_reader_mapper.write_data

    • mode == “tables”; calls cdm_reader_mapper.read_tables or cdm_reader_mapper.write_tables

  • optionally, call cdm_reader_mapper.read_tables with either source file or source directory path (#238).

  • apply attribute to DataBundle.data if attribute is not defined in DataBundle (#248).

  • apply pandas functions directly to DataBundle.data by calling DataBundle.<pandas-func> (#248).

  • make DataBundle support item assignment for DataBundle.data (#248).

  • optionally, apply selections to DataBundle.mask in DataBundle.select_* functions (#248).

  • cdm_reader.reader.read_tables: optionally, set null_label (#242)

  • new method function: DataBundle.select_where_all_false (#242)

  • new method functions: DataBundle.split_* which split a DataBundle into two new DataBundles containing data selected and rejected after user-defined selection criteria (#242)

    • DataBundle.split_by_boolean_true

    • DataBundle.split_by_boolean_false

    • DataBundle.split_by_column_entries

    • DataBundle.split_by_index

  • implement pandas indexer like iloc for not chunked data (#242)
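
The attribute forwarding and item assignment described in the DataBundle entries above follow a common Python delegation pattern, sketched here under stated assumptions. MiniBundle is a hypothetical stand-in that wraps a plain list instead of a pandas object, and it is not the package's DataBundle class.

```python
class MiniBundle:
    """Hypothetical stand-in for a container that wraps a ``data`` object."""

    def __init__(self, data):
        self.data = data

    def __getattr__(self, name):
        # Only called when normal lookup on MiniBundle fails, so attributes
        # defined on the class itself always take precedence.
        if name == "data":
            # Guard against infinite recursion if ``data`` is not set yet.
            raise AttributeError(name)
        return getattr(self.data, name)

    def __getitem__(self, key):
        return self.data[key]

    def __setitem__(self, key, value):
        # Item assignment is forwarded to the wrapped object.
        self.data[key] = value


bundle = MiniBundle([3, 1, 2])
bundle[0] = 9            # forwarded item assignment
print(bundle.count(9))   # list.count, reached through __getattr__
print(bundle.data)
```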

Internal changes

  • cdm_reader_mapper.common.select: restructure, simplify and summarize functions (#242)

  • split DataBundle class into main class (cdm_reader_mapper.core._utilities) and method function class (cdm_reader_mapper.core.databundle) (#242)

Breaking changes

  • remove property tables from DataBundle object. Instead, DataBundle.map_model overwrites DataBundle.data (#238).

  • rename overwrite to inplace and change its default from True to False, consistent with the pandas inplace argument (#238, #248).

  • with inplace=True, methods return None, consistent with pandas (#242)

  • DataBundle method functions return a DataBundle instead of a pandas.DataFrame (#248).

  • DataBundle.select_* functions write only selected entries to DataBundle.data and do not take other list entries from common.select_* function returns into account (#248).

  • select functions do not reset indexes by default (#242)

  • rename DataBundle.select_* functions:

    • DataBundle.select_true -> DataBundle.select_where_all_boolean

    • DataBundle.select_from_list -> DataBundle.select_where_entry_isin

    • DataBundle.select_from_index -> DataBundle.select_where_index_isin

  • rename cdm_reader_mapper.common.select_* functions and make them return a tuple of selected and rejected data after user-defined selection criteria (#242):

    • select_true -> split_by_boolean_true

    • select_from_list -> split_by_column_entries

    • select_from_index -> split_by_index
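
The pandas-like inplace convention mentioned in this list can be sketched as follows. SelectableBundle and select_where are hypothetical names and the class wraps a plain list; the snippet only illustrates that the default returns a new object and leaves the original untouched, while inplace=True modifies the object and returns None.

```python
import copy


class SelectableBundle:
    """Hypothetical container illustrating the ``inplace`` convention."""

    def __init__(self, data):
        self.data = list(data)

    def select_where(self, keep, inplace=False):
        # Work on the object itself or on an independent deep copy.
        target = self if inplace else copy.deepcopy(self)
        target.data = [value for value in target.data if keep(value)]
        # Mirror pandas: in-place operations return None.
        return None if inplace else target


def is_even(value):
    return value % 2 == 0


bundle = SelectableBundle([1, 2, 3, 4])

new_bundle = bundle.select_where(is_even)          # default: inplace=False
print(new_bundle.data, bundle.data)

result = bundle.select_where(is_even, inplace=True)
print(result, bundle.data)
```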

Bug fixes

  • cdm_reader_mapper.metmetpy: set deck keys from ??? to d??? in icoads json files which makes values accessible again (#238).

  • cdm_reader_mapper.metmetpy: set imma1 to icoads and immt to gcc in icoads/gcc json files which makes properties accessible again (#238).

  • DataBundle.copy function now makes a real deepcopy of DataBundle object (#248).

  • correct key index->section for self.df.attrs in open_netcdf (#252)

  • cdm_reader_mapper.map_model: return null_label if conversion fails (#242)

  • keep indexes during duplicate check (#242)

2.0.1 (2025-02-25)

Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Joseph Siddons (@jtsiddons)

Announcements

This release drops support for Python 3.9 and adds support for Python 3.13 (#228, #229)

New features and enhancements

  • add environment.yml file (#229)

  • cdm_reader_mapper now separates the optional dependencies into dev and docs recipes (#232).

    • $ python -m pip install cdm_reader_mapper # Install minimum dependency version

    • $ python -m pip install cdm_reader_mapper[dev] # Install optional development dependencies in addition

    • $ python -m pip install cdm_reader_mapper[docs] # Install optional dependencies for the documentation in addition

    • $ python -m pip install cdm_reader_mapper[all] # Install all the above for complete dependency version

Internal changes

  • GitHub workflow for testing_suite now uses uv for environment management, replacing micromamba (#228)

  • rename ci/requirements to CI and tidy up requirements/dependencies (#229)

2.0.0 (2025-02-14)

Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Joseph Siddons (@jtsiddons)

New features and enhancements

  • New core DataBundle object including callable cdm_mapper, metmetpy and operations methods (#84, #188, #197)

  • Update readthedocs documentation (#191, #197)

  • new function: write_data to write MDF data and validation mask according to write_tables for writing CDM tables (#201)

  • new function: read_data to read MDF data and validation mask according to read_tables for reading CDM tables (#201)

  • new property: DataBundle.encoding (#222)

  • add overwrite option to some DataBundle method functions (#224)

Breaking changes

  • cdm_mapper: map_model returns pandas.DataFrame instead of CDM dictionary (#189)

  • cdm_mapper: rename function cdm_to_ascii to write_tables (#182, #185)

  • cdm_mapper: update parameter names and list of functions read_tables and write_tables (#185)

  • main cdm_mapper, mdf_reader and duplicates modules are directly callable from cdm_reader_mapper (#188)

  • new list of imported submodules: [map_model, cdm_tables, read_tables, write_tables, duplicate_check and read_mdf] (#188)

  • removed list of imported submodules: [cdm_mapper, common, mdf_reader, metmetpy, operations] (#188)

  • remove imported submodules from cdm_mapper, mdf_reader (#188)

  • read_tables: returning DataBundle object (#188)

  • read_tables: resulting dataframe always includes multi-indexed columns (#188)

  • duplicates is now a direct submodule of cdm_reader_mapper (#188)

  • import read function from mdf_reader.read as read_mdf (#188)

  • read_mdf: returning DataBundle object (#188)

  • read_mdf: remove parameter out_path to dump attribute information on disk (#201)

  • move function open_code_table from common.json_dict to cdm_mapper.codes.codes (#221)

  • move operations to common (#224)

  • cdm_mapper: rename table_writer to writer and table_reader to reader (#224)

  • mdf_reader: rename write to writer and read to reader (#224)

  • metmetpy: gather correction functions to correct module and validation functions to validate module (#224)

  • DataBundle: remove properties selected, deselected, tables_dup_flagged and tables_dups_removed (#224)

Internal changes

  • cdm_mapper: move dtype conversion from write_tables to new submodule _conversions of map_model (#189)

  • cdm_mapper: rename mappings to _mapping_functions (#189)

  • cdm_mapper: move mapping functions from mapper to new submodule _mappings (#189)

  • cdm_mapper: save utility functions from table_reader.py and table_writer.py to _utilities.py (#185)

  • reduce complexity of several functions (#25, #200):

    • mdf_reader.read.read

    • mdf_reader.validate.validate

    • mdf_reader.utils.decoders.signed_overpunch

    • cdm_mapper._mappings._mapping

    • metmetpy.station_id.validate.validate

  • split mdf_reader.utils.auxiliary into mdf_reader.utils.filereader, mdf_reader.utils.configurator and mdf_reader.utils.utilities (#25, #200)

  • simplify cdm_mapper.read_tables function (#192)

  • mdf_reader: Refactored Configurator class, Configurator.open_pandas method, to handle looping through rows (#208, #210)

  • mdf_reader: Refactored Configurator class, Configurator.open_data method, to avoid creating a pre-validation missing_value mask (#216)

  • mdf_reader: move validate to utils.validators (#216)

  • mdf_reader: no need for multi-column key codes (e.g. ("core", "VS")) (#221)

  • mdf_reader.utils.validator: simplify function code_validation (#221)

  • cdm_mapper.codes.common: convert range-key properties to list (#221)

  • testing_suite: new chunksize test with icoads_r300_d721 (#222)

  • mdf_reader, cdm_mapper: use model-depending encoding while writing data on disk (#222)

  • code restructuring (#224)

  • remove unused functions and methods (#224)

Bug fixes

  • Solve SettingWithCopyWarning (#151, #184)

  • mdf_reader: utils.converters.decode returns values not only None (#214)

  • mdf_reader: solving misleading reading due to German “umlauts” (#212, #214, #222)

1.0.2 (2024-11-13)

Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer)

Announcements

  • New PyPI classifiers:

    • Development Status :: 5 - Production/Stable

    • Intended Audience :: Science/Research

    • License :: OSI Approved :: Apache Software License

    • Operating System :: OS Independent

1.0.1 (2024-11-08)

Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer)

Announcements

  • set package version to v1.0.1

1.0.0 (2024-11-08)

Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer)

Announcements

  • Final version used for GLAMOD marine processing release 7.0

Bug fixes

  • cdm_mapper: Two reports that describe each other as best duplicates are not flagged as duplicates (DupDetect) (#149)

  • cdm_mapper: Reindex only if null values available (DupDetect) (#153)

0.4.3 (2024-10-23)

Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer)

Announcements

  • First release on pypi (#17)

  • First release on zenodo (#18)

0.4.2 (2024-10-23)

Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer)

Announcements

  • Testing first release on pypi (#17)

  • Testing first release on zenodo (#18)

0.4.1 (2024-10-23)

Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer)

Announcements

  • Testing first release on pypi (#17)

  • Testing first release on zenodo (#18)

0.4.0 (2024-10-23)

Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Joseph Siddons (@jtsiddons)

Announcements

  • Now under Apache v2.0 license (#69)

New features and enhancements

  • common.getting_files.load_file: optionally, load data within data reference syntax (#41)

  • common.getting_files.load_file: optionally, clear cache directory (#45)

  • reworked readthedocs documentation for gathered cdm_reader_mapper package (#19, #83)

  • mdf_reader: new validation function for datetime objects (#89)

  • mdf_reader: select time period with new arguments year_init and year_end (#98)

  • cdm_mapper: duplicate check using recordlinkage (#81)

  • mdf_reader.read: optionally, set left and right time bounds (year_init and year_end) (#11, #97)

  • mdf_reader.read: optionally, set both external schema and code table paths and external schema file (#47, #111)

  • cdm_mapper: Change both columns history and report_quality during duplicate_check (#112)

  • cdm_mapper: optionally, set column names to be ignored while duplicate check (#115)

  • cdm_mapper: optionally, set offset values for duplicate_check (#119)

  • cdm_mapper: optionally, set column entries to be ignored while duplicate_check (#119)

  • cdm_mapper: add both column names station_speed and station_course to default duplicate check list (#119)

  • cdm_mapper: optionally, re-index data in ascending order according to the number of nulls in each row (#119)

Breaking changes

  • set chunksize from 10000 to 3 in testing suite (#35)

  • cdm_mapper: read header column location_quality from (c1, LZ) and set fill_value to 0 (#36, #37)

  • cdm_mapper: set default value of header column report_quality to 2 (#36, #37)

  • reading C-RAID data: set decimal places according to input file data precision (#60)

  • always convert data types of both int and float in schemas into default data types (#59, #60)

  • cdm_mapper.map_model: call function without input parameter data_atts (#66, #67)

  • decimal_places information is moved from mdf_reader.schema to cdm_mapper.tables; decimal_places in user-given schemas will be ignored (#66, #67)

  • cdm_mapper does not need any attribute information from mdf_reader (#66, #67)

  • cdm_mapper: map ICOADS wind direction data (361 -> 0; 362 -> np.nan) (#82)

  • cdm_mapper: set fill_value to UNKNOWN for C-RAID’s primary_station_id (#93)

  • cdm_mapper: map C-RAID quality flags to CDM quality flags (#94)

  • mdf_reader: summarize schema and code tables (#11, #97)

  • mdf_reader: rename c_raid to craid, gcc_immt to gcc and imma1 to icoads (#11, #97)

  • cdm_mapper: summarize tables and code tables (#11, #97)

  • cdm_mapper: rename c_raid to craid and gcc_mapping to gcc (#11, #97)

  • metmetpy: rename immt to gcc and imma to icoads (#11, #97)

  • cdm_mapper.map_model: use standardized imodel_name as <data_model>_<release>_<deck> (e.g. icoads_r300_d701) (#11, #97)

  • mdf_reader.read: use standardized imodel_name as <data_model>_<release>_<deck> (e.g. icoads_r300_d701) (#11, #97)

  • mdf_reader: (core, VS) set column_type to key for all ICOADS decks (#11, #97)

  • cdm_mapper: rename pub47_noc mapping to pub47 (#102)

  • Note for each function call: data_model is renamed to imodel, e.g. imodel=icoads_r300_d704 (#103)

  • cdm_mapper.map_model: call with (data, imodel=imodel) (#103)

  • mdf_reader.read: call with (source, imodel=imodel) (#103)

  • Re-order arguments to mdf_reader.validate, and create argument for ext_table_path (#105)

  • operations: delete corrections module (#104)

  • cdm_mapper: duplicate check is available for header table only (#115)

  • cdm_mapper: set report_quality to 1 for bad duplicates (#115)

  • cdm_mapper: set default primary_station_id to 4 for C-RAID mapping (#117, #121)

  • renamed some element names in icoads_r300_d730 schema for consistency (InsName to InstName, InsPlace to InstPlace, InsLand to InstLand, No_data_entry to NumArchiveSet) (#110)
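
The standardized imodel naming scheme introduced above, <data_model>_<release>_<deck>, can be taken apart with a few lines of Python. This is a sketch only: parse_imodel and the IModel tuple are hypothetical names, and the helper assumes the full three-part form, whereas the package itself may accept other forms.

```python
from typing import NamedTuple


class IModel(NamedTuple):
    """Hypothetical container for the three parts of an imodel name."""

    data_model: str
    release: str
    deck: str


def parse_imodel(name: str) -> IModel:
    """Split a name of the form <data_model>_<release>_<deck>."""
    parts = name.split("_")
    if len(parts) != 3:
        # Assumption of this sketch: only the full three-part form is handled.
        raise ValueError(f"expected <data_model>_<release>_<deck>, got {name!r}")
    return IModel(*parts)


# Example name taken from the entries above.
imodel = parse_imodel("icoads_r300_d701")
print(imodel.data_model, imodel.release, imodel.deck)
```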

Internal changes

Bug fixes

  • indexing working with user-given chunksize (#35)

  • fix reading of custom schema in mdf_reader.read (#40)

  • ensure format schema field for delimited files is passed correctly, avoiding "...Please specify either format or field_layout in your header schema..." error (#40)

  • avoid the loss of data precision caused by data type conversion by using the default data types of both int and float (#59, #60)

  • reading C-RAID data: adjust datetime formats to read dates into MDFFileReader (#60)

  • ensure external code tables are used when using an external schema in mdf_reader.read (#105)

  • update readme and example Jupyter notebooks to #103 (#110)

  • restructure CLIWOC_datamodel Jupyter notebook to add an example of data model construction (#110)

  • remove create_data_model.ipynb example Jupyter notebook (#110)

0.3.0 (2024-05-17)

Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Joseph Siddons (@jtsiddons)

New features and enhancements

  • mdf_reader: read C-RAID netCDF buoy data (#13, #24, #28)

  • adding both GCC IMMT and C-RAID netCDF data to test_data (#24, #28)

  • cdm_mapper: adding C-RAID mapping and code tables (#13, #28)

  • cdm_mapper: add load_tables to __init__.py (#32)

Breaking changes

Internal changes

  • do not differentiate between tuple and single column names (#24)

  • metmetpy: Do not raise errors if validate_datetime, correct_datetime, correct_pt and/or validate_id do not find any entries (#24)

  • get rid of warnings (#9, #27)

  • adding python 3.12 to testing suite (#29)

  • set time out for testing suite to 10 minutes (#29)

Bug fixes

  • cdm_mapper: set debugging logger into if statement (#24)

  • cdm_mapper: do not use code table qc_flag with report_id (#24)

  • metmetpy: fixing ICOADS 30000 NRT functions for pandas>=2.2.0 (#31)

  • cdm_mapper.read_tables: if table not available return empty pd.DataFrame (#32)

0.2.0 (2024-03-15)

Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Joseph Siddons (@jtsiddons)

Breaking changes

  • move converters and decoders from common to mdf_reader/utils (#3)

  • delete redundant functions from cdm_reader_mapper.common

  • cdm_reader_mapper: import common (__init__.py)

  • remove unused modules from metmetpy

  • cdm_reader_mapper.mdf_reader split data_models into code_tables and schema

  • logging: Allow for use of log file (#6)

  • cannot use as command-line tool anymore (#22)

  • outsource input and result data to cdm-testdata (#16, #21)

Internal changes

  • adding tests to cdm_reader_mapper testing suite (#12, #2, #20, #22)

  • adding testing result data (#4)

  • use slugify instead of unidecode for licensing reasons

  • remove pip install instruction (#2)

  • HISTORY.rst has been renamed CHANGES.rst, to follow xclim-like conventions (#7).

  • speed up mapping functions with swifter (#4)

  • mdf_reader: adding auxiliary functions and classes (#4)

  • mdf_reader: read tables line-by-line (#20)

Bug fixes

  • Fixed an issue with missing conda dependencies in the cdm_reader_mapper documentation (#14)

0.1.0 (2024-01-16)

Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer)

Breaking changes

Internal changes

  • make use of pre-commit

  • prepare for pandas>=2.1.0

  • use setuptools_scm for automatic updating of version numbers