Changelog#
2.4.1 (2026-04-16)#
Contributor to this version: Ludwig Lierhammer (@ludwiglierhammer)
New features and enhancements#
Breaking changes#
cdm_mapper: update element names in MAROB CDM mapping tables (#393)
cdm_mapper.utils.mapping_functions: change default MAROB datetime string format to “%Y-%m-%dT%H:%M:%S” (#393)
cdm_mapper: keep pd.NA values and do not convert them to strings (#414)
test_data: load parquet files instead of csv files (#410, #414)
mdf_mapper / cdm_mapper: default file name extension is “pq” while reading and writing files (#414)
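To illustrate the new default MAROB datetime string format listed above, here is a minimal sketch using only the Python standard library. It is not the package's own mapping code, and the timestamp is a made-up value.

```python
from datetime import datetime

# Format string quoted in the changelog entry above.
MAROB_DATETIME_FORMAT = "%Y-%m-%dT%H:%M:%S"

# Hypothetical timestamp, used only for illustration.
stamp = datetime(2026, 4, 16, 12, 30, 0)

# Format the timestamp and parse it back with the same format string.
formatted = stamp.strftime(MAROB_DATETIME_FORMAT)
parsed = datetime.strptime(formatted, MAROB_DATETIME_FORMAT)

print(formatted)
```

The format round-trips without loss at a resolution of one second.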
Bug fixes#
Internal changes#
cdm_mapper.utils.mapping_functions: delete function convert_to_decimal (#393)
2.4.0 (2026-04-01)#
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Jan Marius Willruth (@JanWillruth)
New features and enhancements#
cdm_mapper.utils.mapping_functions: Add function gdac_pressure in anticipation of moving conversion steps to the mapper in the future. (#350)
mdf_reader/cdm_mapper: optionally, convert data types to strings when reading and writing data from/to disk (#401)
mdf_reader/cdm_mapper: optionally, convert data types from strings when reading and writing data from/to disk (#401)
Breaking changes#
mdf_reader: Update and rename GDAC variable names (schemas/gdac) and code tables (codes/gdac) to align with current standards. (#341, #350)
cdm_mapper: (#350)
Update and rename GDAC variable names in tables/gdac.
Fix gdac_latitude and gdac_longitude (utils/mapping_functions.py) not being used in observations.json
mdf_reader/cdm_mapper: use parquet as default instead of csv when reading and writing data from/to disk (#401)
cdm_mapper: do not convert data types to strings while mapping to the CDM (#398, #401)
cdm_mapper: set default decimal_places from 0 to 1 for location_accuracy, report_time_accuracy, station_speed and station_course (#401)
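As a rough illustration of what changing decimal_places from 0 to 1 means for a rendered value, consider the sketch below. The helper format_value and the sample value are hypothetical and do not reflect how cdm_mapper applies decimal_places internally.

```python
def format_value(value: float, decimal_places: int) -> str:
    """Render a value with a fixed number of decimal places (illustrative helper)."""
    return f"{value:.{decimal_places}f}"


# Hypothetical station_speed value, used only for illustration.
station_speed = 12.34

old_style = format_value(station_speed, 0)  # previous default: 0 decimal places
new_style = format_value(station_speed, 1)  # new default: 1 decimal place

print(old_style, new_style)
```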
Bug fixes#
cdm_mapper.mapper.map_model: write data columns to df._attrs instead of df.attrs to avoid crashing class methods (#390, #391)
cdm_mapper.utils.mapping_functions: Change method_b in mapping_functions.py to work with both str and int. (#350)
cdm_mapper.map_models: write columns directly as an attribute to result to avoid crashing further DataFrame methods (#394, #397)
2.3.0 (2026-03-12)#
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Jan Marius Willruth (@JanWillruth)
New features and enhancements#
mdf_reader.read_data now supports chunking (#360)
read and write both parquet and feather files including new parameter data_format (#353, #363):
mdf_reader.read_data
mdf_reader.write_data
cdm_mapper.read_tables
cdm_mapper.write_tables
introduce ParquetStreamReader to replace pd.io.parsers.TextFileReader (#8, #348)
cdm_reader.map_model now supports both pd.DataFrame and ParquetStreamReader as output (#348)
common.replace_columns now supports both pd.DataFrame and ParquetStreamReader as output (#348)
cdm_mapper.utils.mapping_functions: new mapping function convert_to_decimal (#370)
test_data: add MAROB test data (#370)
mdf_reader.read_data: new parameter “delimiter” (#370)
cdm_mapper.map_model’s output now has attribute “attrs” where columns are stored (#379)
ParquetStreamReader now supports item assignment (#383)
ParquetStreamReader now works with both list and tuple as input data (#383)
Breaking changes#
DataBundle.stack_v and DataBundle.stack_h only support pd.DataFrames as input, otherwise a ValueError is raised (#360)
set default for extension from psv to specified data_format (#363):
cdm_mapper.read_tables
cdm_mapper.write_tables
set default for extension from csv to specified data_format in mdf_reader.write_data (#363)
mdf_reader.read_data: save dtypes in return DataBundle as pd.Series not dict (#363)
cdm_reader_mapper now raises errors instead of logging them (#348)
DataBundle now converts all iterables of pd.DataFrame/pd.Series to ParquetStreamReader when initialized (#348)
all main functions in common.select now return a tuple of 4 (selected values, rejected values, original indexes of selected values, original indexes of rejected values) (#348)
move ParquetStreamReader and all corresponding methods to common.iterables to handle chunking outside of mdf_reader/cdm_mapper/core/metmetpy (#349, #348)
cdm_mapper.read_tables: if “suffix” is None no suffix is selected instead of the wildcard “*” (#379)
ParquetStreamReader.empty now is a property not a class method (#379)
cdm_mapper.utils.mapping_functions.string_add no longer has parameters zfill_col and zfill (#383)
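The four-element return value described above for the common.select functions can be pictured with a small, self-contained sketch. The function split_by_predicate is a hypothetical stand-in working on plain lists. It is not the package's API, which operates on pandas objects.

```python
from typing import Callable, Sequence


def split_by_predicate(values: Sequence, predicate: Callable[[object], bool]):
    """Return (selected, rejected, selected indexes, rejected indexes)."""
    selected, rejected = [], []
    selected_idx, rejected_idx = [], []
    for idx, value in enumerate(values):
        if predicate(value):
            selected.append(value)
            selected_idx.append(idx)
        else:
            rejected.append(value)
            rejected_idx.append(idx)
    return selected, rejected, selected_idx, rejected_idx


# Hypothetical input data: keep strictly positive values.
selected, rejected, sel_idx, rej_idx = split_by_predicate(
    [3, -1, 7, 0], lambda v: v > 0
)
print(selected, rejected, sel_idx, rej_idx)
```

Returning the original indexes alongside the values lets a caller apply the same split to a second, aligned object such as a validation mask.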
Bug fixes#
Internal changes#
re-work internal structure for more readability and better performance (#360)
use pre-defined Literal constants in cdm_reader_mapper.properties (#363)
mdf_reader.utils.utilities.read_csv: rename parameter columns to column_names (#363)
introduce post-processing decorator that handles both pd.DataFrame and ParquetStreamReader (#348)
cdm_mapper.mapper._map_data_model now returns a tuple of DataFrame and columns (#379)
delete unused function cdm_mapper.utils.mapping_functions.marob_location_quality (#383)
delete unreachable code snippets (#383)
2.2.1 (2026-01-23)#
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer)
Bug fixes#
cdm_reader_mapper.cdm_mapper: set indexes to input data indexes when setting default values (#356)
2.2.0 (2026-01-23)#
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Joseph Siddons (@jtsiddons)
Announcements#
This release adds support for Python 3.14 (#339)
New features and enhancements#
Breaking changes#
Internal changes#
rename test data class from test_data to TestData (#327)
update .gitignore (#324)
update and add docstrings for multiple functions (#324)
cdm_reader_mapper.cdm_mapper: update mapping functions for more readability (#324)
cdm_reader_mapper.cdm_mapper: introduce some helper functions (#324)
cdm_reader_mapper.cdm_mapper: split map_and_convert into multiple helper functions (#333, #343)
replace many os functions with pathlib.Path (#345)
re-work mdf_reader (#334, #345)
remove reader.MDFFileReader class
remove utils.configurator module
remove both utils.decoder and mdf_reader.utils.converter modules
introduce utils.parser module: bunch of functions to parse input data into MDF data
introduce utils.convert_and_decode: make converter and decoder functions more modular
make utils.validator module more modular
utils.filereader.FileReader uses utils.parser function for parsing
move many helper functions to utils.utilities
serialize schemas.schemas module
add type hints and docstrings to mdf_reader (#345)
add unit tests for mdf_reader module to testing suite (#345)
Bug fixes#
2.1.1 (2025-10-21)#
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer), Joseph Siddons (@jtsiddons) and Jan Marius Willruth (@JanWillruth)
New features and enhancements#
add optional argument encoding to cdm_reader_mapper.read_mdf and cdm_reader_mapper.read_data, which overrides the default value set by the model schema (#268, #273).
cdm_reader_mapper.mdf_reader: Added preprocessing function to convert air pressure (PPPP) in IMMT format (#287)
cdm_reader_mapper.cdm_mapper: Added mapping functions for IMMT datetime, latitude, and longitude conversions (#287)
cdm_reader_mapper.cdm_mapper: New mapping function datetime_imma_d701 for icoads_r300_d701 (#288, #295)
cdm_reader_mapper.cdm_mapper: New mapping function datetime_imma1_to_utc for mapping local midday to UTC (#288, #295)
License and Legal#
Breaking changes#
cdm_reader_mapper: Replace “gcc” with “gdac” (#287)
cdm_reader_mapper: Update gdac schemas to adhere to IMMT-5 documentation (#287)
cdm_reader_mapper: combine icoads_r300_d701_type1 and icoads_r300_d701_type2 test and result data to icoads_r300_d701 (#288, #295)
cdm_reader_mapper.cdm_mapper: combine icoads_r300_d701_type1 and icoads_r300_d701_type2 mapping tables to icoads_r300_d701 (#288, #295)
cdm_reader_mapper.read: Allow strings as input for cdm_subset (#281)
cdm_reader_mapper.cdm_mapper: Remove timestamps and/or previous history information in column history (#281)
cdm_reader_mapper.DataBundle: Set empty pd.DataFrames as defaults for both data and mask (#281)
cdm_reader_mapper.mdf_reader: read drifter numbers as strings not as integers with C-RAID (#281)
Internal changes#
Bug fixes#
2.1.0 (2025-04-08)#
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Joseph Siddons (@jtsiddons)
New features and enhancements#
implement both wrapper functions read and write that call the appropriate function based on mode argument (#238):
mode == “mdf”; calls cdm_reader_mapper.read_mdf
mode == “data”; calls cdm_reader_mapper.read_data or cdm_reader_mapper.write_data
mode == “tables”; calls cdm_reader_mapper.read_tables or cdm_reader_mapper.write_tables
optionally, call cdm_reader_mapper.read_tables with either source file or source directory path (#238).
apply attribute to DataBundle.data if attribute is not defined in DataBundle (#248).
apply pandas functions directly to DataBundle.data by calling DataBundle.<pandas-func> (#248).
make DataBundle support item assignment for DataBundle.data (#248).
optionally, apply selections to DataBundle.mask in DataBundle.select_* functions (#248).
cdm_reader.reader.read_tables: optionally, set null_label (#242)
new method function: DataBundle.select_where_all_false (#242)
new method functions: DataBundle.split_* which split a DataBundle into two new DataBundles containing data selected and rejected after user-defined selection criteria (#242)
DataBundle.split_by_boolean_true
DataBundle.split_by_boolean_false
DataBundle.split_by_column_entries
DataBundle.split_by_index
implement pandas indexer like iloc for not chunked data (#242)
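The mode-based dispatch of the read wrapper described above can be sketched as follows. The three reader functions here are hypothetical stubs that only echo their input, and the default mode="mdf" is an assumption made for this sketch. The real functions live in cdm_reader_mapper and have their own signatures.

```python
# Hypothetical stand-ins for the real reader functions; they only echo their input.
def read_mdf(source, **kwargs):
    return ("mdf", source)


def read_data(source, **kwargs):
    return ("data", source)


def read_tables(source, **kwargs):
    return ("tables", source)


# Map each supported mode to the function that handles it.
_READERS = {"mdf": read_mdf, "data": read_data, "tables": read_tables}


def read(source, mode="mdf", **kwargs):
    """Call the reader that matches `mode` (default value is an assumption)."""
    try:
        reader = _READERS[mode]
    except KeyError:
        raise ValueError(
            f"Unknown mode {mode!r}; expected one of {sorted(_READERS)}"
        ) from None
    return reader(source, **kwargs)


result = read("observations", mode="tables")
print(result)
```

A lookup table keeps the wrapper short and makes an unsupported mode fail with an explicit error rather than silently falling through.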
Internal changes#
Breaking changes#
remove property tables from DataBundle object. Instead, DataBundle.map_model overwrites DataBundle.data (#238).
set default overwrite values from True to False, consistent with the pandas inplace argument, and rename overwrite to inplace (#238, #248).
inplace returns None, consistent with pandas (#242)
DataBundle method functions return a DataBundle instead of a pandas.DataFrame (#248).
DataBundle.select_* functions write only selected entries to DataBundle.data and do not take other list entries from common.select_* function returns into account (#248).
select functions do not reset indexes by default (#242)
rename DataBundle.select_* functions:
DataBundle.select_true -> DataBundle.select_where_all_boolean
DataBundle.select_from_list -> DataBundle.select_where_entry_isin
DataBundle.select_from_index -> DataBundle.select_where_index_isin
rename cdm_reader_mapper.common.select_* functions and make them return a tuple of selected and rejected data after user-defined selection criteria (#242):
select_true -> split_by_boolean_true
select_from_list -> split_by_column_entries
select_from_index -> split_by_index
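The pandas-style inplace convention mentioned above (default False, and None returned when inplace=True) can be pictured with a toy class. Bundle and select_positive are hypothetical names invented for this sketch. They are not part of cdm_reader_mapper.

```python
import copy


class Bundle:
    """Toy container used only to illustrate the inplace convention."""

    def __init__(self, data):
        self.data = list(data)

    def select_positive(self, inplace=False):
        selected = [value for value in self.data if value > 0]
        if inplace:
            # Modify this object and return None, as pandas does.
            self.data = selected
            return None
        # Otherwise leave this object untouched and return a modified copy.
        new_bundle = copy.deepcopy(self)
        new_bundle.data = selected
        return new_bundle


bundle = Bundle([1, -2, 3])
copy_result = bundle.select_positive()                 # new object, original unchanged
original_after_copy = list(bundle.data)
inplace_result = bundle.select_positive(inplace=True)  # returns None, original changed
```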
Bug fixes#
cdm_reader_mapper.metmetpy: set deck keys from ??? to d??? in icoads json files which makes values accessible again (#238).
cdm_reader_mapper.metmetpy: set imma1 to icoads and immt to gcc in icoads/gcc json files which makes properties accessible again (#238).
DataBundle.copy function now makes a real deepcopy of DataBundle object (#248).
correct key index->section for self.df.attrs in open_netcdf (#252)
cdm_reader_mapper.map_model: return null_label if conversion fails (#242)
keep indexes during duplicate check (#242)
2.0.1 (2025-02-25)#
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Joseph Siddons (@jtsiddons)
Announcements#
This release drops support for Python 3.9 and adds support for Python 3.13 (#228, #229)
New features and enhancements#
add environment.yml file (#229)
cdm_reader_mapper now separates the optional dependencies into dev and docs recipes (#232).
$ python -m pip install cdm_reader_mapper # Install minimum dependency version
$ python -m pip install cdm_reader_mapper[dev] # Install optional development dependencies in addition
$ python -m pip install cdm_reader_mapper[docs] # Install optional dependencies for the documentation in addition
$ python -m pip install cdm_reader_mapper[all] # Install all the above for complete dependency version
Internal changes#
2.0.0 (2025-02-14)#
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Joseph Siddons (@jtsiddons)
New features and enhancements#
New core DataBundle object including callable cdm_mapper, metmetpy and operations methods (#84, #188, #197)
new function: write_data to write MDF data and validation mask according to write_tables for writing CDM tables (#201)
new function: read_data to read MDF data and validation mask according to read_tables for reading CDM tables (#201)
new property: DataBundle.encoding (#222)
add overwrite option to some DataBundle method functions (#224)
Breaking changes#
cdm_mapper: map_model returns pandas.DataFrame instead of CDM dictionary (#189)
cdm_mapper: rename function cdm_to_ascii to write_tables (#182, #185)
cdm_mapper: update parameter names and list of functions read_tables and write_tables (#185)
main cdm_mapper, mdf_reader and duplicates modules are directly callable from cdm_reader_mapper (#188)
new list of imported submodules: [map_model, cdm_tables, read_tables, write_tables, duplicate_check and read_mdf] (#188)
removed list of imported submodules: [cdm_mapper, common, mdf_reader, metmetpy, operations] (#188)
remove imported submodules from cdm_mapper, mdf_reader (#188)
read_tables: returning DataBundle object (#188)
read_tables: resulting dataframe always includes multi-indexed columns (#188)
duplicates is now a direct submodule of cdm_reader_mapper (#188)
import read function from mdf_reader.read as read_mdf (#188)
read_mdf: returning DataBundle object (#188)
read_mdf: remove parameter out_path to dump attribute information on disk (#201)
move function open_code_table from common.json_dict to cdm_mapper.codes.codes (#221)
operations to common (#224)
cdm_mapper: rename table_writer to writer and table_reader to reader (#224)
mdf_reader: rename write to writer and read to reader (#224)
metmetpy: gather correction functions to correct module and validation functions to validate module (#224)
DataBundle: remove properties selected, deselected, tables_dup_flagged and tables_dups_removed (#224)
Internal changes#
cdm_mapper: dtype conversion from write_tables to new submodule _conversions of map_model (#189)
cdm_mapper: rename mappings to _mapping_functions (#189)
cdm_mapper: mapping functions from mapper to new submodule _mappings (#189)
cdm_mapper: save utility functions from table_reader.py and table_writer.py to _utilities.py (#185)
reduce complexity of several functions (#25, #200):
mdf_reader.read.read
mdf_reader.validate.validate
mdf_reader.utils.decoders.signed_overpunch
cdm_mapper._mappings._mapping
metmetpy.station_id.validate.validate
split mdf_reader.utils.auxiliary into mdf_reader.utils.filereader, mdf_reader.utils.configurator and mdf_reader.utils.utilities (#25, #200)
simplify cdm_mapper.read_tables function (#192)
mdf_reader: Refactored Configurator class, Configurator.open_pandas method, to handle looping through rows (#208, #210)
mdf_reader: Refactored Configurator class, Configurator.open_data method, to avoid creating a pre-validation missing_value mask (#216)
mdf_reader: move validate to utils.validators (#216)
mdf_reader: no need for multi-column key codes (e.g. ("core", "VS")) (#221)
mdf_reader.utils.validator: simplify function code_validation (#221)
cdm_mapper.codes.common: convert range-key properties to list (#221)
testing_suite: new chunksize test with icoads_r300_d721 (#222)
mdf_reader, cdm_mapper: use model-depending encoding while writing data on disk (#222)
code restructuring (#224)
remove unused functions and methods (#224)
Bug fixes#
1.0.2 (2024-11-13)#
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer)
Announcements#
New PyPI Classifiers:
Development Status :: 5 - Production/Stable
Intended Audience :: Science/Research
License :: OSI Approved :: Apache Software License
Operating System :: OS Independent
1.0.1 (2024-11-08)#
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer)
Announcements#
set package version to v1.0.1
1.0.0 (2024-11-08)#
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer)
Announcements#
Final version used for GLAMOD marine processing release 7.0
Bug fixes#
0.4.3 (2024-10-23)#
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer)
Announcements#
0.4.2 (2024-10-23)#
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer)
Announcements#
0.4.1 (2024-10-23)#
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer)
Announcements#
0.4.0 (2024-10-23)#
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Joseph Siddons (@jtsiddons)
Announcements#
Now under Apache v2.0 license (#69)
New features and enhancements#
common.getting_files.load_file: optionally, load data within data reference syntax (#41)
common.getting_files.load_file: optionally, clear cache directory (#45)
reworked readthedocs documentation for gathered cdm_reader_mapper package (#19, #83)
mdf_reader: new validation function for datetime objects (#89)
mdf_reader: select time period with new arguments year_init and year_end (#98)
cdm_mapper: duplicate check using recordlinkage (#81)
mdf_reader.read: optionally, set left and right time bounds (year_init and year_end) (#11, #97)
mdf_reader.read: optionally, set both external schema and code table paths and external schema file (#47, #111)
cdm_mapper: Change both columns history and report_quality during duplicate_check (#112)
cdm_mapper: optionally, set column names to be ignored while duplicate check (#115)
cdm_mapper: optionally, set offset values for duplicate_check (#119)
cdm_mapper: optionally, set column entries to be ignored while duplicate_check (#119)
cdm_mapper: add both column names station_speed and station_course to default duplicate check list (#119)
cdm_mapper: optionally, re-index data in ascending order according to the number of nulls in each row (#119)
Breaking changes#
set chunksize from 10000 to 3 in testing suite (#35)
cdm_mapper: read header column location_quality from (c1, LZ) and set fill_value to 0 (#36, #37)
cdm_mapper: set default value of header column report_quality to 2 (#36, #37)
reading C-RAID data: set decimal places according to input file data precision (#60)
always convert data types of both int and float in schemas into default data types (#59, #60)
cdm_mapper.map_model: call function without input parameter data_atts (#66, #67)
decimal_places information is moved from mdf_reader.schema to cdm_mapper.tables; decimal_places in user-given schemas will be ignored (#66, #67)
cdm_mapper does not need any attribute information from mdf_reader (#66, #67)
cdm_mapper: map ICOADS wind direction data (361 -> 0; 362 -> np.nan) (#82)
cdm_mapper: set fill_value to UNKNOWN for C-RAID’s primary_station_id (#93)
cdm_mapper: map C-RAID quality flags to CDM quality flags (#94)
mdf_reader: rename c_raid to craid, gcc_immt to gcc and imma1 to icoads (#11, #97)
cdm_mapper: rename c_raid to craid and gcc_mapping to gcc (#11, #97)
cdm_mapper.map_model: use standardized imodel_name as <data_model>_<release>_<deck> (e.g. icoads_r300_d701) (#11, #97)
mdf_reader.read: use standardized imodel_name as <data_model>_<release>_<deck> (e.g. icoads_r300_d701) (#11, #97)
mdf_reader: (core, VS) set column_type to key for all ICOADS decks (#11, #97)
cdm_mapper: rename pub47_noc mapping to pub47 (#102)
Note by each function call: rename data_model into imodel e.g. imodel=icoads_r300_d704 (#103)
cdm_mapper.map_model: call with (data, imodel=imodel) (#103)
mdf_reader.read: call with (source, imodel=imodel) (#103)
Re-order arguments to mdf_reader.validate, and create argument for ext_table_path (#105)
operations: delete corrections module (#104)
cdm_mapper: duplicate check is available for header table only (#115)
cdm_mapper: set report_quality to 1 for bad duplicates (#115)
cdm_mapper: set default primary_station_id to 4 for C-RAID mapping (#117, #121)
renamed some element names in icoads_r300_d730 schema for consistency (InsName to InstName, InsPlace to InstPlace, InsLand to InstLand, No_data_entry to NumArchiveSet) (#110)
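The standardized imodel naming scheme <data_model>_<release>_<deck> introduced above can be illustrated with a small parser. The names IModel and parse_imodel are hypothetical, and treating release and deck as optional is an assumption made for this sketch. The package's own parsing may differ.

```python
from typing import NamedTuple, Optional


class IModel(NamedTuple):
    data_model: str
    release: Optional[str]
    deck: Optional[str]


def parse_imodel(imodel: str) -> IModel:
    """Split an imodel name such as 'icoads_r300_d701' into its parts."""
    parts = imodel.split("_")
    # Assumption: one to three non-empty parts are valid.
    if not 1 <= len(parts) <= 3 or not all(parts):
        raise ValueError(f"Unexpected imodel name: {imodel!r}")
    # Pad missing release/deck with None so the tuple always has three fields.
    padded = parts + [None] * (3 - len(parts))
    return IModel(*padded)


full = parse_imodel("icoads_r300_d701")
short = parse_imodel("craid")
print(full, short)
```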
Internal changes#
replace deprecated datetime.datetime.utcnow() with datetime.datetime.now(datetime.UTC) (see: python/cpython#103857) (#39, #43)
make use of cdm-testdata release v2024.06.07 glamod/cdm-testdata (#44, #45)
migration to setup-micromamba: mamba-org/provision-with-micromamba (#48)
update actions to use Node.js 20: https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#example-using-versioned-actions (#48)
mdf_reader.auxiliary.utils: rename variable for missing values to missing_values (#56)
add pre-commit hooks: codespell, pylint and vulture (#56)
use pytest.parametrize for testing suite (#61)
use ast.literal_eval instead of eval (#64)
cdm_mapper.mappings: use datetime to convert float into hours and minutes.
add FOSSA license scanning to github workflows (#80)
add cdm_reader_mapper author list including ORCID iD’s (#38, #49)
mdf_reader: replace empty strings with missing values (#89)
metmetpy: use function overwrite_data in all platform type correction functions (#89)
rename data_model into imodel (#103)
implement assertion tests for module operations (#104)
cdm_mapper: put settings for duplicate check in _duplicate_settings (#119)
cdm_mapper: use pandas.apply function instead of for loops in duplicate_check (#119)
adding some more duplicate checks to testing suite (#119)
cdm_mapper: re-adding consideration of indexes of nan values during transformation (#125)
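Two of the internal changes above, replacing datetime.datetime.utcnow() and using ast.literal_eval instead of eval, follow general Python practice and can be sketched with the standard library alone. The literal string below is a made-up example. timezone.utc is used because datetime.UTC is an alias for it that only exists from Python 3.11 onward.

```python
import ast
from datetime import datetime, timezone

# Timezone-aware replacement for the deprecated datetime.utcnow().
now_utc = datetime.now(timezone.utc)

# literal_eval only accepts Python literals (dicts, lists, numbers, strings, ...).
parsed = ast.literal_eval("{'sections': ['core', 'c1'], 'chunksize': 3}")

# Anything that is not a plain literal, such as a function call, is rejected.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True

print(now_utc.isoformat(), parsed, rejected)
```

Unlike eval, literal_eval never executes code, which is why it is the safer choice for parsing values read from files.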
Bug fixes#
indexing working with user-given chunksize (#35)
fix reading of custom schema in mdf_reader.read (#40)
ensure format schema field for delimited files is passed correctly, avoiding "...Please specify either format or field_layout in your header schema..." error (#40)
there is a loss of data precision due to data type conversion. Hence, use default data types of both int and float (#59, #60)
reading C-RAID data: adjust datetime formats to read dates into MDFFileReader (#60)
ensure external code tables are used when using an external schema in mdf_reader.read (#105)
restructure CLIWOC_datamodel Jupyter notebook to add an example of data model construction (#110)
remove create_data_model.ipynb example Jupyter notebook (#110)
0.3.0 (2024-05-17)#
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Joseph Siddons (@jtsiddons)
New features and enhancements#
Breaking changes#
Internal changes#
Bug fixes#
0.2.0 (2024-03-15)#
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Joseph Siddons (@jtsiddons)
Breaking changes#
move converters and decoders from common to mdf_reader/utils (#3)
delete redundant functions from cdm_reader_mapper.common
cdm_reader_mapper: import common (__init__.py)
remove unused modules from metmetpy
cdm_reader_mapper.mdf_reader split data_models into code_tables and schema
logging: Allow for use of log file (#6)
cannot use as command-line tool anymore (#22)
Internal changes#
adding tests to cdm_reader_mapper testing suite (#12, #2, #20, #22)
adding testing result data (#4)
use slugify instead of unidecode for licensing reasons
remove pip install instruction (#2)
HISTORY.rst has been renamed CHANGES.rst, to follow xclim-like conventions (#7).
speed up mapping functions with swifter (#4)
mdf_reader: adding auxiliary functions and classes (#4)
mdf_reader: read tables line-by-line (#20)
Bug fixes#
Fixed an issue with missing conda dependencies in the cdm_reader_mapper documentation (#14)
0.1.0 (2024-01-16)#
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer)
Breaking changes#
combine mdf_reader, cdm-mapper, pandas_operations and metmetpy
optionally: use cdm_reader_mapper as a command-line interface tool
Internal changes#
make use of pre-commit
prepare for pandas>=2.1.0
use setuptools_scm for automatic updating of version numbers