{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Mapping data from ICOADS deck 704 to the Common Data Model (CDM)\n",
"\n",
"Here we extract supplemental metadata from [ICOADSv3.0](https://icoads.noaa.gov/r3.html) stored in the [IMMA version 1](https://icoads.noaa.gov/e-doc/imma/R3.0-imma1.pdf) format.\n",
"We then map these data (including the supplemental data) to the Common Data Model (CDM) format defined in the [CDM Documentation](https://github.com/glamod/common_data_model/blob/master/cdm_latest.pdf).\n",
"\n",
"The supplementary data are mapped to the CDM using the [tables](https://github.com/glamod/cdm_reader_mapper/tree/main/cdm_reader_mapper/cdm_mapper/tables/icoads/r300/d704) and [codes](https://github.com/glamod/cdm_reader_mapper/tree/main/cdm_reader_mapper/cdm_mapper/codes/icoads/r300/d704) specific to deck 704. The generic ICOADS [tables](https://github.com/glamod/cdm_reader_mapper/tree/main/cdm_reader_mapper/cdm_mapper/tables/icoads) are used to map the common ICOADS data components."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The deck analysed here is `704`, the [US Marine Meteorological Journals Collection](https://icoads.noaa.gov/usmmj.html)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from __future__ import annotations\n",
"\n",
"import pandas as pd\n",
"\n",
"from cdm_reader_mapper import read_mdf, test_data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We first read the supplemental data from the `c99` column of the IMMA format for a subset of the data (e.g. 1878/10). For this we need the `\"icoads_r300_d704\"` schema. The convention for schema names is `\"format_version_deck\"`:\n",
"\n",
"* format/data model: \"icoads\"\n",
"* version/release: \"r300\" (release 3.0.0)\n",
"* deck: \"d704\"\n",
"\n",
"In this notebook we load the ICOADS r3.0.0 deck 704 test file as an example."
]
},
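{
"cell_type": "markdown",
"metadata": {},
"source": [
"The naming convention above can be sketched in a few lines. The helper `split_schema_name` is hypothetical and not part of `cdm_reader_mapper`; it only assumes that the three parts are joined by underscores, as in the schema name used here.\n",
"\n",
"```python\n",
"def split_schema_name(name):\n",
"    # Hypothetical helper: split 'format_version_deck' into its three parts.\n",
"    data_model, release, deck = name.split('_')\n",
"    return {'data_model': data_model, 'release': release, 'deck': deck}\n",
"\n",
"\n",
"parts = split_schema_name('icoads_r300_d704')\n",
"print(parts)\n",
"```\n",
"\n",
"A name with a different number of underscore-separated parts raises a `ValueError` in this sketch."
]
},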
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:root:Unknown column_type 'object' for column '('c8', 'PUID')'\n",
"WARNING:root:Unknown column_type 'object' for column '('c95', 'ARCR')'\n",
"WARNING:root:Unknown column_type 'object' for column '('c96', 'ARCI')'\n",
"WARNING:root:Unknown column_type 'object' for column '('c97', 'ARCE')'\n",
"WARNING:root:Unknown column_type 'object' for column '('c99_sentinel', 'BLK')'\n",
"C:\\Users\\llierham\\mobaxterm\\github_ll\\cdm_reader_mapper\\src\\cdm_reader_mapper\\mdf_reader\\utils\\validators.py:240: FutureWarning: DataFrame.applymap has been deprecated. Use DataFrame.map instead.\n",
" to_bool = data[validated_columns].applymap(convert_str_boolean)\n",
"C:\\Users\\llierham\\mobaxterm\\github_ll\\cdm_reader_mapper\\src\\cdm_reader_mapper\\mdf_reader\\utils\\validators.py:241: FutureWarning: DataFrame.applymap has been deprecated. Use DataFrame.map instead.\n",
" false_mask = to_bool.applymap(_is_false)\n",
"C:\\Users\\llierham\\mobaxterm\\github_ll\\cdm_reader_mapper\\src\\cdm_reader_mapper\\mdf_reader\\utils\\validators.py:242: FutureWarning: DataFrame.applymap has been deprecated. Use DataFrame.map instead.\n",
" true_mask = to_bool.applymap(_is_true)\n"
]
}
],
"source": [
"schema = \"icoads_r300_d704\"\n",
"\n",
"data_file_path = test_data.test_icoads_r300_d704[\"source\"] # Load the example file from the cdm_reader_mapper test data\n",
"data_bundle = read_mdf(data_file_path, imodel=schema)\n",
"data_raw = data_bundle.data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The data from the `c99` column for this deck are split into the following sections:\n",
"- c99_sentinel\n",
"- c99_journal\n",
"- c99_voyage\n",
"- c99_daily\n",
"- c99_data4\n",
"- c99_data5"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" ATTI ATTL BLK\n",
"0 99 0 None\n",
"1 99 0 None\n",
"2 99 0 None\n",
"3 99 0 None\n",
"4 99 0 None"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data_raw.c99_sentinel.head()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" sentinel reel_no journal_no frame_no ship_name journal_ed rig ship_material \\\n",
"0 1 002 0018 0003 Panay 78 01 1 \n",
"1 1 002 0018 0003 Panay 78 01 1 \n",
"2 1 002 0018 0003 Panay 78 01 1 \n",
"3 1 002 0018 0003 Panay 78 01 1 \n",
"4 1 002 0018 0003 Panay 78 01 1 \n",
"\n",
" vessel_type vessel_length vessel_beam commander country screw_paddle \\\n",
"0 1 187 37 S.P.Bray,Jr 01 3 \n",
"1 1 187 37 S.P.Bray,Jr 01 3 \n",
"2 1 187 37 S.P.Bray,Jr 01 3 \n",
"3 1 187 37 S.P.Bray,Jr 01 3 \n",
"4 1 187 37 S.P.Bray,Jr 01 3 \n",
"\n",
" hold_depth tonnage baro_type baro_height baro_cdate baro_loc \\\n",
"0 23 1190 2 14 None Bulkhead of cabin \n",
"1 23 1190 2 14 None Bulkhead of cabin \n",
"2 23 1190 2 14 None Bulkhead of cabin \n",
"3 23 1190 2 14 None Bulkhead of cabin \n",
"4 23 1190 2 14 None Bulkhead of cabin \n",
"\n",
" baro_units baro_cor thermo_mount SST_I \n",
"0 1 - .102 2 None \n",
"1 1 - .102 2 None \n",
"2 1 - .102 2 None \n",
"3 1 - .102 2 None \n",
"4 1 - .102 2 None "
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pd.options.display.max_columns = None\n",
"data_raw.c99_journal.head()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" sentinel reel_no journal_no frame_start from_city to_city\n",
"0 2 002 0018 0014 Boston Rio de Janeiro\n",
"1 2 002 0018 0014 Boston Rio de Janeiro\n",
"2 2 002 0018 0014 Boston Rio de Janeiro\n",
"3 2 002 0018 0014 Boston Rio de Janeiro\n",
"4 2 002 0018 0014 Boston Rio de Janeiro"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data_raw.c99_voyage.head()"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" sentinel reel_no journal_no frame_start frame year month day distance \\\n",
"0 3 002 0018 0014 0015 1878 10 20 NaN \n",
"1 3 002 0018 0014 0015 1878 10 20 NaN \n",
"2 3 002 0018 0014 0015 1878 10 20 NaN \n",
"3 3 002 0018 0014 0015 1878 10 20 NaN \n",
"4 3 002 0018 0014 0015 1878 10 20 NaN \n",
"\n",
" lat_deg_an lat_min_an lat_hemis_an lon_deg_an lon_min_an lon_hemis_an \\\n",
"0 <NA> <NA> None <NA> <NA> None \n",
"1 <NA> <NA> None <NA> <NA> None \n",
"2 <NA> <NA> None <NA> <NA> None \n",
"3 <NA> <NA> None <NA> <NA> None \n",
"4 <NA> <NA> None <NA> <NA> None \n",
"\n",
" lat_deg_on lat_min_on lat_hemis_on lon_deg_of lon_min_of lon_hemis_of \\\n",
"0 42 20 N 66 30 W \n",
"1 42 20 N 66 30 W \n",
"2 42 20 N 66 30 W \n",
"3 42 20 N 66 30 W \n",
"4 42 20 N 66 30 W \n",
"\n",
" current_speed current_direction \n",
"0 0.1 E \n",
"1 0.1 E \n",
"2 0.1 E \n",
"3 0.1 E \n",
"4 0.1 E "
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data_raw.c99_daily.head()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" sentinel reel_no journal_no frame_start frame year month day time_ind \\\n",
"0 4 002 0018 0014 0015 1878 10 20 1 \n",
"1 4 002 0018 0014 0015 1878 10 20 1 \n",
"2 4 002 0018 0014 0015 1878 10 20 1 \n",
"3 4 002 0018 0014 0015 1878 10 20 1 \n",
"4 4 002 0018 0014 0015 1878 10 20 1 \n",
"\n",
" hour ship_speed compass_ind ship_course_compass compass_correction \\\n",
"0 2 8.5 None EXS <NA> \n",
"1 4 8.5 None EXS <NA> \n",
"2 6 8.5 None EXS <NA> \n",
"3 8 8.0 None EXS <NA> \n",
"4 10 8.5 None EXS <NA> \n",
"\n",
" ship_course_true wind_dir_mag wind_dir_true wind_force barometer temp_ind \\\n",
"0 None WSW None 06 2960 1 \n",
"1 None WSW None 06 2960 1 \n",
"2 None W None 06 2962 1 \n",
"3 None W None 06 2964 1 \n",
"4 None W None 06 2969 1 \n",
"\n",
" attached_thermometer air_temperature wet_bulb_temperature \\\n",
"0 5.8 NaN NaN \n",
"1 5.6 NaN NaN \n",
"2 5.6 4.8 NaN \n",
"3 5.6 4.8 NaN \n",
"4 5.7 4.8 NaN \n",
"\n",
" sea_temperature present_weather clouds sky_clear sea_state \n",
"0 NaN BOC CU 5 R \n",
"1 NaN BOC SC 3 R \n",
"2 5.2 OCG SC 0 R \n",
"3 5.2 CG SC 0 R \n",
"4 5.0 BC SC 2 L "
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data_raw.c99_data4.head()"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" sentinel reel_no journal_no frame_start frame year month day time_ind \\\n",
"0 None None None None None <NA> <NA> <NA> None \n",
"1 None None None None None <NA> <NA> <NA> None \n",
"2 None None None None None <NA> <NA> <NA> None \n",
"3 None None None None None <NA> <NA> <NA> None \n",
"4 None None None None None <NA> <NA> <NA> None \n",
"\n",
" hour ship_speed compass_ind ship_course_compass blank ship_course_true \\\n",
"0 <NA> NaN None None None None \n",
"1 <NA> NaN None None None None \n",
"2 <NA> NaN None None None None \n",
"3 <NA> NaN None None None None \n",
"4 <NA> NaN None None None None \n",
"\n",
" wind_dir_mag wind_dir_true wind_force barometer temp_ind \\\n",
"0 None None None None None \n",
"1 None None None None None \n",
"2 None None None None None \n",
"3 None None None None None \n",
"4 None None None None None \n",
"\n",
" attached_thermometer air_temperature wet_bulb_temperature \\\n",
"0 NaN NaN NaN \n",
"1 NaN NaN NaN \n",
"2 NaN NaN NaN \n",
"3 NaN NaN NaN \n",
"4 NaN NaN NaN \n",
"\n",
" sea_temperature present_weather clouds sky_clear sea_state \\\n",
"0 NaN None None <NA> None \n",
"1 NaN None None <NA> None \n",
"2 NaN None None <NA> None \n",
"3 NaN None None <NA> None \n",
"4 NaN None None <NA> None \n",
"\n",
" compass_correction_ind compass_correction compass_correction_dir \n",
"0 None NaN None \n",
"1 None NaN None \n",
"2 None NaN None \n",
"3 None NaN None \n",
"4 None NaN None "
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data_raw.c99_data5.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we have separated the c99 data into its sections, we see that this deck has two sections that hold the same type of data:\n",
"\n",
"- c99_data4\n",
"- c99_data5\n",
"\n",
"Both sections use the same names for their variables. To map the correct section to the CDM, we must filter out the section that contains only NaN values.\n",
"The difficulty is that we do not know in advance which years in the time series use `c99_data4` and which use `c99_data5`.\n",
"\n",
"> Note that excluding one section only works for decks whose sections are mutually exclusive: of the sections listed above, only one appears in any given report.\n"
]
},
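{
"cell_type": "markdown",
"metadata": {},
"source": [
"The idea of dropping the all-NaN section can be sketched with plain pandas. This is an illustration on a made-up toy frame, not the code that `cdm_reader_mapper` runs internally; the helper `sections_with_data` and the toy values are assumptions.\n",
"\n",
"```python\n",
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Toy frame with two-level columns (section, variable); the values are made up.\n",
"toy = pd.DataFrame({\n",
"    ('c99_data4', 'barometer'): [2960.0, 2962.0],\n",
"    ('c99_data4', 'sea_temperature'): [np.nan, 5.2],\n",
"    ('c99_data5', 'barometer'): [np.nan, np.nan],\n",
"    ('c99_data5', 'sea_temperature'): [np.nan, np.nan],\n",
"})\n",
"\n",
"\n",
"def sections_with_data(df, sections):\n",
"    # Hypothetical helper: keep sections holding at least one non-missing value.\n",
"    return [sec for sec in sections if df[sec].notna().any().any()]\n",
"\n",
"\n",
"present = sections_with_data(toy, ['c99_data4', 'c99_data5'])\n",
"print(present)\n",
"```\n",
"\n",
"Because the check runs on the data that were actually read, it does not need to know beforehand which years use which section."
]
},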
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can now use the `\"icoads_r300_d704\"` model to map the raw data to the Common Data Model [glamod/common_data_model](https://www.github.com/glamod/common_data_model). The `map_model` method contains all the functions the model needs to convert variables to the correct units and/or specification following the [CDM Documentation](https://github.com/glamod/common_data_model/blob/master/cdm_latest.pdf).\n",
"\n",
"To run the data model we need three things:\n",
"\n",
"- raw data (the data we just read above)\n",
"- attributes of the raw data (sections and column names)\n",
"- the name of the model"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2026-05-07 08:31:39,537 - root - INFO - Initialized basic logging configuration successfully\n",
"C:\\Users\\llierham\\mobaxterm\\github_ll\\cdm_reader_mapper\\src\\cdm_reader_mapper\\cdm_mapper\\mapper.py:75: FutureWarning: 'any' with datetime64 dtypes is deprecated and will raise in a future version. Use (obj != pd.Timestamp(0)).any() instead.\n",
" list_cols = [col for col in df.columns if df[col].apply(lambda x: isinstance(x, list)).any()]\n",
"C:\\Users\\llierham\\mobaxterm\\github_ll\\cdm_reader_mapper\\src\\cdm_reader_mapper\\cdm_mapper\\mapper.py:75: FutureWarning: 'any' with datetime64 dtypes is deprecated and will raise in a future version. Use (obj != pd.Timestamp(0)).any() instead.\n",
" list_cols = [col for col in df.columns if df[col].apply(lambda x: isinstance(x, list)).any()]\n"
]
}
],
"source": [
"cdm_tables = data_bundle.map_model()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Have we succeeded in writing the data to the CDM format?\n",
"\n",
"We set out to write the following data:\n",
"\n",
"## Header section\n",
"\n",
"- Platform type and sub type\n",
"- Primary station id: original ship names\n",
"- Longitude and latitude: converted from degrees, minutes and hemisphere to decimal degrees\n",
"- Location accuracy\n",
"\n",
"## Observations tables\n",
"\n",
"- `Observations-at`: latitude, longitude and location precision\n",
"- `Observations-dpt`: latitude, longitude and location precision\n",
"- `Observations-slp`: latitude, longitude and location precision\n",
"  - z_coordinate_type: barometer height in feet converted to m\n",
"  - original units: written in the CDM code format\n",
"- `Observations-sst`: latitude, longitude and location precision\n",
"- `Observations-wbt`: latitude, longitude and location precision\n",
"- `Observations-wd`: latitude, longitude and location precision\n",
"- `Observations-ws`: latitude, longitude and location precision\n"
]
},
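{
"cell_type": "markdown",
"metadata": {},
"source": [
"The two unit conversions named above can be sketched as follows. This is a minimal illustration, not the mapper's own conversion code: `dms_to_decimal` is a hypothetical helper, the position 42°20'N, 66°30'W is taken from the `c99_daily` rows shown earlier, and the barometer height of 14 is assumed to be in feet.\n",
"\n",
"```python\n",
"def dms_to_decimal(degrees, minutes, hemisphere):\n",
"    # Hypothetical helper: degrees, minutes and hemisphere letter to signed decimal degrees.\n",
"    value = degrees + minutes / 60.0\n",
"    return -value if hemisphere.upper() in ('S', 'W') else value\n",
"\n",
"\n",
"FEET_TO_METRES = 0.3048  # exact by definition of the international foot\n",
"\n",
"lat = dms_to_decimal(42, 20, 'N')\n",
"lon = dms_to_decimal(66, 30, 'W')\n",
"baro_height_m = 14 * FEET_TO_METRES\n",
"print(round(lat, 4), lon, round(baro_height_m, 4))\n",
"```\n",
"\n",
"Southern and western hemispheres are given a negative sign, which matches the negative longitudes seen in the mapped header table."
]
},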
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" report_id region sub_region application_area observing_programme \\\n",
"0 ICOADS-300-020N16 <NA> <NA> [1, 7, 10, 11] [5, 7, 56] \n",
"1 ICOADS-300-020N1P <NA> <NA> [1, 7, 10, 11] [5, 7, 56] \n",
"2 ICOADS-300-020N25 <NA> <NA> [1, 7, 10, 11] [5, 7, 56] \n",
"3 ICOADS-300-020N2Q <NA> <NA> [1, 7, 10, 11] [5, 7, 56] \n",
"4 ICOADS-300-020N3A <NA> <NA> [1, 7, 10, 11] [5, 7, 56] \n",
"\n",
" report_type station_name station_type platform_type platform_sub_type \\\n",
"0 0 Panay 2 2 26 \n",
"1 0 Panay 2 2 26 \n",
"2 0 Panay 2 2 26 \n",
"3 0 Panay 2 2 26 \n",
"4 0 Panay 2 2 26 \n",
"\n",
" primary_station_id station_record_number primary_station_id_scheme \\\n",
"0 Panay 1 8 \n",
"1 Panay 1 8 \n",
"2 Panay 1 8 \n",
"3 Panay 1 8 \n",
"4 Panay 1 8 \n",
"\n",
" longitude latitude location_accuracy location_method location_quality \\\n",
"0 -68.410004 42.28 <NA> <NA> 0 \n",
"1 -68.029999 42.31 <NA> <NA> 0 \n",
"2 -67.639999 42.33 <NA> <NA> 0 \n",
"3 -67.290001 42.35 <NA> <NA> 0 \n",
"4 -66.900002 42.37 <NA> <NA> 0 \n",
"\n",
" crs station_speed station_course station_heading \\\n",
"0 0 4.11552 90.0 <NA> \n",
"1 0 4.11552 90.0 <NA> \n",
"2 0 4.11552 90.0 <NA> \n",
"3 0 4.11552 90.0 <NA> \n",
"4 0 4.11552 90.0 <NA> \n",
"\n",
" height_of_station_above_local_ground height_of_station_above_sea_level \\\n",
"0 0.0 0.0 \n",
"1 0.0 0.0 \n",
"2 0.0 0.0 \n",
"3 0.0 0.0 \n",
"4 0.0 0.0 \n",
"\n",
" height_of_station_above_sea_level_accuracy sea_level_datum \\\n",
"0 <NA> <NA> \n",
"1 <NA> <NA> \n",
"2 <NA> <NA> \n",
"3 <NA> <NA> \n",
"4 <NA> <NA> \n",
"\n",
" report_meaning_of_timestamp report_timestamp report_duration \\\n",
"0 2 1878-10-20 06:00:00 11 \n",
"1 2 1878-10-20 08:00:00 11 \n",
"2 2 1878-10-20 10:00:00 11 \n",
"3 2 1878-10-20 12:00:00 11 \n",
"4 2 1878-10-20 14:00:00 11 \n",
"\n",
" report_time_accuracy report_time_quality report_time_reference \\\n",
"0 3600.0 2 <NA> \n",
"1 3600.0 2 <NA> \n",
"2 3600.0 2 <NA> \n",
"3 3600.0 2 <NA> \n",
"4 3600.0 2 <NA> \n",
"\n",
" profile_id events_at_station report_quality duplicate_status duplicates \\\n",
"0 <NA> <NA> 0 4 <NA> \n",
"1 <NA> <NA> 0 4 <NA> \n",
"2 <NA> <NA> 0 4 <NA> \n",
"3 <NA> <NA> 0 4 <NA> \n",
"4 <NA> <NA> 0 4 <NA> \n",
"\n",
" record_timestamp \\\n",
"0 2026-05-07 06:31:39.602716+00:00 \n",
"1 2026-05-07 06:31:39.602716+00:00 \n",
"2 2026-05-07 06:31:39.602716+00:00 \n",
"3 2026-05-07 06:31:39.602716+00:00 \n",
"4 2026-05-07 06:31:39.602716+00:00 \n",
"\n",
" history processing_level \\\n",
"0 2026-05-07 06:31:39. Initial conversion from I... <NA> \n",
"1 2026-05-07 06:31:39. Initial conversion from I... <NA> \n",
"2 2026-05-07 06:31:39. Initial conversion from I... <NA> \n",
"3 2026-05-07 06:31:39. Initial conversion from I... <NA> \n",
"4 2026-05-07 06:31:39. Initial conversion from I... <NA> \n",
"\n",
" processing_codes source_id source_record_id \n",
"0 <NA> ICOADS-3-0-0T-125-704-1878-10 020N16 \n",
"1 <NA> ICOADS-3-0-0T-125-704-1878-10 020N1P \n",
"2 <NA> ICOADS-3-0-0T-125-704-1878-10 020N25 \n",
"3 <NA> ICOADS-3-0-0T-125-704-1878-10 020N2Q \n",
"4 <NA> ICOADS-3-0-0T-125-704-1878-10 020N3A "
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data = cdm_tables[\"header\"]\n",
"data.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an example, we look at the latitude and longitude columns of the header table:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(0 42.28\n",
" 1 42.31\n",
" 2 42.33\n",
" 3 42.35\n",
" 4 42.37\n",
" Name: latitude, dtype: Float64,\n",
" 0 -68.410004\n",
" 1 -68.029999\n",
" 2 -67.639999\n",
" 3 -67.290001\n",
" 4 -66.900002\n",
" Name: longitude, dtype: Float64)"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data.latitude.head(), data.longitude.head()"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" lat_deg_on lat_min_on lat_hemis_on lon_deg_of lon_min_of lon_hemis_of\n",
"0 42 20 N 66 30 W\n",
"1 42 20 N 66 30 W\n",
"2 42 20 N 66 30 W\n",
"3 42 20 N 66 30 W\n",
"4 42 20 N 66 30 W"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data_raw.c99_daily[\n",
" [\n",
" \"lat_deg_on\",\n",
" \"lat_min_on\",\n",
" \"lat_hemis_on\",\n",
" \"lon_deg_of\",\n",
" \"lon_min_of\",\n",
" \"lon_hemis_of\",\n",
" ]\n",
"].head()"
]
},
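{
"cell_type": "markdown",
"metadata": {},
"source": [
"The degree, minute and hemisphere fields shown above can be combined into signed decimal degrees. The following is a minimal sketch of that arithmetic only. The helper name `dms_to_decimal` is hypothetical and this is not the conversion code used by `cdm_reader_mapper`; the input row is copied from the `c99_daily` table above.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"\n",
"def dms_to_decimal(deg, minutes, hemis):\n",
"    # Hypothetical helper for illustration; not part of cdm_reader_mapper.\n",
"    # North and east are positive, south and west are negative.\n",
"    sign = hemis.map({'N': 1, 'E': 1, 'S': -1, 'W': -1})\n",
"    return sign * (deg.astype(float) + minutes.astype(float) / 60.0)\n",
"\n",
"\n",
"# One row with the values shown in the c99_daily table above.\n",
"raw = pd.DataFrame(\n",
"    {\n",
"        'lat_deg_on': [42],\n",
"        'lat_min_on': [20],\n",
"        'lat_hemis_on': ['N'],\n",
"        'lon_deg_of': [66],\n",
"        'lon_min_of': [30],\n",
"        'lon_hemis_of': ['W'],\n",
"    }\n",
")\n",
"\n",
"lat = dms_to_decimal(raw['lat_deg_on'], raw['lat_min_on'], raw['lat_hemis_on'])\n",
"lon = dms_to_decimal(raw['lon_deg_of'], raw['lon_min_of'], raw['lon_hemis_of'])\n",
"```\n",
"\n",
"Mapping the hemisphere letter to a sign first keeps the conversion vectorised, so the same helper works on a whole column at once."
]
},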
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The position has been converted to decimal degrees, with the correct sign for each hemisphere (negative for west and south).\n",
"\n",
"For the sea level pressure (SLP), the supplemental data hold additional information about the barometer:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" baro_type baro_height baro_units\n",
"0 2 14 1\n",
"1 2 14 1\n",
"2 2 14 1\n",
"3 2 14 1\n",
"4 2 14 1"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data_raw.c99_journal[[\"baro_type\", \"baro_height\", \"baro_units\"]].head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The original code table for the barometer type (`baro_type`) is:\n",
"\n",
"```\n",
"{\n",
"\t\"1\":\"aneroid\",\n",
"\t\"2\":\"mercurial\"\n",
"}\n",
"```\n",
"\n",
"The original code table for the barometer units (`baro_units`) has been kept as is:\n",
"\n",
"```\n",
"{\n",
"\t\"1\":\"inches\",\n",
"\t\"2\":\"millimeters\",\n",
"\t\"3\":\"millibars\",\n",
"\t\"4\":\"unable to determine\",\n",
"\t\"5\":\"Paris inches\"\n",
"}\n",
"```\n",
"\n",
"The CDM code table that these units are mapped to is:\n",
"\n",
"```\n",
"{\n",
"    \"1\":1001,\n",
"    \"2\":1002,\n",
"    \"3\":1003,\n",
"    \"4\":9999,\n",
"    \"5\":1005\n",
"}\n",
"```\n",
"\n",
"The code 9999 matches the `\"fill_value\": 9999` setting, which tells the CDM mapper to treat these entries as NaN.\n"
]
},
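{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough illustration of how such a code table and fill value could be applied, the sketch below maps original `baro_units` codes to the CDM codes listed above and masks the fill value as missing. The names `baro_units_to_cdm`, `FILL_VALUE` and `map_baro_units` are hypothetical, and the actual handling inside `cdm_reader_mapper` may differ.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Code table copied from the cell above (original code -> CDM code).\n",
"baro_units_to_cdm = {'1': 1001, '2': 1002, '3': 1003, '4': 9999, '5': 1005}\n",
"FILL_VALUE = 9999  # assumed to mark entries that should become missing\n",
"\n",
"\n",
"def map_baro_units(codes):\n",
"    # Hypothetical helper: look up each code, then mask the fill value.\n",
"    mapped = codes.astype(str).map(baro_units_to_cdm)\n",
"    # Codes absent from the table and fill values both end up as <NA>.\n",
"    return mapped.mask(mapped == FILL_VALUE).astype('Int64')\n",
"\n",
"\n",
"codes = pd.Series(['1', '4', '5'])\n",
"result = map_baro_units(codes)\n",
"```\n",
"\n",
"Using the nullable `Int64` dtype keeps the mapped codes as integers while still allowing missing entries."
]
},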
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" observation_id report_id data_policy_licence \\\n",
"0 ICOADS-300-020N16-SLP ICOADS-300-020N16 0 \n",
"1 ICOADS-300-020N1P-SLP ICOADS-300-020N1P 0 \n",
"2 ICOADS-300-020N25-SLP ICOADS-300-020N25 0 \n",
"3 ICOADS-300-020N2Q-SLP ICOADS-300-020N2Q 0 \n",
"4 ICOADS-300-020N3A-SLP ICOADS-300-020N3A 0 \n",
"\n",
" date_time date_time_meaning observation_duration longitude \\\n",
"0 1878-10-20 06:00:00 2 8 -68.410004 \n",
"1 1878-10-20 08:00:00 2 8 -68.029999 \n",
"2 1878-10-20 10:00:00 2 8 -67.639999 \n",
"3 1878-10-20 12:00:00 2 8 -67.290001 \n",
"4 1878-10-20 14:00:00 2 8 -66.900002 \n",
"\n",
" latitude crs z_coordinate z_coordinate_type \\\n",
"0 42.28 0 4.27 0 \n",
"1 42.31 0 4.27 0 \n",
"2 42.33 0 4.27 0 \n",
"3 42.35 0 4.27 0 \n",
"4 42.37 0 4.27 0 \n",
"\n",
" observation_height_above_station_surface observed_variable \\\n",
"0 4.27 58 \n",
"1 4.27 58 \n",
"2 4.27 58 \n",
"3 4.27 58 \n",
"4 4.27 58 \n",
"\n",
" secondary_variable observation_value value_significance secondary_value \\\n",
"0 99610.0 2 \n",
"1 99630.0 2 \n",
"2 99690.0 2 \n",
"3 99760.0 2 \n",
"4 99920.0 2 \n",
"\n",
" units code_table conversion_flag location_method location_precision \\\n",
"0 32 0 \n",
"1 32 0 \n",
"2 32 0 \n",
"3 32 0 \n",
"4 32 0 \n",
"\n",
" z_coordinate_method bbox_min_longitude bbox_max_longitude \\\n",
"0 \n",
"1 \n",
"2 \n",
"3 \n",
"4 \n",
"\n",
" bbox_min_latitude bbox_max_latitude spatial_representativeness \\\n",
"0 3 \n",
"1 3 \n",
"2 3 \n",
"3 3 \n",
"4 3 \n",
"\n",
" quality_flag numerical_precision sensor_id sensor_automation_status \\\n",
"0 2 5 \n",
"1 2 5 \n",
"2 2 5 \n",
"3 2 5 \n",
"4 2 5 \n",
"\n",
" exposure_of_sensor original_precision original_units \\\n",
"0 3 1001 \n",
"1 3 1001 \n",
"2 3 1001 \n",
"3 3 1001 \n",
"4 3 1001 \n",
"\n",
" original_code_table original_value conversion_method processing_code \\\n",
"0 996.1 7 \n",
"1 996.3 7 \n",
"2 996.9 7 \n",
"3 997.6 7 \n",
"4 999.2 7 \n",
"\n",
" processing_level adjustment_id traceability advanced_qc \\\n",
"0 3 2 0 \n",
"1 3 2 0 \n",
"2 3 2 0 \n",
"3 3 2 0 \n",
"4 3 2 0 \n",
"\n",
" advanced_uncertainty advanced_homogenisation \\\n",
"0 0 0 \n",
"1 0 0 \n",
"2 0 0 \n",
"3 0 0 \n",
"4 0 0 \n",
"\n",
" source_id \n",
"0 ICOADS-3-0-0T-125-704-1878-10 \n",
"1 ICOADS-3-0-0T-125-704-1878-10 \n",
"2 ICOADS-3-0-0T-125-704-1878-10 \n",
"3 ICOADS-3-0-0T-125-704-1878-10 \n",
"4 ICOADS-3-0-0T-125-704-1878-10 "
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data_obs = cdm_tables[\"observations-slp\"]\n",
"data_obs.head()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.2"
}
},
"nbformat": 4,
"nbformat_minor": 4
}