cdm_reader_mapper.DataBundle.remove_duplicates

DataBundle.remove_duplicates(inplace=False, **kwargs)

Remove detected duplicates in data.

Parameters:

inplace (bool) – If True, overwrite the data in the DataBundle in place; otherwise return a copy of the DataBundle whose data contains no duplicates. Default: False

Return type:

DataBundle | None

Returns:

DataBundle or None – DataBundle without duplicated rows, or None if inplace=True.

Note

Before removing duplicates, a duplicate check must be run with DataBundle.duplicate_check().

Examples

Remove duplicates without overwriting data.

>>> removed_tables = db.remove_duplicates()

Remove duplicates, overwriting the data in place.

>>> db.remove_duplicates(inplace=True)
>>> removed_tables = db.data
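
The check-then-remove order and the inplace contract described above can be sketched with a toy stand-in. This is not the cdm_reader_mapper implementation: ToyBundle, its row format, and its exact-match duplicate rule are hypothetical. Raising an error when no check has been run is also an assumption of the sketch, not documented library behaviour.

```python
from copy import deepcopy


class ToyBundle:
    """Hypothetical stand-in for a DataBundle; NOT the cdm_reader_mapper class."""

    def __init__(self, rows):
        self.data = list(rows)
        self._duplicate_indices = None  # filled in by duplicate_check()

    def duplicate_check(self):
        """Record the positions of rows that exactly repeat an earlier row."""
        seen = set()
        self._duplicate_indices = set()
        for i, row in enumerate(self.data):
            if row in seen:
                self._duplicate_indices.add(i)
            seen.add(row)

    def remove_duplicates(self, inplace=False):
        """Drop flagged rows, following the documented inplace contract."""
        if self._duplicate_indices is None:
            # Assumption of this toy: fail loudly if no check was run first.
            raise RuntimeError("run duplicate_check() first")
        target = self if inplace else deepcopy(self)
        target.data = [
            row
            for i, row in enumerate(target.data)
            if i not in target._duplicate_indices
        ]
        target._duplicate_indices = set()  # nothing left to remove
        return None if inplace else target


rows = [("A", 1), ("B", 2), ("A", 1)]
db = ToyBundle(rows)
db.duplicate_check()

cleaned = db.remove_duplicates()             # returns a copy
rows_after_copy = len(db.data)               # original still holds every row
result = db.remove_duplicates(inplace=True)  # modifies db, returns None
```

With inplace=False the original object keeps all of its rows and the cleaned copy is returned; with inplace=True the object itself is modified and the call returns None, so the cleaned rows are then read from db.data.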

See also

DataBundle.flag_duplicates

Flag detected duplicates in data.

DataBundle.get_duplicates

Get duplicate matches in data.

DataBundle.duplicate_check

Duplicate check in data.

Note

For more information, see DupDetect.remove_duplicates().