cdm_reader_mapper.split_by_column_entries
- cdm_reader_mapper.split_by_column_entries(data, selection, reset_index=False, inverse=False, return_rejected=False)
Split a DataFrame based on matching values in a given column.
- Parameters:
data (pandas.DataFrame) – DataFrame to be split.
selection (dict) – Mapping of a column name to an iterable of allowed values. Example: {"city": ["London", "Berlin"]}.
reset_index (bool, optional) – Whether to reset the index in the returned DataFrames.
inverse (bool, optional) – If True, invert the selection.
return_rejected (bool, optional) – If True, return rejected rows as the second output. If False, the rejected output is empty but dtype-preserving.
- Return type:
tuple[DataFrame | ParquetStreamReader, DataFrame | ParquetStreamReader, Index | MultiIndex, Index | MultiIndex]
- Returns:
(pandas.DataFrame or ParquetStreamReader, pandas.DataFrame or ParquetStreamReader, pd.Index or pd.MultiIndex, pd.Index or pd.MultiIndex) – Selected rows (all mask columns True), rejected rows, original indexes of the selection, and original indexes of the rejection.
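The behaviour described above can be illustrated with a minimal pandas-only sketch. The helper `split_sketch` is hypothetical and is not the library's implementation: it only mirrors the documented parameters for an in-memory DataFrame, ignores the ParquetStreamReader case, and assumes that the rejected index is reported even when `return_rejected` is False.

```python
import pandas as pd


def split_sketch(data, selection, reset_index=False, inverse=False,
                 return_rejected=False):
    """Hypothetical pandas-only illustration, not the cdm_reader_mapper code."""
    # Start with every row selected, then require a match in each listed column.
    mask = pd.Series(True, index=data.index)
    for column, allowed in selection.items():
        mask &= data[column].isin(list(allowed))
    if inverse:
        mask = ~mask

    selected = data.loc[mask]
    # Without return_rejected, keep an empty frame with the same columns and dtypes.
    rejected = data.loc[~mask] if return_rejected else data.iloc[0:0]

    # Capture the original index labels before any reset.
    # Assumption: the rejected index is reported regardless of return_rejected.
    selected_idx = selected.index
    rejected_idx = data.index[(~mask).to_numpy()]

    if reset_index:
        selected = selected.reset_index(drop=True)
        rejected = rejected.reset_index(drop=True)
    return selected, rejected, selected_idx, rejected_idx


df = pd.DataFrame(
    {"city": ["London", "Paris", "Berlin", "Rome"],
     "temp": [11.5, 14.0, 9.0, 18.5]},
    index=[10, 11, 12, 13],
)

selected, rejected, selected_idx, rejected_idx = split_sketch(
    df, {"city": ["London", "Berlin"]}, return_rejected=True
)
print(selected_idx.tolist(), rejected_idx.tolist())  # [10, 12] [11, 13]
```

With `inverse=True` the same selection would return the Paris and Rome rows as the selected output instead. The call to the real function would take the same arguments, per the signature documented above.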