midrc_react.plugins package

Submodules

midrc_react.plugins.midrc_tsv_loader module

This module contains functions for processing TSV files downloaded from the data.MIDRC.org website.

midrc_react.plugins.midrc_tsv_loader.adjust_age(df)

Modifies the age_at_index column based on the age_at_index_gt89 column.

midrc_react.plugins.midrc_tsv_loader.adjust_column_names(df)

Adjusts column names to be more readable.

midrc_react.plugins.midrc_tsv_loader.combine_race_ethnicity(df)

Combines ‘race’ and ‘ethnicity’ columns into a new ‘Race and Ethnicity’ column.

Criteria:

- If either ‘race’ or ‘ethnicity’ is ‘Not Reported’, the new column contains ‘Not Reported’.
- If ‘ethnicity’ is ‘Hispanic or Latino’, the new column contains ‘Hispanic or Latino’.
- If ‘ethnicity’ is ‘Not Hispanic or Latino’, the new column contains ‘{race}, {ethnicity}’.

Parameters:

df (pd.DataFrame) – Input DataFrame with ‘race’ and ‘ethnicity’ columns.

Returns:

Modified DataFrame with the new ‘Race and Ethnicity’ column.

Return type:

pd.DataFrame
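The criteria listed for combine_race_ethnicity can be illustrated with a minimal sketch. This is not the module's actual implementation: the function name combine_race_ethnicity_sketch is hypothetical, the criteria are assumed to apply in the order listed, the fallback for values the criteria do not cover is an assumption, and the sketch returns a copy rather than modifying the input in place.

```python
import pandas as pd


def combine_race_ethnicity_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical re-creation of the documented criteria (not the library code)."""

    def combine(row: pd.Series) -> str:
        race, ethnicity = row["race"], row["ethnicity"]
        # Criterion 1: 'Not Reported' in either column wins.
        if race == "Not Reported" or ethnicity == "Not Reported":
            return "Not Reported"
        # Criterion 2: Hispanic or Latino ethnicity is reported on its own.
        if ethnicity == "Hispanic or Latino":
            return "Hispanic or Latino"
        # Criterion 3: otherwise join race and ethnicity.
        if ethnicity == "Not Hispanic or Latino":
            return f"{race}, {ethnicity}"
        # Assumption: the documentation does not cover other values,
        # so this sketch treats them as 'Not Reported'.
        return "Not Reported"

    out = df.copy()  # assumption: leave the caller's DataFrame untouched
    out["Race and Ethnicity"] = out.apply(combine, axis=1)
    return out


sample = pd.DataFrame(
    {
        "race": ["White", "Asian", "Not Reported", "Black or African American"],
        "ethnicity": [
            "Not Hispanic or Latino",
            "Hispanic or Latino",
            "Not Hispanic or Latino",
            "Not Reported",
        ],
    }
)
result = combine_race_ethnicity_sketch(sample)
```

A row-wise apply keeps the sketch close to the wording of the criteria; a vectorized version would be preferable for large tables.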

midrc_react.plugins.midrc_tsv_loader.extract_earliest_date(submitter_id_series)

Extracts the earliest date from the datasets.submitter_id column.

midrc_react.plugins.midrc_tsv_loader.preprocess_data(df)

Preprocesses a pandas DataFrame.

midrc_react.plugins.midrc_tsv_loader.process_dataframe(df)

Applies both transformations to a pandas DataFrame.

midrc_react.plugins.midrc_tsv_loader.process_dataframe_to_dataframe(df)

Processes a pandas DataFrame and returns a modified DataFrame.

midrc_react.plugins.midrc_tsv_loader.process_tsv_to_dataframe(input_file)

Reads a TSV file, processes it, and returns a pandas DataFrame.

midrc_react.plugins.midrc_tsv_loader.process_tsv_to_tsv(input_file, output_file)

Reads a TSV file, processes it, and writes back to a new TSV file.
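The read, process, write pattern described for the TSV functions can be sketched with plain pandas. This is an illustration under stated assumptions, not the module's code: tsv_round_trip_sketch and its transform parameter are hypothetical names, the processing step is a placeholder, and in-memory buffers stand in for real files.

```python
import io

import pandas as pd


def tsv_round_trip_sketch(input_file, output_file, transform=lambda df: df):
    """Hypothetical TSV-in, TSV-out pipeline (not the library code)."""
    # Read tab-separated input into a DataFrame.
    df = pd.read_csv(input_file, sep="\t")
    # Placeholder for the processing step; the real module applies its own.
    df = transform(df)
    # Write tab-separated output without the pandas index column.
    df.to_csv(output_file, sep="\t", index=False)
    return df


# In-memory buffers stand in for paths on disk in this sketch.
source = io.StringIO("race\tethnicity\nWhite\tNot Hispanic or Latino\n")
target = io.StringIO()
tsv_round_trip_sketch(
    source,
    target,
    transform=lambda df: df.rename(columns=str.title),  # example step only
)
written = target.getvalue()
```

Passing index=False avoids writing an extra unnamed column that would otherwise appear when the output file is read back.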

Module contents