midrc_react.plugins package
Submodules
midrc_react.plugins.midrc_tsv_loader module
This module contains functions for processing TSV files downloaded from the data.MIDRC.org website.
- midrc_react.plugins.midrc_tsv_loader.adjust_age(df)
Modifies the age_at_index column based on age_at_index_gt89.
- midrc_react.plugins.midrc_tsv_loader.adjust_column_names(df)
Adjusts column names to be more readable.
- midrc_react.plugins.midrc_tsv_loader.combine_race_ethnicity(df)
Combines ‘race’ and ‘ethnicity’ columns into a new ‘Race and Ethnicity’ column.
Criteria:
  - If either ‘race’ or ‘ethnicity’ is ‘Not Reported’, the new column contains ‘Not Reported’.
  - If ‘ethnicity’ is ‘Hispanic or Latino’, the new column contains ‘Hispanic or Latino’.
  - If ‘ethnicity’ is ‘Not Hispanic or Latino’, the new column contains ‘{race}, {ethnicity}’.
- Parameters:
df (pd.DataFrame) – Input DataFrame with ‘race’ and ‘ethnicity’ columns.
- Returns:
Modified DataFrame with the new ‘Race and Ethnicity’ column.
- Return type:
pd.DataFrame
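The criteria listed for combine_race_ethnicity can be illustrated with a minimal sketch. This is not the package's implementation: the function name `combine_race_ethnicity_sketch` is hypothetical, and the fallback for ethnicity values not covered by the documented criteria is an assumption made only so the example is total.

```python
import pandas as pd


def combine_race_ethnicity_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical re-statement of the documented criteria (not the library code)."""

    def combine(race, ethnicity):
        # Rule 1: 'Not Reported' in either column wins.
        if race == "Not Reported" or ethnicity == "Not Reported":
            return "Not Reported"
        # Rule 2: Hispanic or Latino ethnicity replaces the race label.
        if ethnicity == "Hispanic or Latino":
            return "Hispanic or Latino"
        # Rule 3: otherwise join race and ethnicity.
        if ethnicity == "Not Hispanic or Latino":
            return f"{race}, {ethnicity}"
        # Assumption: the documentation does not cover other ethnicity values;
        # this sketch treats them as 'Not Reported'.
        return "Not Reported"

    out = df.copy()  # leave the caller's DataFrame untouched
    out["Race and Ethnicity"] = [
        combine(race, ethnicity)
        for race, ethnicity in zip(out["race"], out["ethnicity"])
    ]
    return out
```

Checking ‘Not Reported’ first mirrors the order in which the criteria are listed, so a row with an unreported race and a ‘Hispanic or Latino’ ethnicity resolves to ‘Not Reported’ in this sketch.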
- midrc_react.plugins.midrc_tsv_loader.extract_earliest_date(submitter_id_series)
Extracts the earliest date from the datasets.submitter_id column.
- midrc_react.plugins.midrc_tsv_loader.preprocess_data(df)
Preprocesses a pandas DataFrame.
- midrc_react.plugins.midrc_tsv_loader.process_dataframe(df)
Applies both transformations to a pandas DataFrame.
- midrc_react.plugins.midrc_tsv_loader.process_dataframe_to_dataframe(df)
Processes a pandas DataFrame and returns a modified DataFrame.
- midrc_react.plugins.midrc_tsv_loader.process_tsv_to_dataframe(input_file)
Reads a TSV file, processes it, and returns a pandas DataFrame.
- midrc_react.plugins.midrc_tsv_loader.process_tsv_to_tsv(input_file, output_file)
Reads a TSV file, processes it, and writes the result to a new TSV file.
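The read, process, and write flow described for the TSV helpers can be sketched with plain pandas calls. The names `tsv_round_trip` and `transform` are hypothetical stand-ins, and the module's actual processing steps are not reproduced here; the sketch only shows the tab-separated I/O pattern under that assumption.

```python
import pandas as pd


def tsv_round_trip(input_file, output_file, transform):
    """Hypothetical helper: read a TSV, apply `transform`, write a TSV.

    `input_file` and `output_file` may be paths or file-like objects.
    `transform` is any callable that takes and returns a DataFrame; it stands
    in for the module's processing step, which is not shown here.
    """
    df = pd.read_csv(input_file, sep="\t")  # TSV is CSV with a tab separator
    result = transform(df)
    result.to_csv(output_file, sep="\t", index=False)  # omit the index column
    return result
```

Passing `index=False` keeps the output columns identical to the DataFrame's columns, so the written file can be read back with the same `read_csv` call.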