You can combine multiple FeatureGroup objects using the + operator. The result is a new FeatureGroup object.
Prerequisites for combining multiple Feature Groups:
- Entities must be identical.
- If the data source includes timestamp column details, both data sources must have the same timestamp column name.
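The two prerequisites can be sketched as a small standalone check. This is only an illustration and not the library's own validation: `GroupSpec`, `can_combine`, and the `Visits` group are hypothetical names. It assumes that "identical entities" means the same set of entity columns, and that the timestamp rule applies only when both sides declare a timestamp column.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class GroupSpec:
    """Hypothetical stand-in for the parts of a FeatureGroup that matter here."""
    name: str
    entity_columns: Tuple[str, ...]
    timestamp_col_name: Optional[str] = None


def can_combine(left: GroupSpec, right: GroupSpec) -> bool:
    # Assumption: "identical entities" means the same set of entity columns.
    if set(left.entity_columns) != set(right.entity_columns):
        return False
    # Assumption: timestamp names are compared only when both sides declare one.
    if left.timestamp_col_name and right.timestamp_col_name:
        return left.timestamp_col_name == right.timestamp_col_name
    return True


profile = GroupSpec("PatientProfile", ("patient_id",), "record_timestamp")
readings = GroupSpec("MedicalReadings", ("patient_id",), "record_timestamp")
visits = GroupSpec("Visits", ("visit_id",), "record_timestamp")  # hypothetical group

print(can_combine(profile, readings))  # same entity column and timestamp name
print(can_combine(profile, visits))    # entity columns differ
```

Running a check like this before using + makes a mismatch visible early, instead of at combination time.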
Let’s look at an example to see how this works.
Create two different FeatureGroups
Before creating two FeatureGroups, let’s look at two different data sets: patient_profile and medical_readings.
Patient Profile
>>> load_example_data('dataframe', 'patient_profile')
>>> patient_profile_df = DataFrame('patient_profile')
>>> patient_profile_df
                      record_timestamp  pregnancies  age   bmi  skin_thickness
patient_id
17          2024-04-10 11:10:59.000000            7   31  29.6             0.0
34          2024-04-10 11:10:59.000000           10   45  27.6            31.0
13          2024-04-10 11:10:59.000000            1   59  30.1            23.0
53          2024-04-10 11:10:59.000000            8   58  33.7            34.0
11          2024-04-10 11:10:59.000000           10   34  38.0             0.0
51          2024-04-10 11:10:59.000000            1   26  24.2            15.0
32          2024-04-10 11:10:59.000000            3   22  24.8            11.0
15          2024-04-10 11:10:59.000000            7   32  30.0             0.0
99          2024-04-10 11:10:59.000000            1   31  49.7            51.0
0           2024-04-10 11:10:59.000000            6   50  33.6            35.0
Medical Readings
>>> load_example_data('dataframe', 'medical_readings')
>>> medical_readings_df = DataFrame('medical_readings')
>>> medical_readings_df
                      record_timestamp  glucose  blood_pressure  insulin  diabetes_pedigree_function  outcome
patient_id
17          2024-04-10 11:10:59.000000      107              74        0                       0.254        1
34          2024-04-10 11:10:59.000000      122              78        0                       0.512        0
13          2024-04-10 11:10:59.000000      189              60      846                       0.398        1
53          2024-04-10 11:10:59.000000      176              90      300                       0.467        1
11          2024-04-10 11:10:59.000000      168              74        0                       0.537        1
51          2024-04-10 11:10:59.000000      101              50       36                       0.526        0
32          2024-04-10 11:10:59.000000       88              58       54                       0.267        0
15          2024-04-10 11:10:59.000000      100               0        0                       0.484        1
99          2024-04-10 11:10:59.000000      122              90      220                       0.325        1
0           2024-04-10 11:10:59.000000      148              72        0                       0.627        1
Create two FeatureGroups for the two datasets
Let's first create individual FeatureGroups.
>>> patient_profile_fg = FeatureGroup.from_DataFrame(
...     name='PatientProfile',
...     df=patient_profile_df,
...     entity_columns='patient_id',
...     timestamp_col_name='record_timestamp'
... )
>>> medical_readings_fg = FeatureGroup.from_DataFrame(
...     name='MedicalReadings',
...     df=medical_readings_df,
...     entity_columns='patient_id',
...     timestamp_col_name='record_timestamp'
... )
>>> print(patient_profile_fg.features)
[Feature(name=pregnancies), Feature(name=age), Feature(name=bmi), Feature(name=skin_thickness)]
>>> print(medical_readings_fg.features)
[Feature(name=glucose), Feature(name=blood_pressure), Feature(name=insulin), Feature(name=diabetes_pedigree_function), Feature(name=outcome)]
Combine the two FeatureGroups
>>> new_fg = patient_profile_fg + medical_readings_fg
Examine the combined FeatureGroup properties
>>> print(new_fg.name)
'PatientProfile_MedicalReadings'
>>> print(new_fg.features)
[Feature(name=pregnancies), Feature(name=age), Feature(name=bmi), Feature(name=skin_thickness), Feature(name=glucose), Feature(name=blood_pressure), Feature(name=insulin), Feature(name=diabetes_pedigree_function), Feature(name=outcome)]
>>> print(new_fg.entity)
Entity(name=PatientProfile_MedicalReadings)
>>> print(new_fg.data_source)
DataSource(name=PatientProfile_MedicalReadings)
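The output above suggests that the combined name joins the two source names with an underscore and that the feature list is the left group's features followed by the right group's. The toy model below mirrors only that observed behavior. `ToyFeatureGroup` is a hypothetical class, not the library's FeatureGroup, and the actual implementation may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyFeatureGroup:
    """Hypothetical model of the observed + behavior; not the real FeatureGroup."""
    name: str
    entity_columns: List[str]
    timestamp_col_name: str
    features: List[str]

    def __add__(self, other: "ToyFeatureGroup") -> "ToyFeatureGroup":
        # Enforce the two documented prerequisites before combining.
        if self.entity_columns != other.entity_columns:
            raise ValueError("entity columns differ")
        if self.timestamp_col_name != other.timestamp_col_name:
            raise ValueError("timestamp column names differ")
        # Build a new object; neither operand is modified.
        return ToyFeatureGroup(
            name=f"{self.name}_{other.name}",
            entity_columns=list(self.entity_columns),
            timestamp_col_name=self.timestamp_col_name,
            features=self.features + other.features,
        )


toy_profile = ToyFeatureGroup(
    "PatientProfile", ["patient_id"], "record_timestamp",
    ["pregnancies", "age", "bmi", "skin_thickness"],
)
toy_readings = ToyFeatureGroup(
    "MedicalReadings", ["patient_id"], "record_timestamp",
    ["glucose", "blood_pressure", "insulin", "diabetes_pedigree_function", "outcome"],
)

combined = toy_profile + toy_readings
print(combined.name)
print(combined.features)
```

In this sketch the order of the operands determines both the combined name and the order of the features.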