FeatureGroup contains the details of a Feature Group. You can create multiple Feature Groups, and you can combine Feature Groups to create a new Feature Group. See Combining one or more Feature Groups into a single FeatureGroup.
The following example builds a Feature Group from sales data.
>>> df = DataFrame("sales")
>>> df
              Feb    Jan    Mar    Apr    datetime
accounts
Orange Inc  210.0    NaN    NaN  250.0  04/01/2017
Jones LLC   200.0  150.0  140.0  180.0  04/01/2017
Blue Inc     90.0   50.0   95.0  101.0  04/01/2017
Alpha Co    210.0  200.0  215.0  250.0  04/01/2017
Yellow Inc   90.0    NaN    NaN    NaN  04/01/2017
Create a FeatureGroup for the 'sales' DataFrame.
>>> from teradataml import DataSource, Entity, Feature, FeatureGroup
>>> f1 = Feature('sales_jan', df.Jan, description='January sales')
>>> f2 = Feature('sales_feb', df.Feb, description='February sales')
>>> f3 = Feature('sales_mar', df.Mar, description='March sales')
>>> f4 = Feature('sales_apr', df.Apr, description='April sales')
>>> entity = Entity('sales', df.accounts)
>>> ds = DataSource('sales', df, timestamp_col_name='datetime')
>>> fg = FeatureGroup('sales', features=[f1, f2, f3, f4], entity=entity, data_source=ds, description='sales group')
>>> fg
FeatureGroup(sales, features=[Feature(name=sales_jan), Feature(name=sales_feb), Feature(name=sales_mar), Feature(name=sales_apr)], entity=Entity(name=sales), data_source=DataSource(name=sales))
>>>
Properties
- features
- Returns a list of features from FeatureGroup.
Example getting the associated features from FeatureGroup 'fg':
>>> fg.features
[Feature(name=sales_jan), Feature(name=sales_feb), Feature(name=sales_mar), Feature(name=sales_apr)]
>>>
- labels
- Returns the list of features marked as labels in the FeatureGroup. Use the set_labels() and reset_labels() methods to set and reset Features as labels for the current session.
Example:
>>> fg.labels
[]
- entity
- Returns the Entity from FeatureGroup.
Example:
>>> fg.entity
Entity(name=sales)
>>>
- data_source
- Returns the DataSource from FeatureGroup.
Example:
>>> fg.data_source
DataSource(name=sales)
>>>
- description
- Returns the description for FeatureGroup.
Example:
>>> fg.description
'sales group'
>>>
Methods
- from_DataFrame
- Creates FeatureGroup from teradataml DataFrame.
Example creating a FeatureGroup from the 'sales' DataFrame, specifying the 'accounts' column as the entity and the 'datetime' column as the timestamp column:
>>> df = DataFrame("sales")
>>> df
              Feb    Jan    Mar    Apr    datetime
accounts
Orange Inc  210.0    NaN    NaN  250.0  04/01/2017
Jones LLC   200.0  150.0  140.0  180.0  04/01/2017
Blue Inc     90.0   50.0   95.0  101.0  04/01/2017
Alpha Co    210.0  200.0  215.0  250.0  04/01/2017
Yellow Inc   90.0    NaN    NaN    NaN  04/01/2017
>>> fg = FeatureGroup.from_DataFrame(
...     name='sales',
...     entity_columns='accounts',
...     df=df,
...     timestamp_col_name='datetime'
... )
>>> fg
FeatureGroup(sales, features=[Feature(name=Feb), Feature(name=Jan), Feature(name=Mar), Feature(name=Apr)], entity=Entity(name=sales), data_source=DataSource(name=sales))
>>>
- from_query
- Creates FeatureGroup from SQL query.
Example creating a FeatureGroup from query 'SELECT * FROM SALES', and specifying 'accounts' column as entity and 'datetime' column as timestamp column:
>>> query = 'SELECT * FROM SALES'
>>> fg = FeatureGroup.from_query(
...     name='sales',
...     entity_columns='accounts',
...     query=query,
...     timestamp_col_name='datetime'
... )
>>> fg
FeatureGroup(sales, features=[Feature(name=Feb), Feature(name=Jan), Feature(name=Mar), Feature(name=Apr)], entity=Entity(name=sales), data_source=DataSource(name=sales))
>>>
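In the from_DataFrame and from_query examples above, the resulting features are the columns that are neither the entity column nor the timestamp column. The following pure-Python sketch illustrates that partition as inferred from the example output only. The helper split_columns is hypothetical and is not part of teradataml, and the library's actual selection rules may differ.

```python
def split_columns(columns, entity_columns, timestamp_col_name=None):
    """Partition column names into (entity columns, timestamp column, feature columns).

    Hypothetical helper for illustration; not a teradataml function.
    """
    # Assumption for this sketch: accept a single column name or a list of names.
    if isinstance(entity_columns, str):
        entity_columns = [entity_columns]

    reserved = set(entity_columns)
    if timestamp_col_name is not None:
        reserved.add(timestamp_col_name)

    # Every remaining column is treated as a feature, in its original order.
    features = [c for c in columns if c not in reserved]
    return list(entity_columns), timestamp_col_name, features


# Column names taken from the 'sales' table shown in the examples.
columns = ['accounts', 'Feb', 'Jan', 'Mar', 'Apr', 'datetime']
entity, timestamp, features = split_columns(columns, 'accounts', 'datetime')
print(features)  # ['Feb', 'Jan', 'Mar', 'Apr']
```

The printed list matches the feature names shown in the FeatureGroup output of both examples.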
- apply
- Updates FeatureGroup with Feature, Entity, or DataSource. You can also use this method to add new Features to an existing FeatureGroup.
Example that creates a FeatureGroup for January sales, creates a new Feature, and adds it to the FeatureGroup:
>>> df = DataFrame("sales")
>>> from teradataml import DataSource, Entity, Feature, FeatureGroup
>>> f1 = Feature('sales_jan', df.Jan, description='January sales')
>>> entity = Entity('sales', df.accounts)
>>> ds = DataSource('sales', df, timestamp_col_name='datetime')
>>> fg = FeatureGroup('sales', features=f1, entity=entity, data_source=ds)
>>> fg
FeatureGroup(sales, features=[Feature(name=sales_jan)], entity=Entity(name=sales), data_source=DataSource(name=sales))
>>> f2 = Feature('sales_feb', df.Feb, description='February sales')
>>> fg.apply(f2)
True
>>> fg
FeatureGroup(sales, features=[Feature(name=sales_jan), Feature(name=sales_feb)], entity=Entity(name=sales), data_source=DataSource(name=sales))
- remove
- Removes a Feature, Entity, or DataSource from a FeatureGroup.
Example:
>>> fg.remove(f2)
True
>>> fg
FeatureGroup(sales, features=[Feature(name=sales_jan)], entity=Entity(name=sales), data_source=DataSource(name=sales))
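Because apply and remove each accept a Feature, an Entity, or a DataSource, it can help to picture them as dispatching on the type of the object passed in. The toy model below is a sketch of that idea only. The Toy* classes are hypothetical stand-ins and not teradataml classes. Replacing a same-named feature and clearing the entity or data source on removal are assumptions made for this sketch, not documented behavior.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class ToyFeature:
    name: str


@dataclass(frozen=True)
class ToyEntity:
    name: str


@dataclass(frozen=True)
class ToyDataSource:
    name: str


@dataclass
class ToyFeatureGroup:
    """Hypothetical in-memory stand-in for a feature group."""
    name: str
    features: List[ToyFeature] = field(default_factory=list)
    entity: Optional[ToyEntity] = None
    data_source: Optional[ToyDataSource] = None

    def apply(self, obj):
        """Add a feature, or set the entity or data source."""
        if isinstance(obj, ToyFeature):
            # Assumption: a feature with the same name is replaced, not duplicated.
            self.features = [f for f in self.features if f.name != obj.name]
            self.features.append(obj)
        elif isinstance(obj, ToyEntity):
            self.entity = obj
        elif isinstance(obj, ToyDataSource):
            self.data_source = obj
        else:
            raise TypeError(f"unsupported component: {type(obj).__name__}")
        return True

    def remove(self, obj):
        """Drop a feature by name, or clear the entity or data source."""
        if isinstance(obj, ToyFeature):
            self.features = [f for f in self.features if f.name != obj.name]
        elif isinstance(obj, ToyEntity):
            # Assumption: the slot is cleared whichever entity is passed.
            self.entity = None
        elif isinstance(obj, ToyDataSource):
            self.data_source = None
        else:
            raise TypeError(f"unsupported component: {type(obj).__name__}")
        return True


fg = ToyFeatureGroup('sales', [ToyFeature('sales_jan')],
                     ToyEntity('sales'), ToyDataSource('sales'))
f2 = ToyFeature('sales_feb')

fg.apply(f2)
names_after_apply = [f.name for f in fg.features]
print(names_after_apply)   # ['sales_jan', 'sales_feb']

fg.remove(f2)
names_after_remove = [f.name for f in fg.features]
print(names_after_remove)  # ['sales_jan']
```

The sequence mirrors the apply and remove examples above: the February feature is added, then removed, leaving only the January feature.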
- set_labels
- Sets one or more Features as labels for the FeatureGroup in the current session.
If you lose the session, you need to set the labels again in a new session.
Use this method to refer to FeatureGroup in your ML models.
Example creating a FeatureGroup from query 'SELECT * FROM SALES', and specifying 'accounts' column as entity column and 'datetime' column as timestamp column:
>>> query = 'SELECT * FROM SALES'
>>> fg = FeatureGroup.from_query(
...     name='sales',
...     entity_columns='accounts',
...     query=query,
...     timestamp_col_name='datetime'
... )
>>> fg.features
[Feature(name=Feb), Feature(name=Jan), Feature(name=Mar), Feature(name=Apr)]
>>> fg.labels
[]
Set 'Apr' as the label to build an ML model that predicts April sales from Jan, Feb, and Mar sales.
>>> fg.set_labels('Apr')
True
>>> fg.features
[Feature(name=Feb), Feature(name=Jan), Feature(name=Mar)]
>>> fg.labels
Feature(name=Apr)
- reset_labels
- Removes all labels set for FeatureGroup in the current session.
Example that resets the labels, then checks the features and labels:
>>> fg.reset_labels()
True
>>> fg.features
[Feature(name=Feb), Feature(name=Jan), Feature(name=Mar), Feature(name=Apr)]
>>> fg.labels
[]
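The set_labels and reset_labels examples show that marking a feature as a label moves it out of features and into labels, and that resetting restores the full feature list. The pure-Python sketch below models that bookkeeping with plain column names. ToyLabelledGroup is a hypothetical class, not part of teradataml. Rejecting unknown names and always returning labels as a list are assumptions made for this sketch.

```python
class ToyLabelledGroup:
    """Hypothetical model of session-level label bookkeeping."""

    def __init__(self, feature_names):
        self._all = list(feature_names)  # every feature, in original order
        self._labels = []                # names currently marked as labels

    def set_labels(self, labels):
        # Assumption: accept one name or a list of names.
        if isinstance(labels, str):
            labels = [labels]
        unknown = [n for n in labels if n not in self._all]
        if unknown:
            raise ValueError(f"not features of this group: {unknown}")
        self._labels = list(labels)
        return True

    def reset_labels(self):
        self._labels = []
        return True

    @property
    def features(self):
        # Features are everything not currently marked as a label.
        return [n for n in self._all if n not in self._labels]

    @property
    def labels(self):
        return [n for n in self._all if n in self._labels]


group = ToyLabelledGroup(['Feb', 'Jan', 'Mar', 'Apr'])

group.set_labels('Apr')
predictors = group.features   # columns a model would train on
targets = group.labels        # column(s) a model would predict
print(predictors, targets)    # ['Feb', 'Jan', 'Mar'] ['Apr']

group.reset_labels()
print(group.features, group.labels)  # ['Feb', 'Jan', 'Mar', 'Apr'] []
```

Keeping labels as a view over one underlying list means a reset needs no copying: clearing the label set is enough to bring every feature back.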