FeatureProcess Class | Teradata Package for Python

Teradata® Package for Python User Guide

Deployment: VantageCloud, VantageCore
Edition: VMware, Enterprise, IntelliFlex
Product: Teradata Package for Python
Release Number: 20.00
Published: March 2025
Last Edition: 2025-12-05
Product Category: Teradata Vantage

Use the FeatureProcess class to run the feature processing workflow, which ingests feature values into the feature catalog.

Syntax

FeatureProcess(repo, object, entity=None, features=None, data_domain=None, description=None)

Required Parameters

repo
Specifies the name of the database where the ingested feature values are stored.

The feature store must be set up on this database before you run a feature process. Use FeatureStore.setup() to set up the feature store on the repo database.

object
Specifies the source from which feature values are ingested. It can be one of the following:
  • teradataml DataFrame
  • Feature group
  • Process ID

How entity and features are handled depends on the type of object:
  • If object is a teradataml DataFrame, you must provide entity and features.
  • If object is a str, it is treated as the process ID of an existing FeatureProcess, and that process is rerun. Entity and features are taken from the existing feature process, so the entity and features arguments are ignored.
  • If object is a FeatureGroup, entity and features are taken from the FeatureGroup, so the entity and features arguments are ignored.
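
The rules above for the three kinds of object can be summarized in a small stand-alone sketch. It does not use teradataml: FakeDataFrame, FakeFeatureGroup, and resolve_inputs are hypothetical names invented for illustration, and the real class may apply these rules differently.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDataFrame:
    """Hypothetical stand-in for a teradataml DataFrame."""
    columns: list


@dataclass
class FakeFeatureGroup:
    """Hypothetical stand-in for a FeatureGroup that carries its own entity and features."""
    entity: str
    features: list = field(default_factory=list)


def resolve_inputs(obj, entity=None, features=None):
    """Return (source_kind, entity, features) following the documented rules."""
    if isinstance(obj, str):
        # A string is treated as the process ID of an existing feature process.
        # Entity and features come from that process, so the arguments are ignored.
        return ("process_id", None, None)
    if isinstance(obj, FakeFeatureGroup):
        # A feature group supplies its own entity and features.
        return ("feature_group", obj.entity, list(obj.features))
    if isinstance(obj, FakeDataFrame):
        # A DataFrame carries no entity or feature metadata, so both are required.
        if entity is None or features is None:
            raise ValueError("entity and features are required for a DataFrame")
        return ("dataframe", entity, list(features))
    raise TypeError(f"unsupported object type: {type(obj).__name__}")


df = FakeDataFrame(columns=["accounts", "Jan", "Feb"])
fg = FakeFeatureGroup(entity="accounts", features=["Jan", "Feb"])

print(resolve_inputs(df, entity="accounts", features=["Jan"]))
print(resolve_inputs(fg, entity="ignored", features=["ignored"]))
print(resolve_inputs("e0cdbca3-5c80-11f0-8b86-f020ffe7fe09"))
```

In this sketch, passing a DataFrame without entity or features raises an error, while the same arguments are silently dropped for a feature group or a process ID.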

Optional Parameters

entity
Specifies the entity for the DataFrame passed as object.
  • Ignored when object is of type FeatureGroup or str.
  • If a string or list of strings is provided, the DataFrame must contain columns with these names.
  • If an Entity object is provided, the columns associated with it must be present in the DataFrame.
features
Specifies the list of features to include in the feature process. Only these features are ingested.

Ignored when object is of type FeatureGroup or str.

data_domain
Specifies the data domain for the feature process. If data_domain is not specified, the default database is used as the data domain.
description
Specifies a description for the FeatureProcess.
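
The column requirement on the entity parameter can be checked before a process is created. The following is a minimal sketch in plain Python, assuming the DataFrame's column names are available as a list of strings; missing_entity_columns is a hypothetical helper, not part of teradataml, and it compares names case-sensitively, which may be stricter than the database itself.

```python
def missing_entity_columns(entity_columns, dataframe_columns):
    """Return the entity columns that are absent from the DataFrame's columns."""
    # Accept a single column name as well as a list of names.
    if isinstance(entity_columns, str):
        entity_columns = [entity_columns]
    available = set(dataframe_columns)
    # Preserve the caller's order so the report is easy to read.
    return [name for name in entity_columns if name not in available]


# Column names taken from the 'sales' examples in this topic.
sales_columns = ["accounts", "Jan", "Feb", "Mar", "Apr"]

print(missing_entity_columns("accounts", sales_columns))
print(missing_entity_columns(["accounts", "region"], sales_columns))
```

An empty result means every entity column is present; any names returned would need to be added to the DataFrame or removed from the entity.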

Example setup

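>>> # Assumes an active Vantage connection and that these names are exported by teradataml.
>>> from teradataml import DataFrame, Entity, Feature, FeatureGroup, FeatureProcess, FeatureStore, FeatureType, in_schema, load_example_data
>>> load_example_data("dataframe", "sales")
>>> df = DataFrame("sales")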

Create a feature store.

>>> fs = FeatureStore(repo='vfs_v1', data_domain='sales')
Repo vfs_v1 does not exist. Run FeatureStore.setup() to create the repo and setup FeatureStore.
>>> fs.setup()
True

Example 1: Create a FeatureProcess to ingest the features 'Jan', 'Feb', 'Mar', and 'Apr' using DataFrame 'df'

Use the 'accounts' column as the entity. Ingest the features to the 'sales' data domain.

>>> fp = FeatureProcess(repo="vfs_v1",
...                     data_domain='sales',
...                     object=df,
...                     entity="accounts",
...                     features=["Jan", "Feb", "Mar", "Apr"])
>>> fp.run()
Process 'e0cdbca3-5c80-11f0-8b86-f020ffe7fe09' started.
Process 'e0cdbca3-5c80-11f0-8b86-f020ffe7fe09' completed.
True

Example 2: Create a FeatureProcess to ingest features using a feature group

No data_domain is specified, so the features are ingested to the default data domain.

>>> fg = FeatureGroup.from_DataFrame(name="sales", entity_columns="accounts", df=df)
>>> fp = FeatureProcess(repo="vfs_v1", object=fg)

Example 3: Create a FeatureProcess to ingest features using a process ID

Run Example 1 first to create a process ID, then use that process ID to run the process again. Alternatively, use FeatureStore.list_feature_process() to list the existing process IDs.

>>> fp1 = FeatureProcess(repo="vfs_v1", object=df, entity="accounts", features=["Jan", "Feb", "Mar", "Apr"])
>>> fp1.run()
Process '593c3326-33cb-11f0-8459-f020ff57c62c' started.
Process '593c3326-33cb-11f0-8459-f020ff57c62c' completed.

Create a FeatureProcess from the process ID. Calling run() on it runs the process again.

>>> fp = FeatureProcess(repo="vfs_v1", object=fp1.process_id)
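
The rerun flow can be wrapped in a small helper. This sketch takes the process class as an argument so that it can be exercised without a Vantage connection; rerun_process and RecordingProcess are hypothetical names, and the only calls assumed on the real class are the constructor keywords repo and object and the run() method shown in the examples in this topic.

```python
def rerun_process(process_cls, repo, process_id):
    """Build a feature process from an existing process ID and run it again."""
    process = process_cls(repo=repo, object=process_id)
    return process.run()


class RecordingProcess:
    """Hypothetical stand-in that records its arguments instead of contacting Vantage."""

    calls = []

    def __init__(self, repo, object):
        self.repo = repo
        self.object = object

    def run(self):
        # Record what would have been rerun and report success.
        RecordingProcess.calls.append((self.repo, self.object))
        return True


result = rerun_process(RecordingProcess, "vfs_v1", "593c3326-33cb-11f0-8459-f020ff57c62c")
print(result, RecordingProcess.calls)
```

With teradataml installed and a connection open, the same helper would presumably be called with FeatureProcess in place of RecordingProcess.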

Example 4: Ingest the sales features 'Jan' and 'Feb' for the entity 'Blue Inc' only, to the 'sales' data domain

Use the 'accounts' column as the entity.

>>> fp = FeatureProcess(repo="vfs_v1",
...                     data_domain='sales',
...                     object=df,
...                     entity='accounts',
...                     features=['Jan', 'Feb'])
>>> fp.run(filters=df.accounts=='Blue Inc')
Process '7b9f76d6-562c-11f0-bb98-c934b24a960f' started.
Ingesting the features for filter 'accounts = 'Blue Inc'' to catalog.
Process '7b9f76d6-562c-11f0-bb98-c934b24a960f' completed.
True

Verify the ingested feature values.

>>> fs = FeatureStore(repo='vfs_v1', data_domain='sales')
FeatureStore is ready to use.
>>> fs.list_feature_catalogs()
            data_domain  feature_id                                 table_name                     valid_start                       valid_end
entity_name                                                                                                                                  
accounts          sales           1  FS_T_a38baff6_821b_3bb7_0850_827fe5372e31  2025-07-09 05:08:37.500000+00:  9999-12-31 23:59:59.999999+00:
accounts          sales           2  FS_T_6003dc24_375e_7fd6_46f0_eeb868305c4a  2025-07-09 05:08:37.500000+00:  9999-12-31 23:59:59.999999+00:

Verify the data.

>>> DataFrame(in_schema('vfs_v1', 'FS_T_6003dc24_375e_7fd6_46f0_eeb868305c4a'))
          feature_id  feature_value                       feature_version                     valid_start                       valid_end                     ValidPeriod
accounts                                                                                                                                                                
Blue Inc           2           90.0  c0a7704a-5c82-11f0-812f-f020ffe7fe09  2025-07-09 05:08:43.890000+00:  9999-12-31 23:59:59.999999+00:  ('2025-07-09 05:08:43.890000+0
>>> DataFrame(in_schema('vfs_v1', 'FS_T_a38baff6_821b_3bb7_0850_827fe5372e31'))
          feature_id  feature_value                       feature_version                     valid_start                       valid_end                     ValidPeriod
accounts                                                                                                                                                                
Blue Inc           1             50  c0a7704a-5c82-11f0-812f-f020ffe7fe09  2025-07-09 05:08:43.890000+00:  9999-12-31 23:59:59.999999+00:  ('2025-07-09 05:08:43.890000+0

Example 5: Create a FeatureProcess to ingest the features 'Jan_v2' and 'Feb_v2' using DataFrame 'df'

Use the 'accounts' column as the entity. Ingest the features to the 'sales' data domain.

>>> jan_feature = Feature('Jan_v2',
...                       df.Jan,
...                       feature_type=FeatureType.CATEGORICAL)
>>> feb_feature = Feature('Feb_v2',
...                       df.Feb,
...                       feature_type=FeatureType.CATEGORICAL)
>>> entity = Entity(name='accounts_v2', columns='accounts')
>>> fp = FeatureProcess(repo="vfs_v1",
...                     data_domain='sales',
...                     object=df,
...                     entity=entity,
...                     features=[jan_feature, feb_feature])
>>> fp.run()
Process '587b9a68-7b57-11f0-abc5-a188eb171d46' started.
Process '587b9a68-7b57-11f0-abc5-a188eb171d46' completed.
True