- process_id
- entity and features
- dataset_name
Optional Parameters
- process_id
- One of process_id, entity and features, or dataset_name is mandatory.
Specifies the process id of an existing feature process.
- entity
- Specifies the name of the entity, or an Entity object, to be included in the dataset.
- features
- Specifies the names of the features and their corresponding feature versions to be included in the dataset.
Each key is the name of a feature and each value is the version of that feature.
Refer to FeatureCatalog.list_feature_versions() to get the list of features and their versions.
- dataset_name
- Specifies the dataset name.
- as_of
- Specifies the point in time for which to retrieve the feature values, instead of retrieving the latest values.
- Applicable when process_id, or entity and features, is passed to the function (see Examples 5 and 6).
- Ignored when dataset_name is passed.
- include_historic_records
- Specifies whether to include historic data in the dataset.
If "as_of" is specified, then the "include_historic_records" argument is ignored.
Default value: False.
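The rule that at least one data source must be supplied can be sketched in plain Python. The helper name `supplied_sources` is hypothetical and is not part of teradataml. Treating entity and features as a pair is an assumption drawn from the way they are listed together above. The sketch only models the stated rule; it says nothing about how the library behaves when several sources are passed at once.

```python
def supplied_sources(process_id=None, entity=None, features=None, dataset_name=None):
    """Return the names of the data sources a caller supplied.

    Hypothetical helper for illustration only; it is not part of teradataml.
    """
    # Assumption: entity and features are only meaningful as a pair.
    if (entity is None) != (features is None):
        raise ValueError("entity and features must be passed together")

    sources = []
    if process_id is not None:
        sources.append("process_id")
    if entity is not None:
        sources.append("entity and features")
    if dataset_name is not None:
        sources.append("dataset_name")

    # At least one source is mandatory.
    if not sources:
        raise ValueError("pass process_id, entity and features, or dataset_name")
    return sources


print(supplied_sources(dataset_name="test_get_data"))
print(supplied_sources(entity="accounts", features={"Jan": "some-process-id"}))
```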
Example setup
>>> from teradataml import (DataFrame, DatasetCatalog, Entity, Feature, FeatureProcess,
...                         FeatureStore, FeatureType, execute_sql, load_example_data)
Create DataFrame on sales data.
>>> load_example_data("dataframe", "sales")
>>> df = DataFrame("sales")
>>> df
              Feb    Jan    Mar    Apr    datetime
accounts
Orange Inc  210.0    NaN    NaN  250.0  04/01/2017
Jones LLC   200.0  150.0  140.0  180.0  04/01/2017
Blue Inc     90.0   50.0   95.0  101.0  04/01/2017
Alpha Co    210.0  200.0  215.0  250.0  04/01/2017
Yellow Inc   90.0    NaN    NaN    NaN  04/01/2017
Create FeatureStore 'vfs_v1' or use existing one.
>>> repo = 'vfs_v1'
>>> data_domain = 'sales'
>>> fs = FeatureStore(repo=repo, data_domain=data_domain)
FeatureStore is ready to use.
Example 1: Get the data from process_id
Create a feature process.
>>> fp = FeatureProcess(repo=repo,
...                     data_domain=data_domain,
...                     object=df,
...                     entity='accounts',
...                     features=['Jan', 'Feb'])
>>> fp.run()
Process '1e9e8d64-6851-11f0-99c5-a30631e77953' started.
Process '1e9e8d64-6851-11f0-99c5-a30631e77953' completed.
True
Get data from FeatureStore.
>>> fs.get_data(process_id=fp.process_id)
     accounts    Feb    Jan
0    Alpha Co  210.0  200.0
1    Blue Inc   90.0   50.0
2   Jones LLC  200.0  150.0
3  Orange Inc  210.0    NaN
4  Yellow Inc   90.0    NaN
5     Red Inc  200.0  150.0
Example 2: Get the data from entity and features
>>> fs.get_data(entity='accounts', features={'Jan': fp.process_id})
     accounts    Jan
0    Alpha Co  200.0
1    Blue Inc   50.0
2   Jones LLC  150.0
3  Orange Inc    NaN
4  Yellow Inc    NaN
5     Red Inc  150.0
Example 3: Get the data from dataset name
Build the dataset.
>>> dc = DatasetCatalog(repo=repo, data_domain=data_domain)
>>> dc.build_dataset(entity='accounts',
... selected_features={'Jan': fp.process_id,
... 'Feb': fp.process_id},
... view_name='test_get_data',
... description='Dataset with Jan and Feb')
Get data from the dataset.
>>> fs.get_data(dataset_name='test_get_data')
     accounts    Feb    Jan
0    Alpha Co  210.0  200.0
1    Blue Inc   90.0   50.0
2   Jones LLC  200.0  150.0
3  Orange Inc  210.0    NaN
4  Yellow Inc   90.0    NaN
5     Red Inc  200.0  150.0
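Examples 2 and 3 both pass a dictionary that maps each feature name to a feature version, and in these examples the version is the process id. When every feature comes from the same process, the mapping can be built rather than typed out. The following is a minimal sketch in plain Python; `features_for_process` is a hypothetical helper, not a teradataml function, and the process id is the one printed by fp.run() in Example 1.

```python
def features_for_process(feature_names, process_id):
    """Map every feature name to the same feature version (a process id).

    Hypothetical helper for illustration only; it is not part of teradataml.
    """
    return {name: process_id for name in feature_names}


# Process id copied from the fp.run() output in Example 1.
process_id = "1e9e8d64-6851-11f0-99c5-a30631e77953"
selected = features_for_process(["Jan", "Feb"], process_id)
print(selected)
```

The resulting dictionary has the same shape as the value passed to the features argument in Example 2 and to selected_features in Example 3.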
Example 4: Get the data from entity and features, where an Entity object and the names of Feature objects are passed to the entity and features arguments
Create features.
>>> feature1 = Feature('sales:Mar',
... df.Mar,
... feature_type=FeatureType.CATEGORICAL)
>>> feature2 = Feature('sales:Apr',
... df.Apr,
... feature_type=FeatureType.CONTINUOUS)
Create an entity.
>>> entity = Entity(name='accounts_entity', columns=['accounts'])
Create a feature process.
>>> fp1 = FeatureProcess(repo=repo,
...                      data_domain=data_domain,
...                      object=df,
...                      entity=entity,
...                      features=[feature1, feature2])
>>> fp1.run()
Process '5522c034-684d-11f0-99c5-a30631e77953' started.
Process '5522c034-684d-11f0-99c5-a30631e77953' completed.
True
Get data from the entity and features.
>>> fs.get_data(entity=entity, features={feature1.name: fp1.process_id,
... feature2.name: fp1.process_id})
     accounts  sales:Mar  sales:Apr
0    Alpha Co      215.0      250.0
1    Blue Inc       95.0      101.0
2   Jones LLC      140.0      180.0
3  Orange Inc        NaN      250.0
4  Yellow Inc        NaN        NaN
5     Red Inc      140.0        NaN
Example 5: Get the data for the time passed by the user via the as_of argument
Import required packages.
>>> import time
>>> from datetime import datetime as dt, date as d
Retrieve the record where accounts == 'Blue Inc'.
>>> df_test = df[df['accounts'] == 'Blue Inc']
>>> df_test
           Feb   Jan   Mar    Apr    datetime
accounts
Blue Inc  90.0  50.0  95.0  101.0  04/01/2017
>>> df_test.to_sql('sales_test', if_exists='replace')
>>> test_df = DataFrame('sales_test')
>>> test_df
   accounts   Feb  Jan  Mar  Apr  datetime
0  Blue Inc  90.0   50   95  101  17/01/04
Create a feature process.
>>> fp = FeatureProcess(repo=repo,
...                     data_domain=data_domain,
...                     object=test_df,
...                     entity='accounts',
...                     features=['Jan', 'Feb'])
Run the feature process.
>>> fp.run()
Process '6cb49b4b-79d4-11f0-8c5e-b0dcef8381ea' started.
Process '6cb49b4b-79d4-11f0-8c5e-b0dcef8381ea' completed.
True
This example runs the same process more than once to demonstrate how to retrieve a specific version of the features using the as_of argument.
Wait for 20 seconds, update the data, then run again.
>>> time.sleep(20)
>>> execute_sql("update sales_test set Jan = Jan * 10, Feb = Feb * 10")
TeradataCursor uRowsHandle=269 bClosed=False
Run the feature process again.
>>> fp.run()
Process '6cb49b4b-79d4-11f0-8c5e-b0dcef8381ea' started.
Process '6cb49b4b-79d4-11f0-8c5e-b0dcef8381ea' completed.
True
Wait again for 20 seconds, update the data, then run again.
>>> time.sleep(20)
>>> execute_sql("update sales_test set Jan = Jan * 10, Feb = Feb * 10")
TeradataCursor uRowsHandle=397 bClosed=False
Run the feature process again.
>>> fp.run()
Process '6cb49b4b-79d4-11f0-8c5e-b0dcef8381ea' started.
Process '6cb49b4b-79d4-11f0-8c5e-b0dcef8381ea' completed.
True
Retrieve the version of the features as of '2025-08-15 12:37:23'. The time is passed to as_of first as a datetime.datetime object, then as a string.
>>> as_of_time = dt(2025, 8, 15, 12, 37, 23)
>>> fs.get_data(process_id=fp.process_id,
...             as_of=as_of_time)
   accounts    Feb  Jan
0  Blue Inc  900.0  500
>>> fs.get_data(process_id=fp.process_id,
... as_of=as_of_time.strftime('%Y-%m-%d %H:%M:%S'))
   accounts    Feb  Jan
0  Blue Inc  900.0  500
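The two calls above return the same row because the string is simply the formatted form of the same datetime. The standard-library snippet below shows the round trip; it uses only the datetime module and makes no teradataml calls.

```python
from datetime import datetime as dt

as_of_time = dt(2025, 8, 15, 12, 37, 23)

# Format the datetime the same way Example 5 does.
as_of_text = as_of_time.strftime('%Y-%m-%d %H:%M:%S')
print(as_of_text)  # 2025-08-15 12:37:23

# Parsing the string back yields the original datetime,
# so both forms name the same instant.
assert dt.strptime(as_of_text, '%Y-%m-%d %H:%M:%S') == as_of_time
```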
Example 6: Get the data for the time passed by the user via the as_of argument by sourcing entity and features
Time is passed to the as_of argument in datetime.datetime format.
>>> fs.get_data(entity='accounts',
... features={'Feb': fp.process_id,
... 'Jan': fp.process_id},
... as_of=as_of_time)
   accounts    Feb  Jan
0  Blue Inc  900.0  500
Time is passed to the as_of argument in string format.
>>> fs.get_data(entity='accounts',
... features={'Feb': fp.process_id,
... 'Jan': fp.process_id},
... as_of=as_of_time.strftime('%Y-%m-%d %H:%M:%S'))
   accounts    Feb  Jan
0  Blue Inc  900.0  500
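The results in Examples 5 and 6 are consistent with an "as of" lookup: return the value written by the latest run at or before the requested time. The sketch below models that idea in plain Python. The run times are made up for illustration, since the document does not state when each fp.run() call finished; only the Jan values (50, then multiplied by 10 twice) and the as_of time come from the examples. It is not a description of how teradataml implements as_of.

```python
from bisect import bisect_right
from datetime import datetime as dt

# Hypothetical run times, roughly 20 seconds apart as in Example 5.
run_times = [dt(2025, 8, 15, 12, 37, 0),
             dt(2025, 8, 15, 12, 37, 20),
             dt(2025, 8, 15, 12, 37, 40)]
# Jan values written by the three runs in Example 5.
jan_values = [50, 500, 5000]


def value_as_of(as_of):
    """Return the value written by the latest run at or before as_of."""
    position = bisect_right(run_times, as_of)
    if position == 0:
        return None  # nothing had been written yet
    return jan_values[position - 1]


print(value_as_of(dt(2025, 8, 15, 12, 37, 23)))  # 500
```

With these assumed run times, the as_of time from Example 5 falls between the second and third runs, which matches the Jan value of 500 shown in the output.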
Example 7: Get the latest data for the given process_id
>>> fs.get_data(process_id=fp.process_id, include_historic_records=False)
   accounts     Feb   Jan
0  Blue Inc  9000.0  5000
Example 8: Get the historic data for the given process_id
>>> fs.get_data(process_id=fp.process_id, include_historic_records=True)
   accounts     Feb   Jan
0  Blue Inc  9000.0  5000
1  Blue Inc    90.0    50
2  Blue Inc    90.0  5000
3  Blue Inc   900.0   500
4  Blue Inc   900.0  5000
5  Blue Inc   900.0    50
6  Blue Inc    90.0   500
7  Blue Inc  9000.0    50
8  Blue Inc  9000.0   500
Example 9: Get the latest data for the given feature
>>> fs.get_data(entity='accounts', features={'Feb': fp.process_id}, include_historic_records=False)
   accounts     Feb
0  Blue Inc  9000.0
Example 10: Get the historic data for the given feature
>>> fs.get_data(entity='accounts', features={'Feb': fp.process_id}, include_historic_records=True)
   accounts    Feb
0  Blue Inc  900.0
1  Blue Inc   90.0
2  Blue Inc 9000.0
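Example 10 returns three rows for one feature, while Example 8 returns nine rows for two features. The nine rows match every pairing of the three historic Feb values with the three historic Jan values. The check below confirms that using only values copied from the outputs above. It is an observation about the output shown, not a statement about how the library computes historic records.

```python
from itertools import product

# Historic values as they appear in the outputs of Examples 8 and 10.
feb_history = [90.0, 900.0, 9000.0]
jan_history = [50, 500, 5000]

# (Feb, Jan) pairs copied from the nine rows of the Example 8 output.
example_8_rows = {
    (9000.0, 5000), (90.0, 50), (90.0, 5000),
    (900.0, 500), (900.0, 5000), (900.0, 50),
    (90.0, 500), (9000.0, 50), (9000.0, 500),
}

combinations = set(product(feb_history, jan_history))
print(len(combinations))               # 9
print(combinations == example_8_rows)  # True
```

When requesting historic records for several features at once, expect the row count to grow with the number of versions of each feature.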