Teradata® Package for Python Function Reference on VantageCloud Lake
- Deployment: VantageCloud
- Edition: Lake
- Product: Teradata Package for Python
- Release Number: 20.00.00.08
- Published: November 2025
- ft:locale: en-US
- ft:lastEdition: 2025-12-05
- dita:id: TeradataPython_FxRef_Lake_2000
- Product Category: Teradata Vantage
- teradataml.store.feature_store.feature_store.FeatureStore.get_data = get_data(self, process_id=None, entity=None, features=None, dataset_name=None, as_of=None, include_historic_records=False)
- DESCRIPTION:
Returns a teradataml DataFrame that contains entities and feature values.
The method generates the dataset from one of the following:
* process_id
* entity and features
* dataset_name
PARAMETERS:
process_id:
Optional Argument.
One of "process_id", both "entity" and "features", or "dataset_name" is mandatory.
Specifies the process id of an existing feature process.
Types: str
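The rule above (one of three sources must be supplied) can be sketched as a small pre-check. This is an illustration only: selected_sources is a hypothetical helper, not part of teradataml, and the reference does not state what get_data does when more than one source is passed, so the sketch only reports which sources were given.

```python
def selected_sources(process_id=None, entity=None, features=None, dataset_name=None):
    """Return the names of the dataset sources the caller supplied.

    Hypothetical helper for illustration; it is not part of teradataml.
    """
    sources = []
    if process_id is not None:
        sources.append("process_id")
    # "entity" and "features" only count as a source when given together.
    if entity is not None and features is not None:
        sources.append("entity and features")
    if dataset_name is not None:
        sources.append("dataset_name")
    if not sources:
        raise ValueError(
            'Pass "process_id", both "entity" and "features", or "dataset_name".'
        )
    return sources
```

Calling selected_sources(entity='accounts') without features raises ValueError, because an entity on its own does not identify a dataset in this sketch.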
entity:
Optional Argument.
Specifies the name of the entity, or an Entity object,
to be considered in the dataset.
Types: str or Entity.
features:
Optional Argument.
Specifies the names of the features and the corresponding feature versions
to be included in the dataset.
Notes:
* Each key is the name of a feature and each value is the version of that
feature.
* Use FeatureCatalog.list_feature_versions() to get the list of
features and their versions.
Types: dict
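The examples on this page use the process id of a feature process as the version of every feature it produced. Under that assumption, the "features" dictionary can be built from a list of names as sketched here; feature_versions is a hypothetical helper, not a teradataml function.

```python
def feature_versions(feature_names, version):
    """Map every feature name to the same version.

    Hypothetical helper for illustration; it is not part of teradataml.
    """
    return {name: version for name in feature_names}


# The process id below is a placeholder string, not a real process id.
features = feature_versions(["Jan", "Feb"], "example-process-id")
```

The resulting dictionary has the same shape as the one passed in the examples, for instance {'Jan': fp.process_id, 'Feb': fp.process_id}.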
dataset_name:
Optional Argument.
Specifies the dataset name.
Types: str
as_of:
Optional Argument.
Specifies the point in time for which to retrieve the feature values,
instead of retrieving the latest values.
Notes:
* Applicable only when "process_id" is passed to the function.
* Ignored when "dataset_name" is passed.
Types: str or datetime.datetime
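Because "as_of" accepts either a string or a datetime.datetime, it can help to normalize the value before passing it on. The sketch below assumes the string layout used in the examples on this page ('%Y-%m-%d %H:%M:%S'); whether get_data accepts other layouts is not stated here. as_of_to_string is a hypothetical helper, not part of teradataml.

```python
from datetime import datetime

# String layout used by the as_of examples on this page (an assumption
# about what is accepted, not a documented constraint).
AS_OF_FORMAT = "%Y-%m-%d %H:%M:%S"


def as_of_to_string(as_of):
    """Return as_of as a string in AS_OF_FORMAT.

    Hypothetical helper for illustration; it is not part of teradataml.
    """
    if isinstance(as_of, datetime):
        return as_of.strftime(AS_OF_FORMAT)
    # Parsing validates the layout and raises ValueError on a mismatch.
    datetime.strptime(as_of, AS_OF_FORMAT)
    return as_of
```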
include_historic_records:
Optional Argument.
Specifies whether to include historic data in the dataset.
Note:
* If "as_of" is specified, then the "include_historic_records" argument is ignored.
Default Value: False.
Types: bool.
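The interaction between "as_of" and "include_historic_records" described in the note above can be summarized as a small decision function. This is only a sketch of the documented precedence; retrieval_mode and its return labels are hypothetical and not part of teradataml.

```python
def retrieval_mode(as_of=None, include_historic_records=False):
    """Describe which records a call would ask for.

    Hypothetical helper for illustration; it is not part of teradataml.
    """
    # When as_of is specified, include_historic_records is ignored.
    if as_of is not None:
        return "as_of"
    return "historic" if include_historic_records else "latest"
```

For example, retrieval_mode(as_of='2025-08-15 12:37:23', include_historic_records=True) returns 'as_of', reflecting that the historic flag has no effect once a point in time is given.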
RETURNS:
teradataml DataFrame.
RAISES:
TeradataMlException
EXAMPLES:
>>> from teradataml import (DataFrame, DatasetCatalog, Entity, Feature,
...                         FeatureProcess, FeatureStore, FeatureType,
...                         execute_sql, load_example_data)
# Create DataFrame on sales data.
>>> load_example_data("dataframe", "sales")
>>> df = DataFrame("sales")
>>> df
              Feb    Jan    Mar    Apr    datetime
accounts
Orange Inc  210.0    NaN    NaN  250.0  04/01/2017
Jones LLC   200.0  150.0  140.0  180.0  04/01/2017
Blue Inc     90.0   50.0   95.0  101.0  04/01/2017
Alpha Co    210.0  200.0  215.0  250.0  04/01/2017
Yellow Inc   90.0    NaN    NaN    NaN  04/01/2017
>>> repo = 'vfs_v1'
>>> data_domain = 'sales'
>>> fs = FeatureStore(repo=repo, data_domain=data_domain)
FeatureStore is ready to use.
# Example 1: Get the data from process_id.
>>> fp = FeatureProcess(repo=repo,
...                     data_domain=data_domain,
...                     object=df,
...                     entity='accounts',
...                     features=['Jan', 'Feb'])
>>> fp.run()
Process '1e9e8d64-6851-11f0-99c5-a30631e77953' started.
Process '1e9e8d64-6851-11f0-99c5-a30631e77953' completed.
True
>>> fs.get_data(process_id=fp.process_id)
     accounts    Feb    Jan
0    Alpha Co  210.0  200.0
1    Blue Inc   90.0   50.0
2   Jones LLC  200.0  150.0
3  Orange Inc  210.0    NaN
4  Yellow Inc   90.0    NaN
5     Red Inc  200.0  150.0
# Example 2: Get the data from entity and features.
>>> fs.get_data(entity='accounts', features={'Jan': fp.process_id})
     accounts    Jan
0    Alpha Co  200.0
1    Blue Inc   50.0
2   Jones LLC  150.0
3  Orange Inc    NaN
4  Yellow Inc    NaN
5     Red Inc  150.0
# Example 3: Get the data from dataset name.
>>> dc = DatasetCatalog(repo=repo, data_domain=data_domain)
>>> dc.build_dataset(entity='accounts',
...                  selected_features={'Jan': fp.process_id,
...                                     'Feb': fp.process_id},
...                  view_name='test_get_data',
...                  description='Dataset with Jan and Feb')
>>> fs.get_data(dataset_name='test_get_data')
     accounts    Feb    Jan
0    Alpha Co  210.0  200.0
1    Blue Inc   90.0   50.0
2   Jones LLC  200.0  150.0
3  Orange Inc  210.0    NaN
4  Yellow Inc   90.0    NaN
5     Red Inc  200.0  150.0
# Example 4: Get the data from entity and features, where an Entity
# object is passed to the entity argument and the names of
# Feature objects are used as keys in the features argument.
>>> # Create features.
>>> feature1 = Feature('sales:Mar',
...                    df.Mar,
...                    feature_type=FeatureType.CATEGORICAL)
>>> feature2 = Feature('sales:Apr',
...                    df.Apr,
...                    feature_type=FeatureType.CONTINUOUS)
>>> # Create entity.
>>> entity = Entity(name='accounts_entity', columns=['accounts'])
>>> fp1 = FeatureProcess(repo=repo,
...                      data_domain=data_domain,
...                      object=df,
...                      entity=entity,
...                      features=[feature1, feature2])
>>> fp1.run()
Process '5522c034-684d-11f0-99c5-a30631e77953' started.
Process '5522c034-684d-11f0-99c5-a30631e77953' completed.
True
>>> fs.get_data(entity=entity, features={feature1.name: fp1.process_id,
...                                      feature2.name: fp1.process_id})
     accounts  sales:Mar  sales:Apr
0    Alpha Co      215.0      250.0
1    Blue Inc       95.0      101.0
2   Jones LLC      140.0      180.0
3  Orange Inc        NaN      250.0
4  Yellow Inc        NaN        NaN
5     Red Inc      140.0        NaN
# Example 5: Get the data for the time passed by the user via the as_of argument.
>>> import time
>>> from datetime import datetime as dt, date as d
# Retrieve the record where accounts == 'Blue Inc'.
>>> df_test = df[df['accounts'] == 'Blue Inc']
>>> df_test
           Feb   Jan   Mar    Apr    datetime
accounts
Blue Inc  90.0  50.0  95.0  101.0  04/01/2017
# This example updates the data, so create a new table to avoid modifying the existing table's data.
>>> df_test.to_sql('sales_test', if_exists='replace')
>>> test_df = DataFrame('sales_test')
>>> test_df
   accounts   Feb  Jan  Mar  Apr  datetime
0  Blue Inc  90.0   50   95  101  17/01/04
>>> # Create a feature process.
>>> fp = FeatureProcess(repo=repo,
...                     data_domain=data_domain,
...                     object=test_df,
...                     entity='accounts',
...                     features=['Jan', 'Feb'])
>>> # Run the feature process
>>> fp.run()
Process '6cb49b4b-79d4-11f0-8c5e-b0dcef8381ea' started.
Process '6cb49b4b-79d4-11f0-8c5e-b0dcef8381ea' completed.
True
>>> # Run the same process more than once to demonstrate how to
>>> # retrieve a specific version of the features using the 'as_of' argument.
>>> # Wait for 20 seconds, update the data, then run the process again.
>>> time.sleep(20)
>>> execute_sql("update sales_test set Jan = Jan * 10, Feb = Feb * 10")
TeradataCursor uRowsHandle=269 bClosed=False
>>> # Run the feature process again.
>>> fp.run()
Process '6cb49b4b-79d4-11f0-8c5e-b0dcef8381ea' started.
Process '6cb49b4b-79d4-11f0-8c5e-b0dcef8381ea' completed.
True
>>> # Wait another 20 seconds, update the data, then run the process again.
>>> time.sleep(20)
>>> execute_sql("update sales_test set Jan = Jan * 10, Feb = Feb * 10")
TeradataCursor uRowsHandle=397 bClosed=False
>>> # Run the feature process again.
>>> fp.run()
Process '6cb49b4b-79d4-11f0-8c5e-b0dcef8381ea' started.
Process '6cb49b4b-79d4-11f0-8c5e-b0dcef8381ea' completed.
True
# Retrieve the version of the features as of '2025-08-15 12:37:23'.
>>> as_of_time = dt(2025, 8, 15, 12, 37, 23)
>>> # Time passed to as_of in datetime.datetime format.
>>> fs.get_data(process_id=fp.process_id,
...             as_of=as_of_time)
   accounts    Feb  Jan
0  Blue Inc  900.0  500
>>> # Time passed to as_of in string format.
>>> fs.get_data(process_id=fp.process_id,
...             as_of=as_of_time.strftime('%Y-%m-%d %H:%M:%S'))
   accounts    Feb  Jan
0  Blue Inc  900.0  500
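One way to read the result above is that "as_of" selects the most recent feature version recorded at or before the given time. The sketch below models that reading in plain Python for the Jan feature. It is an assumption about the semantics, not the library's implementation, and the run timestamps are hypothetical because the example does not show when each run completed.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical completion times for the three runs of the feature process,
# paired with the Jan value each run recorded (50, then x10, then x10 again).
history = [
    (datetime(2025, 8, 15, 12, 37, 0), 50),
    (datetime(2025, 8, 15, 12, 37, 21), 500),
    (datetime(2025, 8, 15, 12, 37, 42), 5000),
]


def value_as_of(history, as_of):
    """Return the value of the latest version recorded at or before as_of.

    history must be sorted by timestamp. Returns None when as_of precedes
    the first recorded version.
    """
    times = [recorded_at for recorded_at, _ in history]
    index = bisect_right(times, as_of)
    if index == 0:
        return None
    return history[index - 1][1]


jan_value = value_as_of(history, datetime(2025, 8, 15, 12, 37, 23))
```

With these assumed timestamps, the lookup falls between the second and third runs, which is consistent with the Jan value of 500 shown in the output above.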
# Example 6: Get the data for the time passed by the user via the as_of argument
# by sourcing entity and features.
>>> # time passed to as_of in datetime.datetime format.
>>> fs.get_data(entity='accounts',
...             features={'Feb': fp.process_id,
...                       'Jan': fp.process_id},
...             as_of=as_of_time)
   accounts    Feb  Jan
0  Blue Inc  900.0  500
>>> # Time passed to as_of in string format.
>>> fs.get_data(entity='accounts',
...             features={'Feb': fp.process_id,
...                       'Jan': fp.process_id},
...             as_of=as_of_time.strftime('%Y-%m-%d %H:%M:%S'))
   accounts    Feb  Jan
0  Blue Inc  900.0  500
# Example 7: Get the latest data for the given process_id.
>>> fs.get_data(process_id=fp.process_id, include_historic_records=False)
   accounts     Feb   Jan
0  Blue Inc  9000.0  5000
# Example 8: Get the historic data for the given process_id.
>>> fs.get_data(process_id=fp.process_id, include_historic_records=True)
   accounts     Feb   Jan
0  Blue Inc  9000.0  5000
1  Blue Inc    90.0    50
2  Blue Inc    90.0  5000
3  Blue Inc   900.0   500
4  Blue Inc   900.0  5000
5  Blue Inc   900.0    50
6  Blue Inc    90.0   500
7  Blue Inc  9000.0    50
8  Blue Inc  9000.0   500
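The nine rows in the Example 8 output correspond to every pairing of the three recorded Feb values with the three recorded Jan values. The sketch below reproduces that set of pairs with itertools.product. It only describes the shape of the output shown; how get_data actually assembles historic records is not stated in this reference.

```python
from itertools import product

# Values recorded by the three runs of the feature process in Example 5.
feb_history = [90.0, 900.0, 9000.0]
jan_history = [50, 500, 5000]

# Every (Feb, Jan) pairing: 3 x 3 = 9 combinations.
historic_pairs = list(product(feb_history, jan_history))
```

When only one feature is requested, as in Example 10, there is nothing to pair with, which matches the three rows shown there.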
# Example 9: Get the latest data for the given feature.
>>> fs.get_data(entity='accounts', features={'Feb': fp.process_id}, include_historic_records=False)
   accounts     Feb
0  Blue Inc  9000.0
# Example 10: Get the historic data for the given feature.
>>> fs.get_data(entity='accounts', features={'Feb': fp.process_id}, include_historic_records=True)
   accounts     Feb
0  Blue Inc   900.0
1  Blue Inc    90.0
2  Blue Inc  9000.0