build_time_series() | DatasetCatalog Method | Teradata Package for Python

Teradata® Package for Python User Guide

Deployment
VantageCloud
VantageCore
Edition
VMware
Enterprise
IntelliFlex
Product
Teradata Package for Python
Release Number
20.00
Published
March 2025
Product Category
Teradata Vantage

Use the build_time_series() method to build a dataset with the start time and end time for the feature values available in the feature catalog. Once the dataset is created, you can create a teradataml DataFrame on it.

Required Parameters

entity
Specifies the name of the entity, or the Entity object, to be included in the dataset.
selected_features
Specifies the names of the features and the corresponding feature versions to be included in the dataset.

Key is the name of the feature and value is the version of the feature. Refer to FeatureCatalog.list_feature_versions() to get the list of features and their versions.

view_name
Specifies the name of the view to be created for the dataset.
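The selected_features argument is a plain Python dict. A minimal sketch of its shape, using the illustrative version id from the examples below (real versions come from FeatureCatalog.list_feature_versions()):

```python
# selected_features maps feature name -> feature version (a process id string).
# The id below is illustrative only; list real versions with
# FeatureCatalog.list_feature_versions().
selected_features = {
    "Jan": "a9f29a4e-3f75-11f0-b43b-f020ff57c62c",
    "Feb": "a9f29a4e-3f75-11f0-b43b-f020ff57c62c",
}
```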

Optional Parameters

description
Specifies the description for the dataset.
include_historic_records
Specifies whether to include historic data in the dataset.

Default value: False
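The example outputs below suggest that a currently valid feature value carries an open-ended end time of 9999-12-31 23:59:59.999999. A minimal sketch, assuming that convention, of what excluding historic records effectively keeps:

```python
from datetime import datetime, timezone

# Open-ended sentinel observed in the *_end_time columns of the example outputs.
OPEN_END = datetime(9999, 12, 31, 23, 59, 59, 999999, tzinfo=timezone.utc)

# Three versions of the 'Feb' feature; only the last one is still valid.
rows = [
    {"Feb": 90.0,   "Feb_end_time": datetime(2025, 8, 15, 13, 24, 31, tzinfo=timezone.utc)},
    {"Feb": 900.0,  "Feb_end_time": datetime(2025, 8, 15, 13, 24, 58, tzinfo=timezone.utc)},
    {"Feb": 9000.0, "Feb_end_time": OPEN_END},
]

# include_historic_records=False keeps only rows whose end time is open-ended.
current = [r for r in rows if r["Feb_end_time"] == OPEN_END]
```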

Example setup

Ingest sales data to the feature catalog configured for repo 'vfs_v1'.

>>> from teradataml import load_example_data, DataFrame, FeatureProcess
>>> load_example_data('dataframe', 'sales')
>>> df = DataFrame("sales")
>>> df
              Feb    Jan    Mar    Apr    datetime
accounts
Red Inc     200.0  150.0  140.0    NaN  04/01/2017
Blue Inc     90.0   50.0   95.0  101.0  04/01/2017
Alpha Co    210.0  200.0  215.0  250.0  04/01/2017
Orange Inc  210.0    NaN    NaN  250.0  04/01/2017
Yellow Inc   90.0    NaN    NaN    NaN  04/01/2017
Jones LLC   200.0  150.0  140.0  180.0  04/01/2017

Create a feature store.

>>> from teradataml import FeatureStore
>>> fs = FeatureStore(repo='vfs_v1', data_domain='sales')
Repo vfs_v1 does not exist. Run FeatureStore.setup() to create the repo and setup FeatureStore.

Set up the feature store for this repository.

>>> fs.setup()
True

Initiate FeatureProcess to ingest features.

>>> fp = FeatureProcess(repo='vfs_v1',
...                     data_domain='sales',
...                     object=df,
...                     entity='accounts',
...                     features=['Jan', 'Feb', 'Mar', 'Apr'])

Run the feature process.

>>> fp.run()
Process 'a9f29a4e-3f75-11f0-b43b-f020ff57c62c' started.
Process 'a9f29a4e-3f75-11f0-b43b-f020ff57c62c' completed.

Example 1: Build dataset with features 'Jan', 'Feb' from repo 'vfs_v1' and sales data domain

Name the dataset as 'ds_jan_feb'.

>>> from teradataml import DatasetCatalog
>>> dc = DatasetCatalog(repo='vfs_v1', data_domain='sales')
>>> dataset = dc.build_time_series(entity='accounts',
...                                selected_features = {
...                                    'Jan': 'a9f29a4e-3f75-11f0-b43b-f020ff57c62c',
...                                    'Feb': 'a9f29a4e-3f75-11f0-b43b-f020ff57c62c'},
...                                view_name='ds_jan_feb',
...                                description='Dataset with Jan and Feb features')
>>> dataset
   accounts    Jan                    Jan_start_time                      Jan_end_time    Feb                    Feb_start_time                      Feb_end_time
0    Blue Inc   50.0  2025-06-20 12:17:14.040000+00:00  9999-12-31 23:59:59.999999+00:00   90.0  2025-06-20 12:17:14.040000+00:00  9999-12-31 23:59:59.999999+00:00
1     Red Inc  150.0  2025-06-20 12:17:14.040000+00:00  9999-12-31 23:59:59.999999+00:00  200.0  2025-06-20 12:17:14.040000+00:00  9999-12-31 23:59:59.999999+00:00
2  Yellow Inc    NaN  2025-06-20 12:17:14.040000+00:00  9999-12-31 23:59:59.999999+00:00   90.0  2025-06-20 12:17:14.040000+00:00  9999-12-31 23:59:59.999999+00:00
3    Alpha Co  200.0  2025-06-20 12:17:14.040000+00:00  9999-12-31 23:59:59.999999+00:00  210.0  2025-06-20 12:17:14.040000+00:00  9999-12-31 23:59:59.999999+00:00
4   Jones LLC  150.0  2025-06-20 12:17:14.040000+00:00  9999-12-31 23:59:59.999999+00:00  200.0  2025-06-20 12:17:14.040000+00:00  9999-12-31 23:59:59.999999+00:00
5  Orange Inc    NaN  2025-06-20 12:17:14.040000+00:00  9999-12-31 23:59:59.999999+00:00  210.0  2025-06-20 12:17:14.040000+00:00  9999-12-31 23:59:59.999999+00:00

Example 2: Build dataset with features 'Jan', 'Feb' including historic records from repo 'vfs_v1' and 'sales' data domain

Build a time series dataset by ingesting the same feature multiple times with updated values to show how the feature values change over time.

>>> import time

Retrieve the record where accounts == 'Blue Inc'.

>>> df_test = df[df['accounts'] == 'Blue Inc']
>>> df_test
            Feb    Jan    Mar    Apr    datetime
accounts
Blue Inc     90.0   50.0   95.0  101.0  04/01/2017

Write the record stored in the teradataml DataFrame to the database.

>>> df_test.to_sql('sales_test', if_exists='replace')
>>> test_df = DataFrame('sales_test')
>>> test_df
 accounts   Feb  Jan  Mar  Apr  datetime
0  Blue Inc  90.0   50   95  101  17/01/04

Create a feature process.

>>> fp = FeatureProcess(repo='vfs_v1',
...                     data_domain='sales',
...                     object=test_df,
...                     entity='accounts',
...                     features=['Jan', 'Feb'])

Run the feature process.

>>> fp.run()
Process '6cb49b4b-79d4-11f0-8c5e-b0dcef8381ea' started.
Process '6cb49b4b-79d4-11f0-8c5e-b0dcef8381ea' completed.
True

This example repeats the following sequence twice so that multiple versions of the feature values accumulate in the feature catalog:
  • Wait 20 seconds.
  • Update the data.
  • Run the feature process.

>>> from teradataml import execute_sql
>>> time.sleep(20)
>>> execute_sql("update sales_test set Jan = Jan * 10, Feb = Feb * 10")
TeradataCursor uRowsHandle=269 bClosed=False
>>> fp.run()
Process '6cb49b4b-79d4-11f0-8c5e-b0dcef8381ea' started.
Process '6cb49b4b-79d4-11f0-8c5e-b0dcef8381ea' completed.
True
>>> time.sleep(20)
>>> execute_sql("update sales_test set Jan = Jan * 10, Feb = Feb * 10")
TeradataCursor uRowsHandle=397 bClosed=False
>>> fp.run()
Process '6cb49b4b-79d4-11f0-8c5e-b0dcef8381ea' started.
Process '6cb49b4b-79d4-11f0-8c5e-b0dcef8381ea' completed.
True
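
Each update above multiplies the values by 10 before re-ingesting, so three versions of each feature accumulate in the catalog. In outline:

```python
# Sketch of the value evolution produced by the two update-and-run rounds above.
jan, feb = 50, 90.0
jan_versions, feb_versions = [jan], [feb]
for _ in range(2):  # two rounds of: update values, rerun the feature process
    jan *= 10
    feb *= 10
    jan_versions.append(jan)
    feb_versions.append(feb)
```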

Build the time series dataset with features 'Feb', 'Jan' by excluding the historic records from repo 'vfs_v1' and 'sales' data domain.

>>> dc = DatasetCatalog(repo='vfs_v1', data_domain='sales')
>>> exclude_history = dc.build_time_series(entity='accounts',
...                                        selected_features={'Feb': fp.process_id,
...                                                           'Jan': fp.process_id},
...                                        view_name='exclude_history',
...                                        include_historic_records=False)
>>> exclude_history
 accounts     Feb                    Feb_start_time                      Feb_end_time   Jan                    Jan_start_time                      Jan_end_time
0  Blue Inc  9000.0  2025-08-15 13:24:58.140000+00:00  9999-12-31 23:59:59.999999+00:00  5000  2025-08-15 13:24:58.140000+00:00  9999-12-31 23:59:59.999999+00:00

Build the time series dataset with features 'Feb', 'Jan' by including the historic records.

>>> dc = DatasetCatalog(repo='vfs_v1', data_domain='sales')
>>> include_history = dc.build_time_series(entity='accounts',
...                                        selected_features={'Feb': fp.process_id,
...                                                           'Jan': fp.process_id},
...                                        view_name='include_history',
...                                        include_historic_records=True)
>>> include_history
 accounts     Feb                    Feb_start_time                      Feb_end_time   Jan                    Jan_start_time                      Jan_end_time
0  Blue Inc  9000.0  2025-08-15 13:24:58.140000+00:00  9999-12-31 23:59:59.999999+00:00  5000  2025-08-15 13:24:58.140000+00:00  9999-12-31 23:59:59.999999+00:00
1  Blue Inc    90.0  2025-08-15 13:23:41.780000+00:00  2025-08-15 13:24:31.320000+00:00    50  2025-08-15 13:23:41.780000+00:00  2025-08-15 13:24:31.320000+00:00
2  Blue Inc    90.0  2025-08-15 13:23:41.780000+00:00  2025-08-15 13:24:31.320000+00:00  5000  2025-08-15 13:24:58.140000+00:00  9999-12-31 23:59:59.999999+00:00
3  Blue Inc   900.0  2025-08-15 13:24:31.320000+00:00  2025-08-15 13:24:58.140000+00:00   500  2025-08-15 13:24:31.320000+00:00  2025-08-15 13:24:58.140000+00:00
4  Blue Inc   900.0  2025-08-15 13:24:31.320000+00:00  2025-08-15 13:24:58.140000+00:00  5000  2025-08-15 13:24:58.140000+00:00  9999-12-31 23:59:59.999999+00:00
5  Blue Inc   900.0  2025-08-15 13:24:31.320000+00:00  2025-08-15 13:24:58.140000+00:00    50  2025-08-15 13:23:41.780000+00:00  2025-08-15 13:24:31.320000+00:00
6  Blue Inc    90.0  2025-08-15 13:23:41.780000+00:00  2025-08-15 13:24:31.320000+00:00   500  2025-08-15 13:24:31.320000+00:00  2025-08-15 13:24:58.140000+00:00
7  Blue Inc  9000.0  2025-08-15 13:24:58.140000+00:00  9999-12-31 23:59:59.999999+00:00    50  2025-08-15 13:23:41.780000+00:00  2025-08-15 13:24:31.320000+00:00
8  Blue Inc  9000.0  2025-08-15 13:24:58.140000+00:00  9999-12-31 23:59:59.999999+00:00   500  2025-08-15 13:24:31.320000+00:00  2025-08-15 13:24:58.140000+00:00
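
With include_historic_records=True, the output above contains every combination of the three Jan versions with the three Feb versions (9 rows), not only time-overlapping pairs. A rough illustration of that row count, assuming a plain cross join of the feature versions:

```python
from itertools import product

# Three ingested versions of each feature (see the update rounds above).
jan_versions = [50, 500, 5000]
feb_versions = [90.0, 900.0, 9000.0]

# Each dataset row pairs one Feb version with one Jan version.
rows = [(feb, jan) for feb, jan in product(feb_versions, jan_versions)]
```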