datasets | DataDomain Property | Teradata Package for Python

Teradata® Package for Python User Guide

Deployment: VantageCloud, VantageCore
Edition: VMware, Enterprise, IntelliFlex
Product: Teradata Package for Python
Release Number: 20.00
Published: March 2025
Last Edition: 2025-12-05
Product Category: Teradata Vantage

Use the datasets property to get the list of dataset objects associated with the data domain.

Example setup

Load the data to be used.

>>> from teradataml import load_example_data, DataFrame
>>> load_example_data('dataframe', ['sales', 'admissions_train'])
>>> df = DataFrame('sales')
>>> admission_df = DataFrame('admissions_train')

Define the repository and data domain names.

>>> repo = 'vfs_test'
>>> data_domain = 'sales'

Create a feature store.

>>> from teradataml import FeatureStore
>>> fs = FeatureStore(repo=repo, data_domain=data_domain)
Repo vfs_test does not exist. Run FeatureStore.setup() to create the repo and setup FeatureStore.

Set up the feature store for this repository.

>>> fs.setup()
True

Run FeatureProcess to ingest features.

>>> from teradataml import FeatureProcess
>>> fp = FeatureProcess(repo=repo,
...                     data_domain=data_domain,
...                     object=df,
...                     entity='accounts',
...                     features=['Jan', 'Feb', 'Mar', 'Apr'])
>>> fp.run()
Process '4098c3ea-6c8d-11f0-837a-24eb16d15109' started.
Process '4098c3ea-6c8d-11f0-837a-24eb16d15109' completed.

Build the dataset.

>>> from teradataml import DatasetCatalog
>>> dataset_catalog = DatasetCatalog(repo=repo, data_domain=data_domain)
>>> dataset_catalog.build_dataset(entity='accounts',
...                               selected_features={
...                                   'Jan': fp.process_id,
...                                   'Feb': fp.process_id,
...                                   'Mar': fp.process_id},
...                               view_name='dd_test_view',
...                               description='DataDomain Test')
     accounts    Jan    Feb    Mar
0  Yellow Inc    NaN   90.0    NaN
1    Alpha Co  200.0  210.0  215.0
2   Jones LLC  150.0  200.0  140.0
3    Blue Inc   50.0   90.0   95.0
4  Orange Inc    NaN  210.0    NaN
5     Red Inc  150.0  200.0  140.0
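
In the call above, every selected feature maps to the same process ID. The following is a minimal sketch of building that mapping without repeating the ID, using only the Python standard library. The helper name `features_from_process` is hypothetical (it is not part of teradataml), and the process ID string is a placeholder standing in for `fp.process_id`.

```python
def features_from_process(feature_names, process_id):
    """Map every feature name to one process ID (hypothetical helper)."""
    return dict.fromkeys(feature_names, process_id)


# Placeholder value; in the session above this would be fp.process_id.
process_id = '4098c3ea-6c8d-11f0-837a-24eb16d15109'

selected_features = features_from_process(['Jan', 'Feb', 'Mar'], process_id)
print(selected_features)
```

The resulting dictionary has the same shape as the `selected_features` argument passed to `build_dataset` above.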

Run another FeatureProcess to ingest features from a different DataFrame.

>>> fp2 = FeatureProcess(repo=repo,
...                      data_domain=data_domain,
...                      object=admission_df,
...                      entity='id',
...                      features=['masters', 'gpa', 'stats', 'admitted'])
>>> fp2.run()
Process '5129c4eb-7d9e-21f1-947b-35fb27e26210' started.
Process '5129c4eb-7d9e-21f1-947b-35fb27e26210' completed.

Build a second dataset.

>>> dataset_catalog.build_dataset(entity='id',
...                               selected_features={
...                                   'masters': fp2.process_id,
...                                   'stats': fp2.process_id,
...                                   'gpa': fp2.process_id},
...                               view_name='dd_test_view_2',
...                               description='DataDomain Test 2')
   id masters   gpa     stats
0  13      no  4.00  Advanced
1  36      no  3.00  Advanced
2  15     yes  4.00  Advanced
3  38     yes  2.65  Advanced
4   5      no  3.44    Novice
5  40     yes  3.95    Novice
6   7     yes  2.33    Novice
7  22     yes  3.46    Novice
8  26     yes  3.57  Advanced
9  19     yes  1.98  Advanced
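
A feature name in `selected_features` that was never ingested is easy to miss. The following sketch checks the requested names against the list passed to `FeatureProcess` before the dataset is built. The helper `unknown_features` is hypothetical and not part of teradataml, the `'pid'` values are placeholders for real process IDs, and `'master'` is used here only as an example of a misspelled name.

```python
def unknown_features(selected_features, ingested_features):
    """Return requested feature names that are absent from the ingested list."""
    ingested = set(ingested_features)
    return [name for name in selected_features if name not in ingested]


# Feature names given to FeatureProcess for the admissions data.
ingested = ['masters', 'gpa', 'stats', 'admitted']

# 'pid' is a placeholder for a process ID; 'master' is a deliberate misspelling.
requested = {'master': 'pid', 'stats': 'pid', 'gpa': 'pid'}

missing = unknown_features(requested, ingested)
print(missing)
```

An empty list means every requested name matches an ingested feature.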

Example: Get the datasets in the data domain

Create a data domain object.

>>> from teradataml import DataDomain
>>> dd = DataDomain(repo=repo,
...                 data_domain=data_domain)

List the datasets.

>>> dd.datasets
[Dataset(repo=vfs_test, id=f4450459-6f17-4155-8b8a-01764701e4aa, data_domain=sales),
 Dataset(repo=vfs_test, id=75533508-74f4-40ca-bd63-8ea6710d9700, data_domain=sales)]
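
The attribute names of the returned `Dataset` objects are not shown on this page. If only the dataset IDs are needed as plain strings, one possible approach is to parse the display text. The sketch below assumes the display format matches the output above exactly; that format is taken from this example and is not a documented interface. The names `DATASET_REPR` and `parse_dataset_repr` are hypothetical.

```python
import re

# Pattern for the display format shown in the example output (an assumption).
DATASET_REPR = re.compile(
    r"Dataset\(repo=(?P<repo>[^,]+), "
    r"id=(?P<id>[^,]+), "
    r"data_domain=(?P<data_domain>[^)]+)\)"
)


def parse_dataset_repr(text):
    """Return repo, id, and data_domain as a dict, or None if the text does not match."""
    match = DATASET_REPR.fullmatch(text.strip())
    return match.groupdict() if match else None


# Display strings copied from the example output above.
shown = [
    "Dataset(repo=vfs_test, id=f4450459-6f17-4155-8b8a-01764701e4aa, data_domain=sales)",
    "Dataset(repo=vfs_test, id=75533508-74f4-40ca-bd63-8ea6710d9700, data_domain=sales)",
]

dataset_ids = [parse_dataset_repr(s)['id'] for s in shown]
print(dataset_ids)
```

In a live session, the same helper could be applied to `repr(d)` for each `d` in `dd.datasets`, provided the display format is unchanged.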