get_dataset() | DatasetCatalog Method | Teradata Package for Python

Teradata® Package for Python User Guide

Deployment: VantageCloud, VantageCore
Edition: VMware, Enterprise, IntelliFlex
Product: Teradata Package for Python
Release Number: 20.00
Published: March 2025
Last edition: 2025-12-05
Product Category: Teradata Vantage

Use the get_dataset() method to retrieve the dataset object for a given dataset id.

Required Parameter

id
Specifies the id of the dataset to retrieve.
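
The id used in the example on this page is a UUID string. Assuming that holds for every dataset id (the source does not state it), an id could be checked before it is passed to get_dataset(). The following is a minimal sketch under that assumption; the helper name is_uuid_like is hypothetical and is not part of teradataml.

```python
import uuid


def is_uuid_like(dataset_id):
    """Return True if dataset_id is a UUID string in canonical hyphenated form.

    Hypothetical helper, not part of teradataml. It assumes dataset ids
    are UUID strings, as in the example listing on this page.
    """
    if not isinstance(dataset_id, str):
        return False
    try:
        parsed = uuid.UUID(dataset_id)
    except ValueError:
        # Not parseable as a UUID at all (wrong length or non-hex characters).
        return False
    # uuid.UUID also accepts forms without hyphens; require the canonical form.
    return str(parsed) == dataset_id.lower()
```

A check like this only catches malformed ids early. It cannot tell whether a well-formed id actually exists in the catalog.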

Example setup

Before building a dataset, ingest features. Start by loading the example sales data into a DataFrame.

>>> from teradataml import DataFrame, FeatureProcess, FeatureStore, load_example_data
>>> load_example_data('dataframe', 'sales')
>>> df = DataFrame("sales")

Create a feature store.

>>> fs = FeatureStore(repo='vfs_v1', data_domain='sales')
Repo vfs_v1 does not exist. Run FeatureStore.setup() to create the repo and setup FeatureStore.

Set up FeatureStore for this repository.

>>> fs.setup()
True

Run FeatureProcess to ingest features.

>>> fp = FeatureProcess(repo='vfs_v1', data_domain='sales', object=df, entity='accounts',
...                     features=['Jan', 'Feb', 'Mar', 'Apr'])
>>> fp.run()
Process '3acf5632-5d73-11f0-99c5-a30631e77953' started.
Process '3acf5632-5d73-11f0-99c5-a30631e77953' completed.

Build a dataset.

>>> from teradataml import DatasetCatalog
>>> dc = DatasetCatalog(repo='vfs_v1', data_domain='sales')
>>> dataset = dc.build_dataset(entity='accounts',
...                            selected_features={'Jan': fp.process_id,
...                                               'Feb': fp.process_id},
...                            view_name='ds_jan_feb',
...                            description='Dataset with Jan and Feb features')

List the datasets.

>>> dc.list_datasets()
                                     data_domain          name entity_name                            description                       valid_start                       valid_end
id
851a651a-68a3-4eb6-b606-df2617089068       sales  ds_jan_feb_1    accounts      Dataset with Jan and Feb features    2025-07-10 09:50:25.527852+00:  9999-12-31 23:59:59.999999+00:

Example: Get the dataset

>>> ds = dc.get_dataset('851a651a-68a3-4eb6-b606-df2617089068')
>>> ds
Dataset(repo=vfs_v1, id=851a651a-68a3-4eb6-b606-df2617089068, data_domain=sales)
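
When several ids are at hand, the same call can be repeated for each one. The sketch below is illustrative only: get_datasets_by_id and StubCatalog are hypothetical names, not part of teradataml, and the stub exists solely so the code runs without a Vantage connection. The only library behavior relied on is the get_dataset(id) call shown above.

```python
def get_datasets_by_id(catalog, dataset_ids):
    """Fetch several datasets and return them in a dict keyed by id.

    Hypothetical helper, not part of teradataml. `catalog` is any object
    with a get_dataset(id) method, such as the `dc` catalog created above.
    """
    return {dataset_id: catalog.get_dataset(dataset_id)
            for dataset_id in dataset_ids}


class StubCatalog:
    """Stand-in for DatasetCatalog so the sketch runs without a connection."""

    def get_dataset(self, id):
        # A real catalog returns a Dataset object; the stub returns a string.
        return f"Dataset(id={id})"


datasets = get_datasets_by_id(StubCatalog(),
                              ["851a651a-68a3-4eb6-b606-df2617089068"])
```

With a live session, passing the dc object from the setup in place of StubCatalog() would issue one get_dataset() call per id.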