Use the DatasetCatalog class to manage datasets within the Enterprise Feature Store. It provides functionality to create, list, retrieve, archive, and delete datasets, as well as manage dataset-related operations.
Syntax
DatasetCatalog(repo, data_domain=None)
Required Parameter
- repo
- Specifies the name of the database where the feature store is set up.
Optional Parameter
- data_domain
- Specifies the name of the data domain to use when managing datasets. If not specified, the default database is used as the data domain.
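The two call forms described above, with and without data_domain, can be sketched as follows. The helper name open_catalog is hypothetical and is not part of teradataml. The sketch assumes that a Vantage connection is already established and that the feature store in repo has been set up. The import is deferred so the helper can be defined before a connection exists.

```python
def open_catalog(repo, data_domain=None):
    """Hypothetical convenience wrapper; not part of teradataml.

    Assumes a Vantage connection is already established and that the
    feature store in `repo` has been set up with FeatureStore.setup().
    """
    # Deferred import: teradataml is only needed when the helper is called.
    from teradataml import DatasetCatalog

    if data_domain is None:
        # Omit the argument so DatasetCatalog applies its documented default
        # (the default database is used as the data domain).
        return DatasetCatalog(repo=repo)
    return DatasetCatalog(repo=repo, data_domain=data_domain)
```

Calling open_catalog('vfs_v1', 'sales') corresponds to DatasetCatalog(repo='vfs_v1', data_domain='sales'), while open_catalog('vfs_v1') leaves the data domain to the default.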
Example setup
Load the example sales data and create a DataFrame from it.
>>> from teradataml import DataFrame, FeatureStore, FeatureProcess, load_example_data
>>> load_example_data('dataframe', 'sales')
>>> df = DataFrame("sales")
Create a feature store.
>>> fs = FeatureStore(repo='vfs_v1', data_domain='sales')
repo vfs_v1 does not exist. Run FeatureStore.setup() to create the repo and setup FeatureStore.
Set up the feature store for this repository.
>>> fs.setup()
True
Run FeatureProcess to ingest features.
>>> fp = FeatureProcess(repo='vfs_v1', data_domain='sales', object=df, entity='accounts',
...                     features=['Jan', 'Feb', 'Mar', 'Apr'])
Example: Create a DatasetCatalog instance
>>> from teradataml import DatasetCatalog
>>> dc = DatasetCatalog(repo='vfs_v1', data_domain='sales')