Use the datasets property to get the list of dataset objects associated with the data domain.
Example setup
Load the data to be used.
>>> from teradataml import load_example_data, DataFrame
>>> load_example_data('dataframe', ['sales', 'admissions_train'])
>>> df = DataFrame('sales')
>>> admission_df = DataFrame('admissions_train')
Define the repository and data domain names.
>>> repo = 'vfs_test'
>>> data_domain = 'sales'
Create a feature store.
>>> from teradataml import FeatureStore
>>> fs = FeatureStore(repo=repo, data_domain=data_domain)
Repo vfs_test does not exist. Run FeatureStore.setup() to create the repo and setup FeatureStore.
Set up the feature store for this repository.
>>> fs.setup()
True
Run FeatureProcess to ingest features.
>>> from teradataml import FeatureProcess
>>> fp = FeatureProcess(repo=repo,
...                     data_domain=data_domain,
...                     object=df,
...                     entity='accounts',
...                     features=['Jan', 'Feb', 'Mar', 'Apr'])
>>> fp.run()
Process '4098c3ea-6c8d-11f0-837a-24eb16d15109' started.
Process '4098c3ea-6c8d-11f0-837a-24eb16d15109' completed.
Build the dataset.
>>> from teradataml import DatasetCatalog
>>> dataset_catalog = DatasetCatalog(repo=repo, data_domain=data_domain)
>>> dataset_catalog.build_dataset(entity='accounts',
...                               selected_features={
...                                   'Jan': fp.process_id,
...                                   'Feb': fp.process_id,
...                                   'Mar': fp.process_id},
...                               view_name='dd_test_view',
...                               description='DataDomain Test')
     accounts    Jan    Feb    Mar
0  Yellow Inc    NaN   90.0    NaN
1    Alpha Co  200.0  210.0  215.0
2   Jones LLC  150.0  200.0  140.0
3    Blue Inc   50.0   90.0   95.0
4  Orange Inc    NaN  210.0    NaN
5     Red Inc  150.0  200.0  140.0
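The selected_features argument above maps each feature name to the ID of the process that ingested it. When every feature comes from the same process, the mapping can be built from a list of names instead of being typed by hand, which helps avoid misspelled keys. The following is a minimal sketch in plain Python; the helper name features_from_process is hypothetical and not part of teradataml, and the process ID is the one printed by fp.run() above, used here as a placeholder so the snippet runs without a database connection.

```python
# Hypothetical helper (not part of teradataml): map every feature name
# to the same process ID.
def features_from_process(feature_names, process_id):
    """Return a dict suitable for a selected_features-style argument."""
    return {name: process_id for name in feature_names}


# Placeholder ID copied from the fp.run() output above; in the walkthrough
# this value would come from fp.process_id.
process_id = '4098c3ea-6c8d-11f0-837a-24eb16d15109'

selected_features = features_from_process(['Jan', 'Feb', 'Mar'], process_id)
print(selected_features)
```

The resulting dictionary has the same shape as the literal passed to build_dataset above.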
Run another FeatureProcess to ingest features from a different DataFrame.
>>> fp2 = FeatureProcess(repo=repo,
...                      data_domain=data_domain,
...                      object=admission_df,
...                      entity='id',
...                      features=['masters', 'gpa', 'stats', 'admitted'])
>>> fp2.run()
Process '5129c4eb-7d9e-21f1-947b-35fb27e26210' started.
Process '5129c4eb-7d9e-21f1-947b-35fb27e26210' completed.
Build the dataset.
>>> dataset_catalog.build_dataset(entity='id',
...                               selected_features={
...                                   'masters': fp2.process_id,
...                                   'stats': fp2.process_id,
...                                   'gpa': fp2.process_id},
...                               view_name='dd_test_view_2',
...                               description='DataDomain Test 2')
   id masters   gpa     stats
0  13      no  4.00  Advanced
1  36      no  3.00  Advanced
2  15     yes  4.00  Advanced
3  38     yes  2.65  Advanced
4   5      no  3.44    Novice
5  40     yes  3.95    Novice
6   7     yes  2.33    Novice
7  22     yes  3.46    Novice
8  26     yes  3.57  Advanced
9  19     yes  1.98  Advanced
Example: Get the datasets in the data domain
Create a data domain object.
>>> from teradataml import DataDomain
>>> dd = DataDomain(repo=repo,
...                 data_domain=data_domain)
List the datasets.
>>> dd.datasets
[Dataset(repo=vfs_test, id=f4450459-6f17-4155-8b8a-01764701e4aa, data_domain=sales), Dataset(repo=vfs_test, id=75533508-74f4-40ca-bd63-8ea6710d9700, data_domain=sales)]
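Because the property returns an ordinary Python list, the usual list operations such as len() and iteration should apply to it. The following is a minimal sketch of printing one dataset per line; the helper name describe_datasets is hypothetical, and it relies only on len() and repr(), not on any attribute of the dataset objects. Stand-in strings are used so the snippet runs without a Vantage connection; in the walkthrough above, dd.datasets would be passed instead.

```python
# Hypothetical helper (not part of teradataml): summarize a list of datasets
# using only len() and repr(), which work on any Python objects.
def describe_datasets(datasets):
    """Return a count line followed by one numbered line per dataset."""
    lines = [f"{len(datasets)} dataset(s) found"]
    lines.extend(f"  {i}: {ds!r}" for i, ds in enumerate(datasets))
    return lines


# Stand-in values for illustration only; replace with dd.datasets.
stand_in_datasets = ['first-dataset', 'second-dataset']

for line in describe_datasets(stand_in_datasets):
    print(line)
```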