The mind_map() method generates an interactive visual representation of your FeatureStore ecosystem. It turns complex data relationships into clear visualizations that both technical teams and business stakeholders can act on.
mind_map() works only in Jupyter Notebook.
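Because of this restriction, code that might also run outside a notebook may want to guard the call. The following is a minimal sketch of a commonly used heuristic and is not part of teradataml: `in_jupyter_notebook` is a hypothetical helper that checks whether IPython's `get_ipython()` returns a shell of class `ZMQInteractiveShell`, which Jupyter kernels typically use. Other front ends may report the same class, so treat the result as a best guess.

```python
def in_jupyter_notebook():
    """Best-effort check for a Jupyter kernel (hypothetical helper, heuristic only)."""
    try:
        from IPython import get_ipython
    except ImportError:
        # IPython is not installed, so this cannot be a Jupyter kernel.
        return False
    shell = get_ipython()
    # get_ipython() returns None in a plain Python interpreter.
    return shell is not None and shell.__class__.__name__ == "ZMQInteractiveShell"


running_in_notebook = in_jupyter_notebook()
# Assumed usage, with `fs` being an existing FeatureStore object:
# if running_in_notebook:
#     fs.mind_map()
```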
mind_map() addresses several critical challenges in feature engineering and data science workflows:
- Complex relationship visualization: As your FeatureStore grows, understanding the relationships between data sources, features, and datasets becomes increasingly complex. The mind map provides a clear visual representation of these connections.
- Data lineage tracking: Quickly trace how data flows from source tables through feature processes to final datasets, enabling better data governance and debugging.
- Impact analysis: Before making changes to data sources or features, visualize which downstream components will be affected.
- Feature discovery: Easily discover existing features and datasets to avoid duplication and promote reuse.
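The lineage-tracking and impact-analysis ideas above can be illustrated with a small, library-independent sketch. The graph below is a hand-written dictionary, not teradataml output: node names such as `table:sales` and `process:fp1` are assumptions for illustration, and `downstream` is a hypothetical helper that walks the graph breadth-first to list everything affected by a change to a given node.

```python
from collections import deque

# Hypothetical lineage graph: each key maps to the components built from it.
LINEAGE = {
    "table:sales": ["process:fp1"],
    "process:fp1": ["feature:Jan", "feature:Feb", "feature:Mar", "feature:Apr"],
    "feature:Jan": ["dataset:ds_jan_feb"],
    "feature:Feb": ["dataset:ds_jan_feb"],
    "feature:Mar": [],
    "feature:Apr": [],
    "dataset:ds_jan_feb": [],
}


def downstream(graph, start):
    """Return every node reachable from `start`, in breadth-first order."""
    seen = []
    queue = deque(graph.get(start, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already recorded via another path
        seen.append(node)
        queue.extend(graph.get(node, []))
    return seen


# Impact of changing a single feature versus the source table.
affected_by_jan = downstream(LINEAGE, "feature:Jan")
affected_by_table = downstream(LINEAGE, "table:sales")
print(affected_by_jan)
print(affected_by_table)
```

In this toy graph, changing the feature `Jan` affects only the dataset built from it, while changing the source table affects the feature process, all four features, and the dataset.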
Example setup
Ingest sales data into the feature catalog configured for repo 'test'.
>>> from teradataml import DataFrame, load_example_data, FeatureProcess
>>> load_example_data('dataframe', 'sales')
>>> df = DataFrame("sales")
>>> df
              Feb    Jan    Mar    Apr    datetime
accounts
Red Inc     200.0  150.0  140.0    NaN  04/01/2017
Blue Inc     90.0   50.0   95.0  101.0  04/01/2017
Alpha Co    210.0  200.0  215.0  250.0  04/01/2017
Orange Inc  210.0    NaN    NaN  250.0  04/01/2017
Yellow Inc   90.0    NaN    NaN    NaN  04/01/2017
Jones LLC   200.0  150.0  140.0  180.0  04/01/2017
Create a FeatureStore.
>>> from teradataml import FeatureStore
>>> fs = FeatureStore(repo='test', data_domain='sales')
repo test does not exist. Run FeatureStore.setup() to create the repo and setup FeatureStore.
>>> fs.setup()
True
Initiate FeatureProcess to ingest features.
>>> fp = FeatureProcess(repo='test', data_domain='sales', object=df, entity='accounts', features=['Jan', 'Feb', 'Mar', 'Apr'])
Run the feature process.
>>> fp.run()
Process 'a9f29a4e-3f75-11f0-b43b-f020ff57c62c' started.
Process 'a9f29a4e-3f75-11f0-b43b-f020ff57c62c' completed.
Example: Build a dataset with features 'Jan' and 'Feb' from repo 'test' and data domain 'sales'
Name the dataset 'ds_jan_feb'.
>>> from teradataml import DatasetCatalog
>>> dc = DatasetCatalog(repo='test', data_domain='sales')
>>> dataset = dc.build_dataset(entity='accounts',
...                            selected_features={
...                                'Jan': 'a9f29a4e-3f75-11f0-b43b-f020ff57c62c',
...                                'Feb': 'a9f29a4e-3f75-11f0-b43b-f020ff57c62c'},
...                            view_name='ds_jan_feb',
...                            description='Dataset with Jan and Feb features')
>>> dataset
     accounts    Jan    Feb
0    Blue Inc   50.0   90.0
1    Alpha Co  200.0  210.0
2  Yellow Inc    NaN   90.0
3  Orange Inc    NaN  210.0
4   Jones LLC  150.0  200.0
5     Red Inc  150.0  200.0
Generate the mind map. This displays an interactive visualization.
>>> fs.mind_map()