Feature
| Definition | A measurable property or characteristic of the input dataset that serves as an input to machine learning functions. |
| Teradata Component | teradataml.Feature, a class that represents Feature. |
| Examples |
|
Feature Value
| Definition | The actual value of a feature for a specific instance or data point. |
| Teradata Context | Feature Values are the actual data stored in feature columns within temporal tables. |
| Examples |
|
Entity
| Definition | A collection of semantically related features that represents a business object. |
| Teradata Component | teradataml.Entity, a class that represents Entity. |
| Examples |
|
Data Source
| Definition | The source of data that feeds into the ML model and feature store. |
| Teradata Context | Data Source can be a table, view, or any SQL query that provides data for feature extraction. |
| Teradata Component | teradataml.DataSource, a class that represents DataSource. |
| Types of Data Sources |
|
Feature Group
| Definition | A logical collection of features that share the same entity and are stored in the same source, enabling organized feature management and reusability. |
| Teradata Context | Feature Group is a logical object that groups related features together for easier management and ensures point-in-time correctness. |
| Teradata Component | teradataml.FeatureGroup, a class that represents FeatureGroup. |
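
The relationship between the terms defined so far (Feature, Entity, Data Source, Feature Group) can be sketched in plain Python. This is a minimal illustration under stated assumptions: the dataclasses below are hypothetical stand-ins, not the teradataml classes, and every name, field, table, and column in it is invented for the example.

```python
from dataclasses import dataclass, field


# Hypothetical stand-ins for illustration only; these are NOT the teradataml classes.
@dataclass(frozen=True)
class Entity:
    name: str
    key_columns: tuple  # columns that identify one business object


@dataclass(frozen=True)
class DataSource:
    name: str
    query: str  # a table, view, or SQL query


@dataclass(frozen=True)
class Feature:
    name: str
    column: str  # column in the data source that holds the feature values


@dataclass
class FeatureGroup:
    name: str
    entity: Entity            # every feature in the group shares this entity
    data_source: DataSource   # and is read from this one source
    features: list = field(default_factory=list)

    def add(self, feature: Feature) -> None:
        # Reject duplicate names so each feature is unambiguous within the group.
        if any(f.name == feature.name for f in self.features):
            raise ValueError(f"duplicate feature: {feature.name}")
        self.features.append(feature)


# Assumed example objects: a customer entity fed by a sales summary query.
customer = Entity("customer", ("customer_id",))
sales = DataSource("sales", "SELECT customer_id, total_spend, visits FROM sales_summary")
group = FeatureGroup("customer_sales", customer, sales)
group.add(Feature("total_spend", "total_spend"))
group.add(Feature("visits", "visits"))
```

Holding the entity and data source on the group, rather than on each feature, mirrors the definition above: all features in a group share one entity and one source.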
Feature Catalog
| Definition | A centralized registry that organizes and tracks all feature values available for ML workflows. |
| Teradata Component | teradataml.FeatureCatalog, a class that represents FeatureCatalog. |
| Catalog Capabilities |
|
Feature Process
| Definition | A workflow process, identified by a unique process ID, that computes feature values from data sources and ingests them into the feature catalog. |
| Teradata Context | Feature Process takes the feature values produced by an ETL pipeline and stores them in the feature catalog. The ETL pipeline is built from teradataml DataFrame operations and is persisted by creating a view on the final DataFrame. |
| Teradata Component | teradataml.FeatureProcess, a class that represents FeatureProcess. |
Process Catalog
| Definition | A centralized registry that stores details of feature processes executed over time, providing an audit trail and process management capabilities. |
| Teradata Context | Process Catalog is a table that maintains the history of all feature processing workflows. |
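
The interplay of Feature Process, Feature Catalog, and Process Catalog can be sketched as follows. This is a plain-Python illustration, not the teradataml implementation: the in-memory dictionary and list stand in for the catalog tables, and `run_feature_process`, the record fields, and the sample rows are all assumptions made for the example.

```python
import uuid
from datetime import datetime, timezone

# Hypothetical in-memory stand-ins for the catalog tables.
feature_catalog = {}   # (entity key, feature name) -> latest ingested value
process_catalog = []   # one record per executed run, kept as history


def run_feature_process(rows, feature_names, process_id=None):
    """Ingest feature values from rows and log the run under a process ID."""
    # A new process gets a fresh unique ID; a rerun passes its existing ID.
    process_id = process_id or str(uuid.uuid4())
    for row in rows:
        for name in feature_names:
            feature_catalog[(row["customer_id"], name)] = row[name]
    # Append rather than overwrite, so earlier runs stay available for audit.
    process_catalog.append({
        "process_id": process_id,
        "features": list(feature_names),
        "row_count": len(rows),
        "run_at": datetime.now(timezone.utc),
    })
    return process_id


# Assumed sample output of an ETL pipeline.
rows = [
    {"customer_id": 1, "total_spend": 120.0},
    {"customer_id": 2, "total_spend": 80.0},
]
pid = run_feature_process(rows, ["total_spend"])
# Rerunning the same process reuses its ID and adds a second history record.
run_feature_process(rows, ["total_spend"], process_id=pid)
```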
Dataset
| Definition | A structured collection of data created by selecting and organizing specific feature values related to an entity, serving as a snapshot for analysis or model training. |
| Teradata Context | Dataset is created by joining different feature values with corresponding entities, providing a consolidated view for ML workflows. |
| Teradata Component | teradataml.Dataset, a class that represents Dataset. |
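
A dataset as a point-in-time snapshot can be illustrated with a small plain-Python sketch. It assumes each feature value carries the date from which it is valid; the tuple layout, the `build_dataset` helper, and the sample values are hypothetical and do not reflect how teradataml stores or joins temporal data.

```python
from datetime import date

# Assumed layout: (entity id, feature name, valid-from date, value).
feature_values = [
    (1, "total_spend", date(2024, 1, 1), 100.0),
    (1, "total_spend", date(2024, 3, 1), 150.0),
    (1, "visits", date(2024, 1, 1), 4),
    (2, "total_spend", date(2024, 2, 1), 80.0),
]


def build_dataset(entity_ids, feature_names, as_of):
    """Join each entity with the latest value of each feature known at as_of."""
    dataset = []
    for entity_id in entity_ids:
        row = {"customer_id": entity_id}
        for name in feature_names:
            candidates = [
                (valid_from, value)
                for eid, fname, valid_from, value in feature_values
                if eid == entity_id and fname == name and valid_from <= as_of
            ]
            # Take the most recent value at or before as_of; None if none existed yet.
            row[name] = max(candidates, key=lambda c: c[0])[1] if candidates else None
        dataset.append(row)
    return dataset


snapshot = build_dataset([1, 2], ["total_spend", "visits"], date(2024, 2, 15))
```

Values that only become valid after the snapshot date are excluded, which keeps information from the future out of training data.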
Dataset Catalog
| Definition | A centralized repository that provides metadata and details about datasets created within the feature store, enabling dataset discovery and management. |
| Teradata Component | teradataml.DatasetCatalog, a class that represents DatasetCatalog. |
Repository (Repo)
| Definition | A physical namespace that isolates and organizes feature definitions, entities, feature groups, and catalogs, allowing multiple teams to work independently. |
| Teradata Context | Repository is implemented as a database or schema that contains all feature store objects. |
Data Domain
| Definition | A logical grouping inside a repository that organizes data, entities, and features within a specific business context, preventing naming conflicts and ensuring clarity. |
| Teradata Context | Data Domain acts as a namespace within a repository to organize features by business area or use case. |
| Examples | Banking System
|
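
How a data domain prevents naming conflicts inside a repository can be sketched by keying every feature on the pair (data domain, feature name). The `Repository` class below is a hypothetical illustration, not a teradataml object, and the repository name, domain names, and feature definitions are invented for the example.

```python
class Repository:
    """Hypothetical repo that namespaces feature definitions by data domain."""

    def __init__(self, name):
        self.name = name
        self._features = {}  # (data domain, feature name) -> definition

    def register(self, data_domain, feature_name, definition):
        key = (data_domain, feature_name)
        # The same feature name may exist in several domains, but only once per domain.
        if key in self._features:
            raise ValueError(f"{feature_name!r} already exists in domain {data_domain!r}")
        self._features[key] = definition

    def get(self, data_domain, feature_name):
        return self._features[(data_domain, feature_name)]


repo = Repository("analytics_repo")
# Two teams use the same feature name without colliding.
repo.register("sales", "customer_score", "score derived from purchase history")
repo.register("marketing", "customer_score", "score derived from campaign response")
```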