Feature Store Components - Teradata Package for Python

Teradata® Package for Python User Guide

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Teradata Package for Python
Release Number: 20.00
Published: December 2024
Last Edition: 2025-01-23
Product Category: Teradata Vantage

The Feature Store contains the following components:
  • Feature
  • Entity
  • Data Source
  • Feature Group
  • Feature Store

Feature

The Feature component is synonymous with the features used in machine learning (ML) models. This component stores all the details of a Feature and is required for building an ML model. The following details can be stored in a Feature:
  • Name of column from the table containing the data
  • Data type (e.g., string, number, date)
  • Whether it is of type Categorical or Continuous

You can also add a description and tags.
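The details listed above can be pictured as a small record. The following is a minimal sketch in plain Python, not the teradataml Feature class; the names FeatureRecord, FeatureKind, and all field names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class FeatureKind(Enum):
    """Whether a feature is Categorical or Continuous."""
    CATEGORICAL = "categorical"
    CONTINUOUS = "continuous"


@dataclass
class FeatureRecord:
    # Hypothetical stand-in for illustration; not the teradataml API.
    name: str                          # name used to identify the feature
    column_name: str                   # column in the table that holds the data
    data_type: str                     # e.g. "string", "number", "date"
    kind: FeatureKind                  # Categorical or Continuous
    description: Optional[str] = None  # optional free-text description
    tags: List[str] = field(default_factory=list)  # optional tags


bmi = FeatureRecord(
    name="bmi",
    column_name="BMI",
    data_type="number",
    kind=FeatureKind.CONTINUOUS,
    description="Body mass index of the patient",
    tags=["health"],
)
print(bmi.name, bmi.kind.value)
```

The description and tags default to empty values because the text treats them as optional additions.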

Entity

Every dataset holds one or more identifiers that make each row unique. For example, a weekly retail store sales dataset uses the week number to identify the details for a week, while a telephone dataset needs the area code together with the local telephone number to identify a telephone number uniquely. The dataset in the first example has one entity to represent a unique record, and the dataset in the second example has two entities to represent a unique record.

An Entity in Feature Store serves the same purpose: it is identified by a name and stores the names of the columns that uniquely identify each row of data.
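To make the telephone example concrete, the sketch below checks whether a set of columns identifies each row uniquely. It is plain Python written for illustration only; EntityRecord, is_unique, and the sample phone numbers are hypothetical and are not part of teradataml.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple


@dataclass(frozen=True)
class EntityRecord:
    # Hypothetical stand-in for illustration; not the teradataml API.
    name: str
    columns: Tuple[str, ...]  # columns that together identify a row


def is_unique(rows: Sequence[Dict[str, object]], entity: EntityRecord) -> bool:
    """Return True if the entity columns identify every row uniquely."""
    keys = [tuple(row[col] for col in entity.columns) for row in rows]
    return len(keys) == len(set(keys))


# Made-up sample data: the same local number appears under two area codes.
phones = [
    {"area_code": "212", "local_number": "5550100"},
    {"area_code": "646", "local_number": "5550100"},
    {"area_code": "212", "local_number": "5550101"},
]

phone_entity = EntityRecord("phone", ("area_code", "local_number"))
local_only = EntityRecord("local_only", ("local_number",))

print(is_unique(phones, phone_entity))  # area code + local number
print(is_unique(phones, local_only))    # local number alone
```

With this sample data, the two-column entity identifies every row, while the local number alone does not.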

Data Source

This component stores the source of the data. Teradata EFS always treats a SQL query as the data source. A data source is identified by a name and holds the corresponding SQL query.

The Data Source component can also store the name of the column that indicates when each record was created. In the following patient profile example, the Recorded Time column contains the date and time at which the patient readings were taken.

Patient ID  Recorded Time               Pregnancies  Age  BMI
61          2024-04-10 11:10:59.000000  8            39   33
0           2024-04-11 11:10:58.000000  6            50   34
40          2024-04-13 11:10:58.000000  3            26   34
99          2024-04-12 11:10:55.000000  1            31   50

Data Source stores this column name along with the source SQL. This information is helpful when you want to build a model on historical data. For example, you can retrieve the records of patients who took the test between April 10 and April 12 and feed that dataset to your ML model.
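The idea of a named SQL source plus a timestamp column can be sketched with the standard-library sqlite3 module standing in for Vantage. This is an illustration under stated assumptions, not the teradataml API: DataSourceRecord, rows_between, the table name patient_profile, and the column names are all hypothetical. The rows are the ones from the patient profile table above.

```python
import sqlite3
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class DataSourceRecord:
    # Hypothetical stand-in for illustration; not the teradataml API.
    name: str                               # name that identifies the source
    sql: str                                # SQL that produces the data
    timestamp_column: Optional[str] = None  # column recording creation time


def rows_between(conn: sqlite3.Connection, source: DataSourceRecord,
                 start: str, end_exclusive: str) -> List[Tuple]:
    """Return source rows with start <= timestamp < end_exclusive."""
    if source.timestamp_column is None:
        raise ValueError("data source has no timestamp column")
    ts = source.timestamp_column
    query = (
        f"SELECT * FROM ({source.sql}) AS src "
        f"WHERE {ts} >= ? AND {ts} < ? ORDER BY {ts}"
    )
    return conn.execute(query, (start, end_exclusive)).fetchall()


# In-memory SQLite database used only as a stand-in for the real system.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patient_profile (patient_id INTEGER, recorded_time TEXT, "
    "pregnancies INTEGER, age INTEGER, bmi INTEGER)"
)
conn.executemany(
    "INSERT INTO patient_profile VALUES (?, ?, ?, ?, ?)",
    [
        (61, "2024-04-10 11:10:59.000000", 8, 39, 33),
        (0, "2024-04-11 11:10:58.000000", 6, 50, 34),
        (40, "2024-04-13 11:10:58.000000", 3, 26, 34),
        (99, "2024-04-12 11:10:55.000000", 1, 31, 50),
    ],
)

source = DataSourceRecord(
    name="patient_profile_source",
    sql="SELECT * FROM patient_profile",
    timestamp_column="recorded_time",
)

# April 10 through the end of April 12 (upper bound is exclusive).
rows = rows_between(conn, source, "2024-04-10 00:00:00", "2024-04-13 00:00:00")
patient_ids = [row[0] for row in rows]
print(patient_ids)
```

The half-open interval is a design choice in this sketch: an exclusive upper bound of April 13 keeps every reading taken on April 12 without depending on the timestamp precision.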

Feature Group

A Feature Group is identified by a name and logically combines Features, Entities, and Data Sources. A group requires at least one Feature, one Entity, and one Data Source. Each of these components can be part of multiple Feature Groups.

You can always add or remove Features, Entities, and Data Sources from a Feature Group.

Because a Feature Group stores the details of its Features, Entities, and Data Sources, you can always access the details of the individual components through the Feature Group. For example, you can access the description of a Data Source from the Feature Group.
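The grouping described here can be sketched as a container that holds its components and exposes their details. The code below is plain Python for illustration, not the teradataml FeatureGroup class; FeatureGroupRecord, DataSourceRecord, and their methods are hypothetical. For brevity the sketch keeps a single Data Source per group, and refusing to remove the last Feature is an assumption drawn from the "at least one Feature" rule.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class DataSourceRecord:
    # Hypothetical stand-in for illustration; not the teradataml API.
    name: str
    sql: str
    description: str = ""


@dataclass
class FeatureGroupRecord:
    # Hypothetical stand-in for illustration; not the teradataml API.
    name: str
    features: Dict[str, str]          # feature name -> source column name
    entity_columns: Tuple[str, ...]   # columns that identify a unique row
    data_source: DataSourceRecord

    def add_feature(self, name: str, column: str) -> None:
        self.features[name] = column

    def remove_feature(self, name: str) -> None:
        # Assumption: a group must keep at least one Feature.
        if name in self.features and len(self.features) == 1:
            raise ValueError("a Feature Group needs at least one Feature")
        del self.features[name]


group = FeatureGroupRecord(
    name="patient_features",
    features={"age": "Age"},
    entity_columns=("patient_id",),
    data_source=DataSourceRecord(
        name="patient_profile_source",
        sql="SELECT * FROM patient_profile",
        description="Patient readings with a recorded time",
    ),
)

group.add_feature("bmi", "BMI")
group.remove_feature("age")

# Component details are reachable through the group.
print(group.data_source.description)
print(sorted(group.features))
```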

You may need to build an ML model from two or more different Feature Groups. In such cases, you can combine the groups to form a new group, and then use the new group to feed your ML model.

Combining multiple Feature Groups creates a new Feature Group that includes all of the combined Features and a Data Source that has data for all of those Features. See "Combining one or more Feature Groups into a single FeatureGroup" for prerequisites and an example.

Feature Store

The Feature Store component stores Features, Entities, Data Sources, and Feature Groups in a database called repo.

You can retrieve any of these components from Feature Store and use them in your ML model.

Prerequisites for using Feature Store: