TD_ScaleTransform Function | Teradata Vantage

Teradata® VantageCloud Lake

Deployment: VantageCloud
Edition: Lake
Product: Teradata Vantage
Published: January 2023
Language: English (United States)
Last Update: 2024-04-03

TD_ScaleTransform scales specified input table columns, using TD_ScaleFit output. TD_ScaleTransform accepts input data in dense and sparse format.

ScaleFitTransform is a data preprocessing technique that applies various scaling methods to transform the variables in a dataset to have a specific scale or range of values suitable for analysis or modeling. The technique involves fitting the scaler on a training dataset and then using the same scaler to transform the variables in the test dataset.

The ScaleFitTransform technique involves two main steps:
  1. Fit the scaler: The scaler is fitted on the training dataset to learn the scaling parameters, such as the minimum and maximum values or mean and standard deviation, depending on the chosen scaling method.
  2. Transform the data: Use the fitted scaler to transform the variables in both the training and test datasets. This ensures that the same scaling is applied to both datasets, preventing any bias or inconsistencies in the analysis or modeling.
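The two steps above can be sketched outside the database in plain Python. This is only an illustration of the fit-then-transform idea, not the TD_ScaleFit or TD_ScaleTransform implementation: the function names `fit_scaler` and `transform`, the method labels `"minmax"` and `"standard"`, and the use of the population standard deviation are all assumptions made for this sketch.

```python
from statistics import mean, pstdev


def fit_scaler(values, method="minmax"):
    """Step 1: learn scaling parameters from the training values only."""
    if method == "minmax":
        lo, hi = min(values), max(values)
        return {"location": lo, "scale": hi - lo}
    if method == "standard":
        # Assumption: population standard deviation; the real function may differ.
        return {"location": mean(values), "scale": pstdev(values)}
    raise ValueError(f"unknown method: {method}")


def transform(values, params):
    """Step 2: apply previously fitted parameters to any dataset."""
    # Guard against a constant column, where the fitted scale is zero.
    scale = params["scale"] or 1.0
    return [(v - params["location"]) / scale for v in values]


train = [10.0, 20.0, 30.0, 40.0]
test = [15.0, 50.0]

params = fit_scaler(train, method="minmax")  # fitted on training data only
train_scaled = transform(train, params)
test_scaled = transform(test, params)        # same parameters reused

print(train_scaled)
print(test_scaled)
```

Because the parameters come only from the training data, a test value outside the training range (here 50.0) maps outside [0, 1]. That is expected: the test data is scaled consistently with the training data rather than being rescaled on its own.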

Some commonly used scaling methods in ScaleFitTransform include Min-Max scaling, Standardization, Logarithmic scaling, Power transformation, and Robust scaling. The choice of scaling method depends on the nature of the data and the analysis or modeling goals.
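To make the difference between these methods concrete, the following Python sketch applies four of them to one small sample that contains an outlier. The helper names and the sample data are hypothetical, the formulas are the common textbook definitions, and nothing here is taken from the Teradata implementation; power transformation is omitted for brevity.

```python
import math
from statistics import mean, median, pstdev, quantiles

# Hypothetical sample with one large outlier.
data = [1.0, 2.0, 3.0, 4.0, 100.0]


def min_max(xs):
    """Rescale to [0, 1] using the observed minimum and maximum."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def standardize(xs):
    """Center on the mean and divide by the population standard deviation."""
    mu, sigma = mean(xs), pstdev(xs)
    return [(x - mu) / sigma for x in xs]


def robust(xs):
    """Center on the median and divide by the interquartile range."""
    q1, _, q3 = quantiles(xs, n=4)
    med = median(xs)
    return [(x - med) / (q3 - q1) for x in xs]


def log_scale(xs):
    """Natural logarithm; valid only for strictly positive values."""
    return [math.log(x) for x in xs]


results = {
    "min_max": min_max(data),
    "standardize": standardize(data),
    "robust": robust(data),
    "log": log_scale(data),
}

for name, scaled in results.items():
    print(name, [round(v, 3) for v in scaled])
```

With an outlier present, Min-Max scaling compresses the remaining values toward 0, while robust scaling keeps them spread around the median. This is one reason the choice of method depends on the data.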

Overall, ScaleFitTransform is an important preprocessing step that can improve the performance of machine learning models and reduce bias in statistical analysis.

Usage Considerations

The following are usage considerations for the TD_ScaleTransform function:

ON clause

  • The InputTable in the TD_ScaleTransform query can have no partitioning at all, or it can use one of the following combinations of PARTITION BY/ORDER BY clauses:
    • PARTITION BY ANY ORDER BY
    • PARTITION BY ANY
    • PARTITION BY KEY ORDER BY
    • PARTITION BY KEY
  • FitTable must be either a DIMENSION table, with an optional ORDER BY clause, or a table specified with PARTITION BY KEY.
  • None of the arguments are mandatory for the TD_ScaleTransform function.
  • If the InputTable uses PARTITION BY ANY syntax, but FitTable is not specified as DIMENSION, then an error is reported. For example: *** Failure 9850 Invalid input to table operator: There can be only one input with no partitioning attributes / PARTITION BY ANY/ LOCAL ORDER BY clause.

Limits and Restrictions

  • This function requires the UTF8 client character set for UNICODE data.
  • This function does not support Pass Through Characters (PTCs), nor does it support KanjiSJIS or Graphic data types.