TD_ScaleTransform Function | ScaleTransform | Teradata Vantage - TD_ScaleTransform - Analytics Database

Database Analytic Functions

Deployment
VantageCloud
VantageCore
Edition
Enterprise
IntelliFlex
VMware
Product
Analytics Database
Release Number
17.20
Published
June 2022
Language
English (United States)
Last Update
2024-04-06
Product Category
Teradata Vantage™

TD_ScaleTransform scales specified input table columns, using TD_ScaleFit output. TD_ScaleTransform accepts input data in dense and sparse format.

ScaleFitTransform is a data preprocessing technique that applies a scaling method to bring the variables in a dataset to a specific scale or range of values suitable for analysis or modeling. The scaler is fitted on a training dataset, and the same fitted scaler is then used to transform the variables in the test dataset.

The ScaleFitTransform technique involves two main steps:
  1. Fit the scaler: The scaler is fitted on the training dataset to learn the scaling parameters, such as the minimum and maximum values or mean and standard deviation, depending on the chosen scaling method.
  2. Transform the data: Use the fitted scaler to transform the variables in both the training and test datasets. This ensures that the same scaling is applied to both datasets, preventing any bias or inconsistencies in the analysis or modeling.
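
The two steps above can be sketched in a few lines of Python. This is only an illustration of the fit-then-transform idea using Min-Max scaling; the function names `fit_min_max` and `transform_min_max` and the sample values are hypothetical and are not part of TD_ScaleFit or TD_ScaleTransform.

```python
def fit_min_max(train_values):
    """Step 1: learn the scaling parameters (minimum and maximum) from training data."""
    return {"min": min(train_values), "max": max(train_values)}


def transform_min_max(values, params):
    """Step 2: scale values relative to the fitted range (training data maps to [0, 1])."""
    span = params["max"] - params["min"]
    if span == 0:
        # Constant training column: return 0.0 for every value to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - params["min"]) / span for v in values]


# Hypothetical sample data.
train = [10.0, 20.0, 30.0]
test = [15.0, 40.0]

params = fit_min_max(train)                      # fit on the training data only
train_scaled = transform_min_max(train, params)  # transform the training data
test_scaled = transform_min_max(test, params)    # reuse the same parameters for the test data
```

Because the parameters come only from the training data, a test value outside the training range (40.0 here) is scaled to a value outside [0, 1]. That is the expected result of applying the same scaling to both datasets.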

Some commonly used scaling methods in ScaleFitTransform include Min-Max scaling, Standardization, Logarithmic scaling, Power transformation, and Robust scaling. The choice of scaling method depends on the nature of the data and the analysis or modeling goals.
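
As a rough illustration of how the choice of method depends on the data, the following Python sketch applies generic textbook forms of Standardization, Robust scaling, and Logarithmic scaling to a column that contains an outlier. The helper names and sample values are hypothetical, and the formulas are assumptions for illustration only; they are not the definitions that TD_ScaleFit uses.

```python
import math
import statistics


def standardize(values, mean, std):
    """Standardization: center on the mean, divide by the standard deviation."""
    return [(v - mean) / std for v in values]


def robust_scale(values, median, iqr):
    """Robust scaling: center on the median, divide by the interquartile range."""
    return [(v - median) / iqr for v in values]


def log_scale(values):
    """Logarithmic scaling: natural log of each value (values must be positive)."""
    return [math.log(v) for v in values]


# Hypothetical training column with one large outlier.
train = [1.0, 2.0, 3.0, 4.0, 100.0]

# Parameters are learned from the training data.
mean = statistics.mean(train)
std = statistics.pstdev(train)
median = statistics.median(train)
q1, _, q3 = statistics.quantiles(train, n=4)

standardized = standardize(train, mean, std)
robust = robust_scale(train, median, q3 - q1)
logged = log_scale(train)
```

In this sample the outlier pulls the mean to 22.0 while the median stays at 3.0, which is why a median-based method is often preferred for data with extreme values.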

Overall, ScaleFitTransform is an essential preprocessing step that can improve the performance of machine learning models and reduce bias in statistical analysis.

Usage Considerations

The following are usage considerations for the TD_ScaleTransform function:

ON clause

  • The InputTable in the TD_ScaleTransform query can have no partitioning at all, or it can use one of the following combinations of PARTITION BY/ORDER BY clauses:
    • PARTITION BY ANY ORDER BY
    • PARTITION BY ANY
    • PARTITION BY KEY ORDER BY
    • PARTITION BY KEY
  • FitTable must either be a DIMENSION table, with an optional ORDER BY clause, or use PARTITION BY KEY.
  • None of the arguments are mandatory for the TD_ScaleTransform function.
  • If the InputTable uses PARTITION BY ANY syntax, but FitTable is not specified as DIMENSION, then an error is reported. For example: *** Failure 9850 Invalid input to table operator: There can be only one input with no partitioning attributes / PARTITION BY ANY/ LOCAL ORDER BY clause.

Limits and Restrictions

  • This function requires the UTF8 client character set for UNICODE data.
  • This function does not support Pass Through Characters (PTCs), KanjiSJIS, or Graphic data types.