7.00.02 - Normalizing the Input Variables - Aster Analytics

Teradata Aster® Analytics Foundation User Guide Update 2

Product: Aster Analytics
Release Number: 7.00.02
Release Date: September 2017
Content Type: Programming Reference, User Guide
Publication ID: B700-1022-700K
Language: English (United States)

Normalize the input variables before passing them to the PCA function. Otherwise, any variables whose variance is much larger than that of the others dominate the first few principal components.
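
The effect of unequal variances can be sketched outside the database. The following Python snippet is an illustration only: the column names and values are hypothetical and are not taken from patient_pca_input. It computes each column's share of the total variance. Because the total variance equals the sum of the eigenvalues of the covariance matrix, a column that holds almost all of the variance will largely determine the leading principal components.

```python
from statistics import pvariance

# Hypothetical columns on very different scales (not from patient_pca_input).
columns = {
    "age_years": [34.0, 51.0, 47.0, 62.0, 29.0],
    "platelets_per_ul": [210000.0, 340000.0, 180000.0, 295000.0, 260000.0],
}


def variance_shares(cols):
    """Return each column's fraction of the total (summed) variance."""
    variances = {name: pvariance(values) for name, values in cols.items()}
    total = sum(variances.values())
    return {name: var / total for name, var in variances.items()}


shares = variance_shares(columns)
for name, share in shares.items():
    # The large-scale column accounts for nearly all of the variance.
    print(f"{name}: {share:.8f}")
```

In this sketch the second column accounts for almost the entire variance purely because of its units, which is the situation that normalization is meant to prevent.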

This query uses the Scale and ScaleMap functions to normalize each input variable in the same patient_pca_input data set by subtracting its mean and then dividing by its standard deviation:

CREATE DIMENSION TABLE pca_scaled AS 
SELECT * FROM Scale (
  ON patient_pca_input AS INPUT PARTITION BY ANY
  ON (SELECT * FROM ScaleMap (
    ON patient_pca_input
    InputColumns ('[1:8]')
    MissValue ('omit'))
  ) AS statistic DIMENSION
  Method ('std')
  Accumulate ('pid')
) ORDER BY pid;
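
For readers who want to check the arithmetic, the subtract-the-mean, divide-by-the-standard-deviation step can be sketched in plain Python. This is a minimal illustration, not the implementation of Scale or ScaleMap. The function name standardize is hypothetical, the use of the population standard deviation (dividing by N) is an assumption, and skipping None entries only approximates the idea of omitting missing values; the actual functions may handle both points differently.

```python
from math import sqrt


def standardize(values):
    """Z-score one column: subtract the mean, divide by the standard deviation.

    None entries are ignored when computing the statistics and are left
    as None in the output (an assumption made for this sketch).
    """
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("column has no non-missing values")
    mean = sum(present) / len(present)
    # Population standard deviation (divide by N); this choice is an assumption.
    std = sqrt(sum((v - mean) ** 2 for v in present) / len(present))
    if std == 0:
        raise ValueError("constant column cannot be standardized")
    return [None if v is None else (v - mean) / std for v in values]


# Hypothetical column with one missing value.
scaled = standardize([2.0, 4.0, None, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(scaled)
```

After this transformation every column has mean 0 and standard deviation 1 over its non-missing values, so no single variable dominates the principal components because of its units alone.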