Normalizing the Input Variables - Aster Analytics

Teradata Aster® Analytics Foundation User Guide Update 2

Product: Aster Analytics
Release Number: 7.00.02
Published: September 2017
Language: English (United States)
Last Update: 2018-04-17
dita:id: B700-1022
lifecycle: previous
Product Category: Software

Normalize the input variables before passing them to the PCA function. PCA looks for the directions of greatest variance, so if some variables have much larger variance than others (for example, because they are measured in larger units), those variables dominate the first few principal components.
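The effect can be illustrated outside Aster with a minimal Python sketch. The two variables below (height_m and weight_g) are made-up values chosen only for illustration and are not part of the patient_pca_input data set. The helper functions are hypothetical, and the sketch assumes the sample covariance and a two-variable PCA solved in closed form.

```python
from math import hypot, sqrt
from statistics import mean, stdev


def covariance(xs, ys):
    """Sample covariance of two equally long lists."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def first_component(xs, ys):
    """Unit-length first principal component of two variables.

    Uses the closed-form largest eigenvalue of the 2x2 covariance matrix.
    The eigenvector formula assumes the covariance of xs and ys is nonzero.
    """
    a, b, c = covariance(xs, xs), covariance(xs, ys), covariance(ys, ys)
    lam = (a + c) / 2 + sqrt(((a - c) / 2) ** 2 + b ** 2)
    vx, vy = b, lam - a
    norm = hypot(vx, vy)
    return vx / norm, vy / norm


def standardize(xs):
    """Subtract the mean and divide by the sample standard deviation."""
    mu, sigma = mean(xs), stdev(xs)
    return [(x - mu) / sigma for x in xs]


# Hypothetical data: height in meters (small variance), weight in grams (large variance).
height_m = [1.5, 1.6, 1.7, 1.8, 1.9]
weight_g = [50000, 62000, 58000, 75000, 80000]

raw_pc = first_component(height_m, weight_g)
std_pc = first_component(standardize(height_m), standardize(weight_g))

print("raw loadings:         ", raw_pc)  # weight loading is nearly 1
print("standardized loadings:", std_pc)  # both variables contribute equally
```

On the raw values, the first component points almost entirely along weight_g, simply because grams produce a far larger variance than meters. After standardization, both variables have unit variance and contribute equally.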

The following query normalizes each input variable in the same patient_pca_input data set by subtracting its mean and then dividing by its standard deviation. The ScaleMap function computes the statistics for the input columns, and the Scale function uses those statistics to scale the data:

CREATE DIMENSION TABLE pca_scaled AS 
SELECT * FROM Scale (
  ON patient_pca_input AS INPUT PARTITION BY ANY
  ON (SELECT * FROM ScaleMap (
    ON patient_pca_input
    InputColumns ('[1:8]')
    MissValue ('omit'))
  ) AS statistic DIMENSION
  Method ('std')
  Accumulate ('pid')
) ORDER BY pid;
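For readers who want to check the arithmetic, the transformation described above (subtract the mean, then divide by the standard deviation) can be sketched in Python. This is not how the Scale function is implemented. The column name and values are hypothetical, and the use of the sample standard deviation is an assumption, because the text above does not say which form the function uses.

```python
from statistics import mean, stdev


def standardize(values):
    """Return (value - mean) / standard deviation for each value.

    Assumption: sample standard deviation (n - 1 denominator).
    A constant column has zero standard deviation and is not handled here.
    """
    mu = mean(values)
    sigma = stdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical input column, not taken from patient_pca_input.
blood_pressure = [110.0, 120.0, 130.0, 140.0, 150.0]

scaled = standardize(blood_pressure)
print(scaled)
print("mean:", mean(scaled), "stdev:", stdev(scaled))
```

The scaled column has a mean of approximately 0 and a standard deviation of approximately 1, which puts every input variable on a comparable scale before PCA.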