1.1 - 8.10 - Scale Example: ScaleMethod ('midrange') - Teradata Vantage

Teradata Vantage™ - Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
1.1
8.10
Release Date
October 2019
Content Type
Programming Reference
Publication ID
B700-4003-079K
Language
English (United States)

This example scales (normalizes) input data using the midrange method and the default values for the syntax elements Intercept and Multiplier (0 and 1, respectively).

Input

  • InputTable: scale_housing, which has data about houses

    The type column has categorical data; the other columns have numerical data. The id column identifies the rows. The table has some NULL values.

  • Statistic: created by calling the ScaleMap function inside the Scale call
InputTable: scale_housing
type id price lotsize bedrooms bathrms stories
classic 1 42000 5850 3 1 2
classic 2   4000 2 1 1
classic 3 49500 3060 3 1 1
classic 4 60500 6650 3 1 2
classic 5 61000 6360 2 1 1
bungalow 6 66000 4160 2 1 1
bungalow 7 66000 3880   2 2
bungalow 8 69000 4160 3 1 3
bungalow 9 83800 4800 3 1 1
bungalow 10 88500 5500 3 2 4

SQL Call

SELECT * FROM Scale (
  ON ScaleMap (
  ON scale_housing
  USING
  TargetColumns ('[2:6]')
  MissValue ('omit')
  ) AS statistic DIMENSION
  ON scale_housing AS InputTable PARTITION BY ANY
  USING
  ScaleMethod ('midrange')
  Accumulate ('id')
) AS dt ORDER BY id,price,lotsize;

Output

The values in the Output Table are calculated as follows. As shown in the descriptions of the syntax elements Multiplier and Intercept, the formula for computing the scaled value X' from the input value X is:

X' = intercept + multiplier * (X - location)/scale

In this example, ScaleMethod is “midrange”, so location and scale are calculated as shown (see Location and Scale for Statistical Methods):

  • location = (maxX+minX)/2
  • scale = (maxX-minX)/2

In the equations above, maxX and minX are the maximum and minimum values of column X respectively.

For example, consider row 1 of the price column in InputTable and the following output table:
  • The value in the input table is 49500.
  • In this example, intercept and multiplier take their default values, 0 and 1 respectively.
  • maxX and minX are the maximum and minimum values in the Price column of the input table. These values are 88500 and 42000 respectively.
  • Location and scale are calculated using the equations shown above:
    • location = (88500+42000)/2=65250
    • scale = (88500-42000)/2=23250
  • The scaled value shown in the output table is calculated as follows:

    X'=0 + 1 * (42000-65250)/23250 = -0.67741935

 id price                lotsize              bedrooms bathrms stories             
 -- -------------------- -------------------- -------- ------- ------------------- 
  1                 -1.0   0.5543175487465181      1.0    -1.0 -0.3333333333333333
  3  -0.6774193548387096                 -1.0      1.0    -1.0                -1.0
  4 -0.20430107526881722                  1.0      1.0    -1.0 -0.3333333333333333
  5  -0.1827956989247312   0.8384401114206128     -1.0    -1.0                -1.0
  6  0.03225806451612903  -0.3871866295264624      1.0    -1.0                -1.0
  8  0.16129032258064516  -0.3871866295264624      1.0    -1.0  0.3333333333333333
  9   0.7978494623655914 -0.03064066852367688      1.0    -1.0                -1.0
 10                  1.0   0.3593314763231198      1.0     1.0                 1.0

Download a zip file of all examples and a SQL script file that creates their input tables from the attachment in the left sidebar.