Scale Example 1: ScaleMethod ('midrange') - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
8.00
1.0
Published
May 2019
Language
English (United States)
Last Update
2019-11-22
dita:mapPath
blj1506016597986.ditamap
dita:ditavalPath
blj1506016597986.ditaval
dita:id
B700-4003
lifecycle
previous
Product Category
Teradata Vantageā„¢

This example scales (normalizes) input data using the midrange method and the default values for the arguments Intercept and Multiplier (0 and 1, respectively).

Input

  • input: scale_housing, which has data about houses

    The types column has categorical data; the other columns have numerical data. The id column identifies the rows. The table has some NULL values.

  • statistic: created by calling the ScaleMap function inside the Scale call
input: scale_housing
types id price lotsize bedrooms bathrms stories
classic 1 42000 5850 3 1 2
classic 2   4000 2 1 1
classic 3 49500 3060 3 1 1
classic 4 60500 6650 3 1 2
classic 5 61000 6360 2 1 1
bungalow 6 66000 4160 2 1 1
bungalow 7 66000 3880   2 2
bungalow 8 69000 4160 3 1 3
bungalow 9 83800 4800 3 1 1
bungalow 10 88500 5500 3 2 4

SQL Call

SELECT * FROM Scale (
  ON ScaleMap (
    ON scale_housing
    USING
    TargetColumns ('[2:6]')
    MissValue ('omit')
  ) AS statistic DIMENSION
  ON scale_housing AS "input" PARTITION BY ANY
  USING
  ScaleMethod ('midrange')
  Accumulate ('id')
) AS dt ORDER BY id, price, lotsize;

Output

The output table has the midrange-scaled values for the input data set. As explained in the descriptions of the arguments Multiplier and Intercept, this is the formula for computing the scaled value X' from the input value X:

X' = intercept + multiplier * (X - location)/scale

These are the formulas for computing location and scale for the midrange method (from Scale Arguments):

  • location = (maxX+minX)/2
  • scale = (maxX-minX)/2

The values minX and maxX are the minimum and maximum values of X, respectively.

For example, consider row 1 of the price column in the input table and the following output table:

Item Value
intercept 0 (default)
multiplier 0 (default)
Input value X 42000
Minimum input price value minX 42000
Maximum input price value maxX 88500
location (88500+42000)/2 = 65250
scale (88500-42000)/2 = 23250
Scaled output value X' 0 + 1 * (42000-65250)/23250 = -1
Output Table
id price lotsize bedrooms bathrms stories
1 -1 0.554317548746518 1 -1 -0.333333333333333
3 -0.67741935483871 1 1 -1 -1
4 -0.204301075268817 1 1 -1 0.333333333333333
5 -0.182795698924731 0.838440111420613 -1 -1 -1
6 0.032258064516129 -0.387186629526462 1 -1 -1
8 0.161290322580645 -0.387186629526462 1 -1 -0.333333333333333
9 0.797849462365591 -0.0306406685236769 1 -1 -1
10 1 0.35933147632312 1 1 1
1 -1 0.554317548746518 1 -1 -0.333333333333333
3 -0.67741935483871 1 1 -1 -1