This example scales (normalizes) input data using the midrange method and the default values for the arguments Intercept and Multiplier (0 and 1, respectively).
Input
- input: scale_housing, which has data about houses
The types column has categorical data; the other columns have numerical data. The id column identifies the rows. The table has some NULL values.
- statistic: created by calling the ScaleMap function inside the Scale call
types | id | price | lotsize | bedrooms | bathrms | stories |
---|---|---|---|---|---|---|
classic | 1 | 42000 | 5850 | 3 | 1 | 2 |
classic | 2 | 4000 | 2 | 1 | 1 | |
classic | 3 | 49500 | 3060 | 3 | 1 | 1 |
classic | 4 | 60500 | 6650 | 3 | 1 | 2 |
classic | 5 | 61000 | 6360 | 2 | 1 | 1 |
bungalow | 6 | 66000 | 4160 | 2 | 1 | 1 |
bungalow | 7 | 66000 | 3880 | 2 | 2 | |
bungalow | 8 | 69000 | 4160 | 3 | 1 | 3 |
bungalow | 9 | 83800 | 4800 | 3 | 1 | 1 |
bungalow | 10 | 88500 | 5500 | 3 | 2 | 4 |
SQL Call
SELECT * FROM Scale ( ON ScaleMap ( ON scale_housing USING TargetColumns ('[2:6]') MissValue ('omit') ) AS statistic DIMENSION ON scale_housing AS "input" PARTITION BY ANY USING ScaleMethod ('midrange') Accumulate ('id') ) AS dt ORDER BY id, price, lotsize;
Output
The output table has the midrange-scaled values for the input data set. As explained in the descriptions of the arguments Multiplier and Intercept, this is the formula for computing the scaled value X' from the input value X:
X' = intercept + multiplier * (X - location)/scale
These are the formulas for computing location and scale for the midrange method (from Scale Arguments):
- location = (maxX+minX)/2
- scale = (maxX-minX)/2
The values minX and maxX are the minimum and maximum values of X, respectively.
For example, consider row 1 of the price column in the input table and the following output table:
Item | Value |
---|---|
intercept | 0 (default) |
multiplier | 0 (default) |
Input value X | 42000 |
Minimum input price value minX | 42000 |
Maximum input price value maxX | 88500 |
location | (88500+42000)/2 = 65250 |
scale | (88500-42000)/2 = 23250 |
Scaled output value X' | 0 + 1 * (42000-65250)/23250 = -1 |
id | price | lotsize | bedrooms | bathrms | stories |
---|---|---|---|---|---|
1 | -1 | 0.554317548746518 | 1 | -1 | -0.333333333333333 |
3 | -0.67741935483871 | 1 | 1 | -1 | -1 |
4 | -0.204301075268817 | 1 | 1 | -1 | 0.333333333333333 |
5 | -0.182795698924731 | 0.838440111420613 | -1 | -1 | -1 |
6 | 0.032258064516129 | -0.387186629526462 | 1 | -1 | -1 |
8 | 0.161290322580645 | -0.387186629526462 | 1 | -1 | -0.333333333333333 |
9 | 0.797849462365591 | -0.0306406685236769 | 1 | -1 | -1 |
10 | 1 | 0.35933147632312 | 1 | 1 | 1 |
1 | -1 | 0.554317548746518 | 1 | -1 | -0.333333333333333 |
3 | -0.67741935483871 | 1 | 1 | -1 | -1 |