This example scales (normalizes) input data using the midrange method and the default values for the syntax elements Intercept and Multiplier (0 and 1, respectively).
Input
- InputTable: scale_housing, which has data about houses
The type column has categorical data; the other columns have numerical data. The id column identifies the rows. The table has some NULL values.
- Statistic: created by calling the ScaleMap function inside the Scale call
type | id | price | lotsize | bedrooms | bathrms | stories |
---|---|---|---|---|---|---|
classic | 1 | 42000 | 5850 | 3 | 1 | 2 |
classic | 2 | 4000 | 2 | 1 | 1 | |
classic | 3 | 49500 | 3060 | 3 | 1 | 1 |
classic | 4 | 60500 | 6650 | 3 | 1 | 2 |
classic | 5 | 61000 | 6360 | 2 | 1 | 1 |
bungalow | 6 | 66000 | 4160 | 2 | 1 | 1 |
bungalow | 7 | 66000 | 3880 | 2 | 2 | |
bungalow | 8 | 69000 | 4160 | 3 | 1 | 3 |
bungalow | 9 | 83800 | 4800 | 3 | 1 | 1 |
bungalow | 10 | 88500 | 5500 | 3 | 2 | 4 |
SQL Call
SELECT * FROM Scale ( ON ScaleMap ( ON scale_housing USING TargetColumns ('[2:6]') MissValue ('omit') ) AS statistic DIMENSION ON scale_housing AS InputTable PARTITION BY ANY USING ScaleMethod ('midrange') Accumulate ('id') ) AS dt ORDER BY id,price,lotsize;
Output
The values in the Output Table are calculated as follows. As shown in the descriptions of the syntax elements Multiplier and Intercept, the formula for computing the scaled value X' from the input value X is:
X' = intercept + multiplier * (X - location)/scale
In this example, ScaleMethod is “midrange”, so location and scale are calculated as shown (see Location and Scale for Statistical Methods):
- location = (maxX+minX)/2
- scale = (maxX-minX)/2
In the equations above, maxX and minX are the maximum and minimum values of column X respectively.
- The value in the input table is 49500.
- In this example, intercept and multiplier take their default values, 0 and 1 respectively.
- maxX and minX are the maximum and minimum values in the Price column of the input table. These values are 88500 and 42000 respectively.
- Location and scale are calculated using the equations shown above:
- location = (88500+42000)/2=65250
- scale = (88500-42000)/2=23250
- The scaled value shown in the output table is calculated as follows:
X'=0 + 1 * (42000-65250)/23250 = -0.67741935
id price lotsize bedrooms bathrms stories -- -------------------- -------------------- -------- ------- ------------------- 1 -1.0 0.5543175487465181 1.0 -1.0 -0.3333333333333333 3 -0.6774193548387096 -1.0 1.0 -1.0 -1.0 4 -0.20430107526881722 1.0 1.0 -1.0 -0.3333333333333333 5 -0.1827956989247312 0.8384401114206128 -1.0 -1.0 -1.0 6 0.03225806451612903 -0.3871866295264624 1.0 -1.0 -1.0 8 0.16129032258064516 -0.3871866295264624 1.0 -1.0 0.3333333333333333 9 0.7978494623655914 -0.03064066852367688 1.0 -1.0 -1.0 10 1.0 0.3593314763231198 1.0 1.0 1.0
Download a zip file of all examples and a SQL script file that creates their input tables from the attachment in the left sidebar.