GLMPredict_MLE Example: Gaussian Distribution Prediction | Teradata Vantage - GLMPredict_MLE Example: Gaussian Distribution Prediction - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
9.02
9.01
2.0
1.3
Published
February 2022
Language
English (United States)
Last Update
2022-02-10
dita:mapPath
rnn1580259159235.ditamap
dita:ditavalPath
ybt1582220416951.ditaval
dita:id
B700-4003
lifecycle
previous
Product Category
Teradata Vantageā„¢

The example uses the Gaussian model created in GLM Example: Gaussian Distribution Analysis to predict house prices. To evaluate the accuracy of the model, the example calculates the root mean square error (RMSE) between the known actual price from the input table and the price predicted by the model.

Input

housing_test
sn price lotsize bedrooms bathrms stories driveway recroom fullbase gashw airco garagepl prefarea homestyle
13 27000 1700 3 1 2 yes no no no no 0 no Classic
16 37900 3185 2 1 1 yes no no no yes 0 no Classic
25 42000 4960 2 1 1 yes no no no no 0 no Classic
38 67000 5170 3 1 4 yes no no no yes 0 no Eclectic
53 68000 9166 2 1 1 yes no yes no yes 2 no Eclectic
104 132000 3500 4 2 2 yes no no yes no 2 no bungalow
111 43000 5076 3 1 1 no no no no no 0 no Classic
117 93000 3760 3 1 2 yes no no yes no 2 no Eclectic
132 44500 3850 3 1 2 yes no no no no 0 no Classic
140 43000 3750 3 1 2 yes no no no no 0 no Classic
142 40000 2650 3 1 2 yes no yes no no 1 no Classic
157 60000 2953 3 1 2 yes no yes no yes 0 no Eclectic
161 63900 3162 3 1 2 yes no no no yes 1 no Eclectic
... ... ... ... ... ... ... ... ... ... ... ... ... ...

SQL Call

The canonical link specifies the default family link, which is "identity" for the Gaussian distribution.

CREATE MULTISET TABLE glmpredict_housing AS (
  SELECT * FROM GLMPredict_MLE (
    ON housing_test PARTITION BY ANY
    ON glm_housing_model AS Model DIMENSION
    USING
    Accumulate ('sn', 'price')
    Family ('GAUSSIAN')
    LinkFunction ('CANONICAL')
  ) AS dt
) WITH DATA;

Output

The fitted_value column gives the predicted home price.

SELECT * FROM glmpredict_housing ORDER BY 1;
 sn  price    fitted_value       
 --- -------- ------------------ 
  13  27000.0  37345.84477329254
  16  37900.0  43687.13245987892
  25  42000.0  40902.02870941162
  38  67000.0  72487.67201280594
  53  68000.0  79238.69493055344
 104 132000.0 111528.00744915009
 111  43000.0 39102.882046699524
 117  93000.0  66936.95215988159
 132  44500.0  41819.88732004166
 140  43000.0  41611.79231786728
 142  40000.0  44394.14731836319
 157  60000.0  66571.26562905312
 161  63900.0  64900.98411035538
 162 130000.0  107759.1224937439
 176  57500.0  73438.73871564865
 177  70000.0  62378.35326194763
 195  33000.0   37197.9376707077
 198  40500.0  47308.08242368698
 224  78500.0  67232.86958527565
 234  32500.0  35237.16528749466
 237  43000.0  46593.47125959396
 239  26000.0  43377.86666679382
 249  44500.0  37863.84167766571
 251  48500.0  45096.38719415665
 254  60000.0  74665.70030021667
 255  61000.0  60214.16523933411
 260  41000.0  43066.21673202515
 274  64900.0  67232.45221805573
 294  47000.0  38987.55468940735
 301  55000.0   55621.6930809021
 306  64000.0  67339.69788479805
 317  80000.0    65655.120470047
 329 115442.0 123612.01978111267
 339 141000.0 126282.07774448395
 340  62500.0 58474.835211753845
 353  78500.0  67485.69113445282
 355  86900.0  68425.80433177948
 364  72000.0  77422.12543773651
 367 114000.0 128556.01284217834
 377 140000.0 127201.90244436264
 401  92500.0  84040.80987596512
 403  77500.0  79857.25416207314
 408  87500.0  76218.38956928253
 411  90000.0   78179.1003665924
 440  69000.0  80549.23930311203
 441  51900.0 64670.294174194336
 443  65000.0  61704.09422302246
 459  44555.0 42818.367908000946
 463  49000.0  49293.44947862625
 469  55000.0  61779.35452270508
 472  60500.0   63767.0579059124
 527 105000.0 119762.26224088669
 530 108000.0 116119.24969100952
 540  85000.0  73146.08736228943

RMSE

This query returns the following table:

SELECT SQRT(AVG(POWER(glmpredict_housing.price -
glmpredict_housing.fitted_value, 2))) AS RMSE FROM glmpredict_housing;
 RMSE               
 ------------------ 
 10246.752127962065

Download a zip file of all examples and a SQL script file that creates their input tables.