1.0 - 8.00 - Forest_Predict Example - Teradata Vantage

Teradata® Vantage Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
1.0
8.00
Release Date
May 2019
Content Type
Programming Reference
Publication ID
B700-4003-098K
Language
English (United States)

Input

This example uses these input tables:
Input table: housing_test
sn price lotsize bedrooms bathrms stories driveway recroom fullbase gashw airco garagepl prefarea homestyle
13 27000 1700 3 1 2 yes no no no no 0 no Classic
16 37900 3185 2 1 1 yes no no no yes 0 no Classic
25 42000 4960 2 1 1 yes no no no no 0 no Classic
38 67000 5170 3 1 4 yes no no no yes 0 no Eclectic
53 68000 9166 2 1 1 yes no yes no yes 2 no Eclectic
104 132000 3500 4 2 2 yes no no yes no 2 no bungalow
111 43000 5076 3 1 1 no no no no no 0 no Classic
117 93000 3760 3 1 2 yes no no yes no 2 no Eclectic
132 44500 3850 3 1 2 yes no no no no 0 no Classic
140 43000 3750 3 1 2 yes no no no no 0 no Classic
142 40000 2650 3 1 2 yes no yes no no 1 no Classic
157 60000 2953 3 1 2 yes no yes no yes 0 no Eclectic
... ... ... ... ... ... ... ... ... ... ... ... ... ...

SQL Call

Use the Accumulate argument to pass the homestyle variable, to easily compare the actual and predicted response for each observation.

CREATE MULTISET TABLE rf_housing_predict AS (
  SELECT * FROM Forest_Predict@coprocessor (
    ON housing_test PARTITION BY ANY
    ON rft_model AS ModelTable DIMENSION
    USING
    NumericInputs ('price ', 'lotsize ', 'bedrooms ', 'bathrms ',
                   'stories ', 'garagepl')
    CategoricalInputs ('driveway ', 'recroom ', 'fullbase ', 'gashw ',
                       'airco ', 'prefarea')
    IDColumn ('sn')
    Accumulate ('homestyle')
    Detailed ('false')
  ) AS dt
) WITH DATA;

Output

The predicted response is in the prediction column and the original classification values are in the homestyle column. The upper and lower confidence intervals are also shown in the output table.

This query returns the following table:

SELECT * FROM rf_housing_predict ORDER BY 2;
rf_housing_predict
homestyle sn prediction confidence_lower confidence_upper
Classic 13 Classic 0.6 0.6
Classic 16 Classic 0.56 0.56
Classic 25 Classic 0.54 0.54
Eclectic 38 Eclectic 0.7 0.7
Eclectic 53 Eclectic 0.54 0.54
bungalow 104 bungalow 0.36 0.36
Classic 111 Classic 0.54 0.54
Eclectic 117 Eclectic 0.46 0.46
Classic 132 Classic 0.54 0.54
Classic 140 Classic 0.52 0.52
Classic 142 Eclectic 0.5 0.5
Eclectic 157 Eclectic 0.64 0.64
Eclectic 161 Eclectic 0.74 0.74
bungalow 162 Eclectic 0.46 0.46
Eclectic 176 Eclectic 0.48 0.48
Eclectic 177 Eclectic 0.56 0.56
Classic 195 Classic 0.76 0.76
Classic 198 Classic 0.48 0.48
Eclectic 224 Eclectic 0.56 0.56
Classic 234 Classic 0.64 0.64
Classic 237 Classic 0.48 0.48
Classic 239 Classic 0.52 0.52
Classic 249 Classic 0.7 0.7
Classic 251 Classic 0.6 0.6
Eclectic 254 Eclectic 0.66 0.66
Eclectic 255 Eclectic 0.6 0.6
Classic 260 Eclectic 0.5 0.5
Eclectic 274 Eclectic 0.66 0.66
Classic 294 Classic 0.62 0.62
Eclectic 301 Classic 0.56 0.56
Eclectic 306 Eclectic 0.7 0.7
Eclectic 317 Eclectic 0.5 0.5
bungalow 329 Eclectic 0.52 0.52
bungalow 339 bungalow 0.56 0.56
Eclectic 340 Eclectic 0.54 0.54
Eclectic 353 Eclectic 0.44 0.44
Eclectic 355 Classic 0.4 0.4
Eclectic 364 Eclectic 0.54 0.54
bungalow 367 bungalow 0.52 0.52
bungalow 377 Eclectic 0.46 0.46
Eclectic 401 Eclectic 0.56 0.56
Eclectic 403 Eclectic 0.56 0.56
Eclectic 408 Eclectic 0.56 0.56
Eclectic 411 Eclectic 0.54 0.54
Eclectic 440 Eclectic 0.66 0.66
Eclectic 441 Classic 0.5 0.5
Eclectic 443 Classic 0.52 0.52
Classic 459 Classic 0.74 0.74
Classic 463 Eclectic 0.56 0.56
Eclectic 469 Eclectic 0.62 0.62
Eclectic 472 Eclectic 0.54 0.54
bungalow 527 Eclectic 0.52 0.52
bungalow 530 Eclectic 0.58 0.58
Eclectic 540 Eclectic 0.42 0.42

Prediction Accuracy

This query returns the prediction accuracy:

SELECT (SELECT count(sn) FROM rf_housing_predict
  WHERE homestyle = prediction) / (SELECT count(sn)
    FROM rf_housing_predict) AS PA;
pa
0.77777777777777777778