DecisionForestPredict Function Example | Teradata Vantage - 17.05 - DecisionForestPredict Example - Teradata Database

Teradata Vantage™ - Advanced SQL Engine Analytic Functions

prodname
Advanced SQL Engine
Teradata Database
vrm_release
17.00
17.05
created_date
June 2020
category
Programming Reference
featnum
B035-1206-170K

Input

  • Input table: housing_test, which has 54 observations of 14 variables
  • Model: rft_model, output by "DecisionForest Example: TreeType ('classification') and OutOfBag ('false')" in Teradata Vantage™ Machine Learning Engine Analytic Function Reference, B700-4003
Input Table Column Descriptions
Column Description
sn Sale number (unique identifier of observation)
price Sale price in U. S. dollars (numeric)
lotsize Lot size in square feet (numeric)
bedrooms Number of bedrooms (numeric)
bathrms Number of full bathrooms (numeric)
stories Number of stories, excluding basement (numeric)
driveway Whether the house has a driveway—yes or no (categorical)
recroom Whether the house has a recreation room—yes or no (categorical)
fullbase Whether the house has a full finished basement—yes or no (categorical)
gashw Whether the house uses gas to heat water—yes or no (categorical)
airco Whether the house has central air conditioning—yes or no (categorical)
garagepl Number of garage places (numeric)
prefarea Whether the house is in a preferred neighborhood—yes or no (categorical)
homestyle Style of house (response variable)
housing_test
sn price lotsize bedrooms bathrms stories driveway recroom fullbase gashw airco garagepl prefarea homestyle
13 27000 1700 3 1 2 yes no no no no 0 no Classic
16 37900 3185 2 1 1 yes no no no yes 0 no Classic
25 42000 4960 2 1 1 yes no no no no 0 no Classic
38 67000 5170 3 1 4 yes no no no yes 0 no Eclectic
53 68000 9166 2 1 1 yes no yes no yes 2 no Eclectic
104 132000 3500 4 2 2 yes no no yes no 2 no bungalow
111 43000 5076 3 1 1 no no no no no 0 no Classic
117 93000 3760 3 1 2 yes no no yes no 2 no Eclectic
132 44500 3850 3 1 2 yes no no no no 0 no Classic
140 43000 3750 3 1 2 yes no no no no 0 no Classic
142 40000 2650 3 1 2 yes no yes no no 1 no Classic
157 60000 2953 3 1 2 yes no yes no yes 0 no Eclectic
... ... ... ... ... ... ... ... ... ... ... ... ... ...
rtf_model
worker_ip task_index tree_num CAST(tree AS VARCHAR(50))
xx.xx.xx.xx 0 0 {"responseCounts_":{"Eclectic":148,"bungalow":30,"
xx.xx.xx.xx 0 1 {"responseCounts_":{"Eclectic":158,"bungalow":26,"
xx.xx.xx.xx 0 2 {"responseCounts_":{"Eclectic":120,"bungalow":38,"
xx.xx.xx.xx 0 3 {"responseCounts_":{"Eclectic":166,"bungalow":29,"
xx.xx.xx.xx 0 4 {"responseCounts_":{"Eclectic":138,"bungalow":32,"
xx.xx.xx.xx 0 5 {"responseCounts_":{"Eclectic":158,"bungalow":34,"
xx.xx.xx.xx 0 6 {"responseCounts_":{"Eclectic":168,"bungalow":32,"
xx.xx.xx.xx 0 7 {"responseCounts_":{"Eclectic":145,"bungalow":40,"
xx.xx.xx.xx 0 8 {"responseCounts_":{"Eclectic":150,"bungalow":34,"
xx.xx.xx.xx 0 9 {"responseCounts_":{"Eclectic":156,"bungalow":42,"
xx.xx.xx.xx 0 10 {"responseCounts_":{"Eclectic":148,"bungalow":18,"
xx.xx.xx.xx 0 11 {"responseCounts_":{"Eclectic":147,"bungalow":20,"
xx.xx.xx.xx 0 12 {"responseCounts_":{"Eclectic":150,"bungalow":31,"
xx.xx.xx.xx 0 13 {"responseCounts_":{"Eclectic":135,"bungalow":32,"
xx.xx.xx.xx 0 14 {"responseCounts_":{"Eclectic":139,"bungalow":24,"
xx.xx.xx.xx 0 15 {"responseCounts_":{"Eclectic":146,"bungalow":27,"
xx.xx.xx.xx 0 16 {"responseCounts_":{"Eclectic":152,"bungalow":23,"
xx.xx.xx.xx 0 17 {"responseCounts_":{"Eclectic":135,"bungalow":23,"
xx.xx.xx.xx 0 18 {"responseCounts_":{"Eclectic":148,"bungalow":29,"
xx.xx.xx.xx 0 19 {"responseCounts_":{"Eclectic":166,"bungalow":33,"
xx.xx.xx.xx 0 20 {"responseCounts_":{"Eclectic":142,"bungalow":28,"
xx.xx.xx.xx 0 21 {"responseCounts_":{"Eclectic":172,"bungalow":27,"
xx.xx.xx.xx 0 22 {"responseCounts_":{"Eclectic":147,"bungalow":37,"
xx.xx.xx.xx 0 23 {"responseCounts_":{"Eclectic":158,"bungalow":31,"
xx.xx.xx.xx 0 24 {"responseCounts_":{"Eclectic":158,"bungalow":33,"
xx.xx.xx.xx 1 0 {"responseCounts_":{"Eclectic":140,"bungalow":44,"
xx.xx.xx.xx 1 1 {"responseCounts_":{"Eclectic":161,"bungalow":28,"
xx.xx.xx.xx 1 2 {"responseCounts_":{"Eclectic":131,"bungalow":25,"
xx.xx.xx.xx 1 3 {"responseCounts_":{"Eclectic":167,"bungalow":28,"
xx.xx.xx.xx 1 4 {"responseCounts_":{"Eclectic":150,"bungalow":19,"
xx.xx.xx.xx 1 5 {"responseCounts_":{"Eclectic":158,"bungalow":24,"
xx.xx.xx.xx 1 6 {"responseCounts_":{"Eclectic":177,"bungalow":32,"
xx.xx.xx.xx 1 7 {"responseCounts_":{"Eclectic":156,"bungalow":24,"
xx.xx.xx.xx 1 8 {"responseCounts_":{"Eclectic":156,"bungalow":37,"
xx.xx.xx.xx 1 9 {"responseCounts_":{"Eclectic":165,"bungalow":24,"
xx.xx.xx.xx 1 10 {"responseCounts_":{"Eclectic":135,"bungalow":29,"
xx.xx.xx.xx 1 11 {"responseCounts_":{"Eclectic":140,"bungalow":20,"
xx.xx.xx.xx 1 12 {"responseCounts_":{"Eclectic":156,"bungalow":24,"
xx.xx.xx.xx 1 13 {"responseCounts_":{"Eclectic":147,"bungalow":34,"
xx.xx.xx.xx 1 14 {"responseCounts_":{"Eclectic":151,"bungalow":22,"
xx.xx.xx.xx 1 15 {"responseCounts_":{"Eclectic":161,"bungalow":18,"
xx.xx.xx.xx 1 16 {"responseCounts_":{"Eclectic":156,"bungalow":19,"
xx.xx.xx.xx 1 17 {"responseCounts_":{"Eclectic":126,"bungalow":29,"
xx.xx.xx.xx 1 18 {"responseCounts_":{"Eclectic":148,"bungalow":26,"
xx.xx.xx.xx 1 19 {"responseCounts_":{"Eclectic":177,"bungalow":21,"
xx.xx.xx.xx 1 20 {"responseCounts_":{"Eclectic":137,"bungalow":31,"
xx.xx.xx.xx 1 21 {"responseCounts_":{"Eclectic":171,"bungalow":28,"
xx.xx.xx.xx 1 22 {"responseCounts_":{"Eclectic":146,"bungalow":30,"
xx.xx.xx.xx 1 23 {"responseCounts_":{"Eclectic":149,"bungalow":21,"
xx.xx.xx.xx 1 24 {"responseCounts_":{"Eclectic":158,"bungalow":18,"

SQL Call

Use the Accumulate syntax element to pass the homestyle variable, to easily compare the actual and predicted response for each observation.

CREATE MULTISET TABLE rf_housing_predict AS (
  SELECT * FROM DecisionForestPredict (
    ON housing_test PARTITION BY ANY
    ON rft_model AS Model DIMENSION
    USING
    NumericInputs ('price ','lotsize ','bedrooms ','bathrms ',
                   'stories ','garagepl')
    CategoricalInputs ('driveway ','recroom ','fullbase ','gashw ',
                       'airco ','prefarea')
    IdColumn ('sn')
    Accumulate ('homestyle')
    Detailed ('false')
  ) AS dt
) WITH DATA;

Output

This query returns the following table:

SELECT * FROM rf_housing_predict ORDER BY 2;
homestyle sn prediction confidence_lower confidence_upper
Classic 13 Classic 0.6 0.6
Classic 16 Classic 0.56 0.56
Classic 25 Classic 0.54 0.54
Eclectic 38 Eclectic 0.7 0.7
Eclectic 53 Eclectic 0.54 0.54
bungalow 104 bungalow 0.36 0.36
Classic 111 Classic 0.54 0.54
Eclectic 117 Eclectic 0.46 0.46
Classic 132 Classic 0.54 0.54
Classic 140 Classic 0.52 0.52
Classic 142 Eclectic 0.5 0.5
Eclectic 157 Eclectic 0.64 0.64
Eclectic 161 Eclectic 0.74 0.74
bungalow 162 Eclectic 0.46 0.46
Eclectic 176 Eclectic 0.48 0.48
Eclectic 177 Eclectic 0.56 0.56
Classic 195 Classic 0.76 0.76
Classic 198 Classic 0.48 0.48
Eclectic 224 Eclectic 0.56 0.56
Classic 234 Classic 0.64 0.64
Classic 237 Classic 0.48 0.48
Classic 239 Classic 0.52 0.52
Classic 249 Classic 0.7 0.7
Classic 251 Classic 0.6 0.6
Eclectic 254 Eclectic 0.66 0.66
Eclectic 255 Eclectic 0.6 0.6
Classic 260 Eclectic 0.5 0.5
Eclectic 274 Eclectic 0.66 0.66
Classic 294 Classic 0.62 0.62
Eclectic 301 Classic 0.56 0.56
Eclectic 306 Eclectic 0.7 0.7
Eclectic 317 Eclectic 0.5 0.5
bungalow 329 Eclectic 0.52 0.52
bungalow 339 bungalow 0.56 0.56
Eclectic 340 Eclectic 0.54 0.54
Eclectic 353 Eclectic 0.44 0.44
Eclectic 355 Classic 0.4 0.4
Eclectic 364 Eclectic 0.54 0.54
bungalow 367 bungalow 0.52 0.52
bungalow 377 Eclectic 0.46 0.46
Eclectic 401 Eclectic 0.56 0.56
Eclectic 403 Eclectic 0.56 0.56
Eclectic 408 Eclectic 0.56 0.56
Eclectic 411 Eclectic 0.54 0.54
Eclectic 440 Eclectic 0.66 0.66
Eclectic 441 Classic 0.5 0.5
Eclectic 443 Classic 0.52 0.52
Classic 459 Classic 0.74 0.74
Classic 463 Eclectic 0.56 0.56
Eclectic 469 Eclectic 0.62 0.62
Eclectic 472 Eclectic 0.54 0.54
bungalow 527 Eclectic 0.52 0.52
bungalow 530 Eclectic 0.58 0.58
Eclectic 540 Eclectic 0.42 0.42

Prediction Accuracy

This query returns the prediction accuracy:

SELECT (SELECT count(sn) FROM rf_housing_predict
  WHERE homestyle = prediction) / (SELECT count(sn)
    FROM rf_housing_predict) AS PA;
pa
0.77777777777777777778