Input
This example uses these input tables:
- Input table: housing_test, which has 54 observations of 14 variables
- ModelTable: rft_model, output by DecisionForest Example 1: Classification Tree without Out-of-Bag Error
sn | price | lotsize | bedrooms | bathrms | stories | driveway | recroom | fullbase | gashw | airco | garagepl | prefarea | homestyle |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
13 | 27000 | 1700 | 3 | 1 | 2 | yes | no | no | no | no | 0 | no | Classic |
16 | 37900 | 3185 | 2 | 1 | 1 | yes | no | no | no | yes | 0 | no | Classic |
25 | 42000 | 4960 | 2 | 1 | 1 | yes | no | no | no | no | 0 | no | Classic |
38 | 67000 | 5170 | 3 | 1 | 4 | yes | no | no | no | yes | 0 | no | Eclectic |
53 | 68000 | 9166 | 2 | 1 | 1 | yes | no | yes | no | yes | 2 | no | Eclectic |
104 | 132000 | 3500 | 4 | 2 | 2 | yes | no | no | yes | no | 2 | no | bungalow |
111 | 43000 | 5076 | 3 | 1 | 1 | no | no | no | no | no | 0 | no | Classic |
117 | 93000 | 3760 | 3 | 1 | 2 | yes | no | no | yes | no | 2 | no | Eclectic |
132 | 44500 | 3850 | 3 | 1 | 2 | yes | no | no | no | no | 0 | no | Classic |
140 | 43000 | 3750 | 3 | 1 | 2 | yes | no | no | no | no | 0 | no | Classic |
142 | 40000 | 2650 | 3 | 1 | 2 | yes | no | yes | no | no | 1 | no | Classic |
157 | 60000 | 2953 | 3 | 1 | 2 | yes | no | yes | no | yes | 0 | no | Eclectic |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
SQL Call
Use the Accumulate argument to pass the homestyle variable, to easily compare the actual and predicted response for each observation.
CREATE MULTISET TABLE rf_housing_predict AS ( SELECT * FROM Forest_Predict@coprocessor ( ON housing_test PARTITION BY ANY ON rft_model AS ModelTable DIMENSION USING NumericInputs ('price ', 'lotsize ', 'bedrooms ', 'bathrms ', 'stories ', 'garagepl') CategoricalInputs ('driveway ', 'recroom ', 'fullbase ', 'gashw ', 'airco ', 'prefarea') IDColumn ('sn') Accumulate ('homestyle') Detailed ('false') ) AS dt ) WITH DATA;
Output
The predicted response is in the prediction column and the original classification values are in the homestyle column. The upper and lower confidence intervals are also shown in the output table.
This query returns the following table:
SELECT * FROM rf_housing_predict ORDER BY 2;
homestyle | sn | prediction | confidence_lower | confidence_upper |
---|---|---|---|---|
Classic | 13 | Classic | 0.6 | 0.6 |
Classic | 16 | Classic | 0.56 | 0.56 |
Classic | 25 | Classic | 0.54 | 0.54 |
Eclectic | 38 | Eclectic | 0.7 | 0.7 |
Eclectic | 53 | Eclectic | 0.54 | 0.54 |
bungalow | 104 | bungalow | 0.36 | 0.36 |
Classic | 111 | Classic | 0.54 | 0.54 |
Eclectic | 117 | Eclectic | 0.46 | 0.46 |
Classic | 132 | Classic | 0.54 | 0.54 |
Classic | 140 | Classic | 0.52 | 0.52 |
Classic | 142 | Eclectic | 0.5 | 0.5 |
Eclectic | 157 | Eclectic | 0.64 | 0.64 |
Eclectic | 161 | Eclectic | 0.74 | 0.74 |
bungalow | 162 | Eclectic | 0.46 | 0.46 |
Eclectic | 176 | Eclectic | 0.48 | 0.48 |
Eclectic | 177 | Eclectic | 0.56 | 0.56 |
Classic | 195 | Classic | 0.76 | 0.76 |
Classic | 198 | Classic | 0.48 | 0.48 |
Eclectic | 224 | Eclectic | 0.56 | 0.56 |
Classic | 234 | Classic | 0.64 | 0.64 |
Classic | 237 | Classic | 0.48 | 0.48 |
Classic | 239 | Classic | 0.52 | 0.52 |
Classic | 249 | Classic | 0.7 | 0.7 |
Classic | 251 | Classic | 0.6 | 0.6 |
Eclectic | 254 | Eclectic | 0.66 | 0.66 |
Eclectic | 255 | Eclectic | 0.6 | 0.6 |
Classic | 260 | Eclectic | 0.5 | 0.5 |
Eclectic | 274 | Eclectic | 0.66 | 0.66 |
Classic | 294 | Classic | 0.62 | 0.62 |
Eclectic | 301 | Classic | 0.56 | 0.56 |
Eclectic | 306 | Eclectic | 0.7 | 0.7 |
Eclectic | 317 | Eclectic | 0.5 | 0.5 |
bungalow | 329 | Eclectic | 0.52 | 0.52 |
bungalow | 339 | bungalow | 0.56 | 0.56 |
Eclectic | 340 | Eclectic | 0.54 | 0.54 |
Eclectic | 353 | Eclectic | 0.44 | 0.44 |
Eclectic | 355 | Classic | 0.4 | 0.4 |
Eclectic | 364 | Eclectic | 0.54 | 0.54 |
bungalow | 367 | bungalow | 0.52 | 0.52 |
bungalow | 377 | Eclectic | 0.46 | 0.46 |
Eclectic | 401 | Eclectic | 0.56 | 0.56 |
Eclectic | 403 | Eclectic | 0.56 | 0.56 |
Eclectic | 408 | Eclectic | 0.56 | 0.56 |
Eclectic | 411 | Eclectic | 0.54 | 0.54 |
Eclectic | 440 | Eclectic | 0.66 | 0.66 |
Eclectic | 441 | Classic | 0.5 | 0.5 |
Eclectic | 443 | Classic | 0.52 | 0.52 |
Classic | 459 | Classic | 0.74 | 0.74 |
Classic | 463 | Eclectic | 0.56 | 0.56 |
Eclectic | 469 | Eclectic | 0.62 | 0.62 |
Eclectic | 472 | Eclectic | 0.54 | 0.54 |
bungalow | 527 | Eclectic | 0.52 | 0.52 |
bungalow | 530 | Eclectic | 0.58 | 0.58 |
Eclectic | 540 | Eclectic | 0.42 | 0.42 |
Prediction Accuracy
This query returns the prediction accuracy:
SELECT (SELECT count(sn) FROM rf_housing_predict WHERE homestyle = prediction) / (SELECT count(sn) FROM rf_housing_predict) AS PA;
pa |
---|
0.77777777777777777778 |