This example uses home sales data to build a decision forest of classification trees that predicts home style. The resulting model can be used as input to the Forest_Predict Example. By default, the function does not output the out-of-bag estimate of error rate.
Input
The following table describes the home sales data contained in the InputTable. There are six numerical predictors and six categorical predictors. The response variable is homestyle.
Column | Description |
---|---|
price | Sale price in U.S. dollars (numeric) |
lotsize | Lot size in square feet (numeric) |
bedrooms | Number of bedrooms (numeric) |
bathrms | Number of full bathrooms (numeric) |
stories | Number of stories, excluding basement (numeric) |
driveway | Whether the house has a driveway—yes or no (categorical) |
recroom | Whether the house has a recreation room—yes or no (categorical) |
fullbase | Whether the house has a full finished basement—yes or no (categorical) |
gashw | Whether the house uses gas to heat water—yes or no (categorical) |
airco | Whether the house has central air conditioning—yes or no (categorical) |
garagepl | Number of garage places (numeric) |
prefarea | Whether the house is in a preferred neighborhood—yes or no (categorical) |
homestyle | Style of home (response variable) |
sn | price | lotsize | bedrooms | bathrms | stories | driveway | recroom | fullbase | gashw | airco | garagepl | prefarea | homestyle |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 42000 | 5850 | 3 | 1 | 2 | yes | no | yes | no | no | 1 | no | Classic |
2 | 38500 | 4000 | 2 | 1 | 1 | yes | no | no | no | no | 0 | no | Classic |
3 | 49500 | 3060 | 3 | 1 | 1 | yes | no | no | no | no | 0 | no | Classic |
4 | 60500 | 6650 | 3 | 1 | 2 | yes | yes | no | no | no | 0 | no | Eclectic |
5 | 61000 | 6360 | 2 | 1 | 1 | yes | no | no | no | no | 0 | no | Eclectic |
6 | 66000 | 4160 | 3 | 1 | 1 | yes | yes | yes | no | yes | 0 | no | Eclectic |
7 | 66000 | 3880 | 3 | 2 | 2 | yes | no | yes | no | no | 2 | no | Eclectic |
8 | 69000 | 4160 | 3 | 1 | 3 | yes | no | no | no | no | 0 | no | Eclectic |
9 | 83800 | 4800 | 3 | 1 | 1 | yes | yes | yes | no | no | 0 | no | Eclectic |
10 | 88500 | 5500 | 3 | 2 | 4 | yes | yes | no | no | yes | 1 | no | Eclectic |
11 | 90000 | 7200 | 3 | 2 | 1 | yes | no | yes | no | yes | 3 | no | Eclectic |
12 | 30500 | 3000 | 2 | 1 | 1 | no | no | no | no | no | 0 | no | Classic |
14 | 36000 | 2880 | 3 | 1 | 1 | no | no | no | no | no | 0 | no | Classic |
15 | 37000 | 3600 | 2 | 1 | 1 | yes | no | no | no | no | 0 | no | Classic |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
SQL Call
This call uses the default values for the MaxDepth, MinNodeSize, and Variance arguments and builds 50 trees on two worker nodes. It sets both seed values (MtrySeed and Seed) to 100 for repeatability. Because TreeType is 'classification' and there are 12 predictor variables, Mtry is 3, that is, round(sqrt(12)).
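
The Mtry value can be checked with a few lines of Python. This is only a sketch of the arithmetic stated above: the helper name `default_mtry` is hypothetical and is not part of the DecisionForest function, and the sketch covers only the classification rule given in this example.

```python
import math


def default_mtry(num_predictors: int, tree_type: str = "classification") -> int:
    """Hypothetical helper: the Mtry rule stated in this example for classification."""
    if tree_type != "classification":
        # The example only states the rule for classification trees.
        raise ValueError("only the classification rule is stated in this example")
    return round(math.sqrt(num_predictors))


# Predictor columns taken from the NumericInputs and CategoricalInputs arguments.
numeric_inputs = ["price", "lotsize", "bedrooms", "bathrms", "stories", "garagepl"]
categorical_inputs = ["driveway", "recroom", "fullbase", "gashw", "airco", "prefarea"]

mtry = default_mtry(len(numeric_inputs) + len(categorical_inputs))
print(mtry)  # 3, because sqrt(12) is about 3.46
```
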
SELECT * FROM DecisionForest (
  ON housing_train AS InputTable
  OUT TABLE OutputTable (rft_model)
  OUT TABLE MonitorTable (rf_monitortable)
  USING
  TreeType ('classification')
  ResponseColumn ('homestyle')
  NumericInputs ('price','lotsize','bedrooms','bathrms','stories','garagepl')
  CategoricalInputs ('driveway','recroom','fullbase','gashw','airco','prefarea')
  MaxDepth (12)
  MinNodeSize (1)
  NumTrees (50)
  Variance (0.0)
  Mtry ('3')
  MtrySeed ('100')
  Seed ('100')
) AS dt;
Output
message |
---|
Computing 48 classification trees. Each worker is computing 16 trees. Each tree will contain approximately 164 points. Poisson sampling parameter: 1.00 Query finished in 5.706 seconds. Decision forest created. |
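
The message reports a Poisson sampling parameter of 1.00 and an approximate number of points per tree. One plausible reading, which is an assumption and not something this example states, is that each input row is copied into a tree's training sample a Poisson(1.0)-distributed number of times, so the sample size is close to the number of rows a worker sees. The Python sketch below illustrates that idea only. The names `poisson_draw` and `poisson_bootstrap` are hypothetical, and the seed here is Python's own and does not reproduce the function's Seed argument.

```python
import math
import random


def poisson_draw(rng: random.Random, lam: float) -> int:
    """Draw one value from Poisson(lam) using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def poisson_bootstrap(rows, lam=1.0, seed=100):
    """Hypothetical sketch: repeat each row a Poisson(lam) number of times."""
    rng = random.Random(seed)
    sample = []
    for row in rows:
        sample.extend([row] * poisson_draw(rng, lam))
    return sample


# 164 stands in for the approximate per-tree point count in the message.
rows = list(range(164))
sample = poisson_bootstrap(rows)
print(len(sample))  # close to 164 on average, but varies with the seed
```

With a parameter of 1.0, some rows are left out of a given sample and others appear more than once, which is what makes each tree's training data differ.
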
This query returns the following table:
SELECT task_index, tree_num, CAST (tree AS VARCHAR(50)) FROM rft_model ORDER BY 1;
task_index | tree_num | CAST (tree AS VARCHAR(50)) |
---|---|---|
0 | 0 | {"responseCounts_":{"Eclectic":148,"bungalow":30," |
0 | 1 | {"responseCounts_":{"Eclectic":158,"bungalow":26," |
0 | 2 | {"responseCounts_":{"Eclectic":120,"bungalow":38," |
0 | 3 | {"responseCounts_":{"Eclectic":166,"bungalow":29," |
0 | 4 | {"responseCounts_":{"Eclectic":138,"bungalow":32," |
0 | 5 | {"responseCounts_":{"Eclectic":158,"bungalow":34," |
0 | 6 | {"responseCounts_":{"Eclectic":168,"bungalow":32," |
0 | 7 | {"responseCounts_":{"Eclectic":145,"bungalow":40," |
0 | 8 | {"responseCounts_":{"Eclectic":150,"bungalow":34," |
0 | 9 | {"responseCounts_":{"Eclectic":156,"bungalow":42," |
0 | 10 | {"responseCounts_":{"Eclectic":148,"bungalow":18," |
0 | 11 | {"responseCounts_":{"Eclectic":147,"bungalow":20," |
0 | 12 | {"responseCounts_":{"Eclectic":150,"bungalow":31," |
0 | 13 | {"responseCounts_":{"Eclectic":135,"bungalow":32," |
0 | 14 | {"responseCounts_":{"Eclectic":139,"bungalow":24," |
0 | 15 | {"responseCounts_":{"Eclectic":146,"bungalow":27," |
0 | 16 | {"responseCounts_":{"Eclectic":152,"bungalow":23," |
0 | 17 | {"responseCounts_":{"Eclectic":135,"bungalow":23," |
0 | 18 | {"responseCounts_":{"Eclectic":148,"bungalow":29," |
0 | 19 | {"responseCounts_":{"Eclectic":166,"bungalow":33," |
0 | 20 | {"responseCounts_":{"Eclectic":142,"bungalow":28," |
0 | 21 | {"responseCounts_":{"Eclectic":172,"bungalow":27," |
0 | 22 | {"responseCounts_":{"Eclectic":147,"bungalow":37," |
0 | 23 | {"responseCounts_":{"Eclectic":158,"bungalow":31," |
0 | 24 | {"responseCounts_":{"Eclectic":158,"bungalow":33," |
1 | 0 | {"responseCounts_":{"Eclectic":140,"bungalow":44," |
1 | 1 | {"responseCounts_":{"Eclectic":161,"bungalow":28," |
1 | 2 | {"responseCounts_":{"Eclectic":131,"bungalow":25," |
1 | 3 | {"responseCounts_":{"Eclectic":167,"bungalow":28," |
1 | 4 | {"responseCounts_":{"Eclectic":150,"bungalow":19," |
1 | 5 | {"responseCounts_":{"Eclectic":158,"bungalow":24," |
1 | 6 | {"responseCounts_":{"Eclectic":177,"bungalow":32," |
1 | 7 | {"responseCounts_":{"Eclectic":156,"bungalow":24," |
1 | 8 | {"responseCounts_":{"Eclectic":156,"bungalow":37," |
1 | 9 | {"responseCounts_":{"Eclectic":165,"bungalow":24," |
1 | 10 | {"responseCounts_":{"Eclectic":135,"bungalow":29," |
1 | 11 | {"responseCounts_":{"Eclectic":140,"bungalow":20," |
1 | 12 | {"responseCounts_":{"Eclectic":156,"bungalow":24," |
1 | 13 | {"responseCounts_":{"Eclectic":147,"bungalow":34," |
1 | 14 | {"responseCounts_":{"Eclectic":151,"bungalow":22," |
1 | 15 | {"responseCounts_":{"Eclectic":161,"bungalow":18," |
1 | 16 | {"responseCounts_":{"Eclectic":156,"bungalow":19," |
1 | 17 | {"responseCounts_":{"Eclectic":126,"bungalow":29," |
1 | 18 | {"responseCounts_":{"Eclectic":148,"bungalow":26," |
1 | 19 | {"responseCounts_":{"Eclectic":177,"bungalow":21," |
1 | 20 | {"responseCounts_":{"Eclectic":137,"bungalow":31," |
1 | 21 | {"responseCounts_":{"Eclectic":171,"bungalow":28," |
1 | 22 | {"responseCounts_":{"Eclectic":146,"bungalow":30," |
1 | 23 | {"responseCounts_":{"Eclectic":149,"bungalow":21," |
1 | 24 | {"responseCounts_":{"Eclectic":158,"bungalow":18," |