This example uses home sales data to create a classification model that predicts home style. The resulting model can be used as input to XGBoostPredict Example 1: Binary Classification.
Input
For descriptions of InputTable columns, see DecisionForest Example 1: Classification Tree without Out-of-Bag Error.
sn | price | lotsize | bedrooms | bathrms | stories | driveway | recroom | fullbase | gashw | airco | garagepl | prefarea | homestyle |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2 | 38500 | 4000 | 2 | 1 | 1 | yes | no | no | no | no | 0 | no | Classic |
4 | 60500 | 6650 | 3 | 1 | 2 | yes | yes | no | no | no | 0 | no | Classic |
6 | 66000 | 4160 | 3 | 1 | 1 | yes | yes | yes | no | yes | 0 | no | Eclectic |
8 | 69000 | 4160 | 3 | 1 | 3 | yes | no | no | no | no | 0 | no | Eclectic |
10 | 88500 | 5500 | 3 | 2 | 4 | yes | yes | no | no | yes | 1 | no | Eclectic |
12 | 30500 | 3000 | 2 | 1 | 1 | no | no | no | no | no | 0 | no | Eclectic |
14 | 36000 | 2880 | 3 | 1 | 1 | no | no | no | no | no | 0 | no | Classic |
18 | 40750 | 5200 | 4 | 1 | 3 | yes | no | no | no | no | 0 | no | Classic |
20 | 45000 | 3986 | 3 | 2 | 2 | no | yes | yes | no | no | 1 | no | Classic |
22 | 65900 | 4510 | 4 | 2 | 1 | yes | no | yes | no | no | 0 | no | Classic |
SQL Call
SELECT * FROM XGBoost (
  ON housing_train_binary AS InputTable
  OUT TABLE OutputTable (xgboost_model)
  USING
  ResponseColumn ('homestyle')
  PredictionType ('classification')
  NumericInputs ('price','lotsize','bedrooms','bathrms','stories','garagepl')
  CategoricalInputs ('driveway','recroom','fullbase','gashw','airco','prefarea')
  LossFunction ('binomial')
  IterNum (10)
  MaxDepth (10)
  MinNodeSize (1)
  RegularizationLambda (1)
  ShrinkageFactor (0.1)
  IDColumn ('sn')
  NumBoostedTrees (2)
) AS dt;
Output
message |
---|
Parameters: Number of boosting iterations : 10 Number of boosted trees : 2 Number of total trees (all subtrees): 20 Prediction Type : CLASSIFICATION LossFunction : BINOMIAL Regularization : 1.0 Shrinkage : 0.1 MaxDepth : 10 MinNodeSize : 1 Variance : 0.0 Seed : 1 ColumnSubSampling Features: 12 XGBoost model created in table specified in OutputTable argument |
This query returns the following table:
SELECT tree_id, iter, class_num, CAST(tree AS VARCHAR(30)), CAST(region_prediction AS VARCHAR(30)) FROM xgboost_model ORDER BY 1,2,3;
For simplicity, the last two output columns show only the first 30 characters of each value.
tree_id | iter | class_num | cast(tree as character varying(30)) | cast(region_prediction as character varying(30)) |
---|---|---|---|---|
-1 | -1 | -1 | {"classifier":"CLASSIFICATION" | |
0 | 1 | | {"sum_":1.200000101064802E-6," | {"1792":0.06969074,"1280":-0.1 |
0 | 2 | | {"sum_":-0.1465209100000503,"s | {"384":0.042531442,"385":0.030 |
0 | 3 | | {"sum_":0.281402000000032,"sum | {"1664":0.052631423,"1665":0.0 |
0 | 4 | | {"sum_":1.2547231599999822,"su | {"1280":-0.06937855,"1281":-0. |
0 | 5 | | {"sum_":1.915482170000011,"sum | {"768":0.027756682,"1538":0.03 |
0 | 6 | | {"sum_":1.9837604200000016,"su | {"768":0.026932025,"1538":0.03 |
0 | 7 | | {"sum_":2.091817570000022,"sum | {"768":0.026147524,"769":0.035 |
0 | 8 | | {"sum_":2.360585519999998,"sum | {"768":0.023378344,"769":0.025 |
0 | 9 | | {"sum_":2.6234908400000014,"su | {"768":0.02468738,"769":0.0227 |
0 | 10 | | {"sum_":3.1594640700000056,"su | {"770":0.024006711,"771":0.023 |
1 | 1 | | {"sum_":-2.400000108204736E-6, | {"1664":0.07176345,"1536":0.07 |
1 | 2 | | {"sum_":1.2196362000000556,"su | {"1536":0.067304045,"1152":-0. |
1 | 3 | | {"sum_":1.693821100000001,"sum | {"1536":0.06236268,"512":-0.07 |
1 | 4 | | {"sum_":1.534895909999999,"sum | {"1536":0.053727582,"769":0.02 |
1 | 5 | | {"sum_":1.6560535500000007,"su | {"256":-0.04895391,"257":-0.06 |
1 | 6 | | {"sum_":1.8388899100000082,"su | {"512":-0.046615,"513":-0.0458 |
1 | 7 | | {"sum_":2.205078229999988,"sum | {"1540":0.026542466,"1541":0.0 |
1 | 8 | | {"sum_":2.621658340000005,"sum | {"258":-0.040129118,"259":-0.0 |
1 | 9 | | {"sum_":2.92402716000001,"sumS | {"16":-0.042044364,"20":-0.040 |
1 | 10 | | {"sum_":3.2108298899999914,"su | {"260":-0.03580759,"522":-0.03 |
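The model table can be cross-checked against the Output message, which reports 2 boosted trees, 10 boosting iterations, and 20 total trees (10 × 2). The following query is a minimal sketch of that check. It assumes the model table is laid out as in the listing above: one row per boosted tree per iteration, plus a single metadata row with tree_id -1. The column alias num_iterations is introduced here for illustration only.

```sql
-- Count the stored iterations for each boosted tree.
-- Assumption: the row with tree_id = -1 holds model metadata, not a tree,
-- so it is excluded from the count.
SELECT tree_id, COUNT(*) AS num_iterations
FROM xgboost_model
WHERE tree_id >= 0
GROUP BY tree_id
ORDER BY tree_id;
```

If the table matches the listing above, the query should return one row for each of the two boosted trees (tree_id 0 and 1), each with 10 iterations, for 20 trees in total.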