This example uses home sales data to create a classification model that predicts home style. The resulting model table can be used as input to XGBoostPredict Example: Binary Classification.
Input
For descriptions of InputTable columns, see DecisionForest Example: TreeType ('classification'), OutOfBag ('false').
sn | price | lotsize | bedrooms | bathrms | stories | driveway | recroom | fullbase | gashw | airco | garagepl | prefarea | homestyle |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2 | 38500 | 4000 | 2 | 1 | 1 | yes | no | no | no | no | 0 | no | Classic |
4 | 60500 | 6650 | 3 | 1 | 2 | yes | yes | no | no | no | 0 | no | Classic |
6 | 66000 | 4160 | 3 | 1 | 1 | yes | yes | yes | no | yes | 0 | no | Eclectic |
8 | 69000 | 4160 | 3 | 1 | 3 | yes | no | no | no | no | 0 | no | Eclectic |
10 | 88500 | 5500 | 3 | 2 | 4 | yes | yes | no | no | yes | 1 | no | Eclectic |
12 | 30500 | 3000 | 2 | 1 | 1 | no | no | no | no | no | 0 | no | Eclectic |
14 | 36000 | 2880 | 3 | 1 | 1 | no | no | no | no | no | 0 | no | Classic |
18 | 40750 | 5200 | 4 | 1 | 3 | yes | no | no | no | no | 0 | no | Classic |
20 | 45000 | 3986 | 3 | 2 | 2 | no | yes | yes | no | no | 1 | no | Classic |
22 | 65900 | 4510 | 4 | 2 | 1 | yes | no | yes | no | no | 0 | no | Classic |
SQL Call
SELECT * FROM XGBoost (
  ON housing_train_binary AS InputTable
  OUT TABLE OutputTable (xgboost_model)
  USING
  ResponseColumn ('homestyle')
  PredictionType ('classification')
  NumericInputs ('price','lotsize','bedrooms','bathrms','stories','garagepl')
  CategoricalInputs ('driveway','recroom','fullbase','gashw','airco','prefarea')
  LossFunction ('binomial')
  IterNum (10)
  MaxDepth (10)
  MinNodeSize (1)
  RegularizationLambda (1)
  ShrinkageFactor (0.1)
  IDColumn ('sn')
  NumBoostedTrees (2)
) AS dt;
Output
message
----------------------------------------------------------------
Parameters:
 Number of boosting iterations       : 10
 Number of boosted trees             : 2
 Number of total trees (all subtrees): 20
 Prediction Type                     : CLASSIFICATION
 LossFunction                        : BINOMIAL
 Regularization                      : 1.0
 Shrinkage                           : 0.1
 MaxDepth                            : 10
 MinNodeSize                         : 1
 Variance                            : 0.0
 Seed                                : 1
 ColumnSubSampling Features          : 12
XGBoost model created in table specified in OutputTable argument
The following query returns the model table shown below:

SELECT tree_id, iter, class_num,
  CAST (tree AS VARCHAR(30)),
  CAST (region_prediction AS VARCHAR(30))
FROM xgboost_model ORDER BY 1,2,3;
For simplicity, the last two output columns show only the first 30 characters of each value.
tree_id iter class_num tree                           region_prediction
------- ---- --------- ------------------------------ ------------------------------
     -1   -1        -1 {"classifier":"CLASSIFICATION"
      0    1         0 {"sum_":1.1999999649514592E-6, {"1792":0.06969074,"1280":-0.1
      0    2         0 {"sum_":-0.14652091000000556," {"384":0.042531442,"385":0.030
      0    3         0 {"sum_":0.28140193999999746,"s {"1664":0.052631423,"1665":0.0
      0    4         0 {"sum_":1.2547231600000008,"su {"1280":-0.06937855,"1281":-0.
      0    5         0 {"sum_":1.9154820900000036,"su {"768":0.027756682,"1538":0.03
      0    6         0 {"sum_":1.9837604199999974,"su {"768":0.026932025,"1538":0.03
      0    7         0 {"sum_":2.0918175700000003,"su {"768":0.026147524,"769":0.035
      0    8         0 {"sum_":2.360585519999998,"sum {"768":0.023378344,"769":0.025
      0    9         0 {"sum_":2.623491119999999,"sum {"768":0.02468738,"769":0.0227
      0   10         0 {"sum_":3.1594645900000033,"su {"770":0.024006711,"771":0.023
      1    1         0 {"sum_":-2.3999999940738093E-6 {"1664":0.07176345,"1536":0.07
      1    2         0 {"sum_":1.2196361999999998,"su {"1536":0.067304045,"1024":-0.
      1    3         0 {"sum_":1.693821099999995,"sum {"1536":0.06236268,"512":-0.07
      1    4         0 {"sum_":1.5348959100000008,"su {"1536":0.053727582,"769":0.02
      1    5         0 {"sum_":1.6560536700000057,"su {"256":-0.04895391,"257":-0.06
      1    6         0 {"sum_":1.8388901200000038,"su {"512":-0.046615,"513":-0.0458
      1    7         0 {"sum_":2.2050783699999994,"su {"1540":0.026542466,"1541":0.0
      1    8         0 {"sum_":2.6216585700000024,"su {"258":-0.040129118,"259":-0.0
      1    9         0 {"sum_":2.92402738,"sumSq_":65 {"16":-0.042044364,"20":-0.040
      1   10         0 {"sum_":3.210829989999998,"sum {"260":-0.03580759,"522":-0.03
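As a sanity check on the model table, a simple aggregate (a sketch, assuming the xgboost_model table created above) confirms that each of the two boosted trees (NumBoostedTrees (2)) has one subtree per boosting iteration (IterNum (10)), matching the "Number of total trees (all subtrees): 20" line in the output message. The metadata row with tree_id = -1 is excluded:

```sql
-- Count subtrees per boosted tree, excluding the metadata row (tree_id = -1).
-- Expected result: tree_id 0 -> 10 subtrees, tree_id 1 -> 10 subtrees.
SELECT tree_id, COUNT(*) AS num_subtrees
FROM xgboost_model
WHERE tree_id <> -1
GROUP BY tree_id
ORDER BY tree_id;
```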