This example shows how to use XGBoost for regression: first use TD_XGBoost to train a regression model, and then use TD_XGBoostPredict to test the model.
The following section shows the Output (xgboost_results) table that was generated as described in TD_XGBoost for Regression. It serves as the input regression model table for TD_XGBoostPredict.
Input Model Table: xgboost_res
This is the same model that was generated using TD_XGBoost (regression option) on some input data (with the same columns).
task_index | tree_num | iter | tree_order | regression_tree | |
---|---|---|---|---|---|
0 | 3 | 1 | 2 | 0 | {"id_":1,"sum_":0.000000,"sumSq_":0.000000,"size_":1,"maxDepth_":0,"value_":0.000000,"nodeType_":"REGRESSION_LEAF","prediction_":0.000000} |
1 | 0 | 1 | 3 | 0 | {"id_":1,"sum_":-0.007523,"sumSq_":391.724642,"size_":5,"maxDepth_":3,"nodeType_":"REGRESSION_NODE","split_":{"splitValue_":2.000000,"attr_":"col_2","type_":"REGRESSION_NUMERIC_SPLIT","score_":114.766845,"scoreImprove_":114.766845,"leftNodeSize_":1,"rightNodeSize_":4},"leftChild_":{"id_":2,"sum_":-9.583440,"sumSq_":91.842314,"size_":1,"maxDepth_":0,"value_":-9.583440,"nodeType_":"REGRESSION_LEAF","prediction_":-0.008268},"rightChild_":{"id_":3,"sum_":9.575916,"sumSq_":299.882329,"size_":4,"maxDepth_":2,"nodeType_":"REGRESSION_NODE","split_":{"splitValue_":3.500000,"attr_":"col_2","type_":"REGRESSION_NUMERIC_SPLIT","score_":224.467449,"scoreImprove_":224.467449,"leftNodeSize_":1,"rightNodeSize_":3},"leftChild_":{"id_":6,"sum_":15.368978,"sumSq_":236.205471,"size_":1,"maxDepth_":0,"value_":15.368978,"nodeType_":"REGRESSION_LEAF","prediction_":0.015475},"rightChild_":{"id_":7,"sum_":-5.793061,"sumSq_":63.676858,"size_":3,"maxDepth_":1,"nodeType_":"REGRESSION_NODE","split_":{"splitValue_":11.500000,"attr_":"col_1","type_":"REGRESSION_NUMERIC_SPLIT","score_":47.990338,"scoreImprove_":47.990338,"leftNodeSize_":2,"rightNodeSize_":1},"leftChild_":{"id_":14,"sum_":1.794244,"sumSq_":6.109656,"size_":2,"maxDepth_":0,"value_":0.897122,"nodeType_":"REGRESSION_LEAF","prediction_":0.001436},"rightChild_":{"id_":15,"sum_":-7.587305,"sumSq_":57.567202,"size_":1,"maxDepth_":0,"value_":-7.587305,"nodeType_":"REGRESSION_LEAF","prediction_":-0.006339}}}} |
2 | 3 | 1 | 3 | 0 | {"id_":1,"sum_":0.000000,"sumSq_":0.000000,"size_":1,"maxDepth_":0,"value_":0.000000,"nodeType_":"REGRESSION_LEAF","prediction_":0.000000} |
3 | 0 | -1 | -1 | -1 | {"lossType":"MSE","numBoostedTrees":4,"iterNum":3,"avgResponses":20.600000} |
4 | 3 | 1 | 1 | 0 | {"id_":1,"sum_":0.000000,"sumSq_":0.000000,"size_":1,"maxDepth_":0,"value_":0.000000,"nodeType_":"REGRESSION_LEAF","prediction_":0.000000} |
5 | 0 | 1 | 2 | 0 | {"id_":1,"sum_":-0.003769,"sumSq_":392.461463,"size_":5,"maxDepth_":3,"nodeType_":"REGRESSION_NODE","split_":{"splitValue_":2.000000,"attr_":"col_2","type_":"REGRESSION_NUMERIC_SPLIT","score_":114.983184,"scoreImprove_":114.983184,"leftNodeSize_":1,"rightNodeSize_":4},"leftChild_":{"id_":2,"sum_":-9.591716,"sumSq_":92.001008,"size_":1,"maxDepth_":0,"value_":-9.591716,"nodeType_":"REGRESSION_LEAF","prediction_":-0.008276},"rightChild_":{"id_":3,"sum_":9.587947,"sumSq_":300.460456,"size_":4,"maxDepth_":2,"nodeType_":"REGRESSION_NODE","split_":{"splitValue_":3.500000,"attr_":"col_2","type_":"REGRESSION_NUMERIC_SPLIT","score_":224.899861,"scoreImprove_":224.899861,"leftNodeSize_":1,"rightNodeSize_":3},"leftChild_":{"id_":6,"sum_":15.384477,"sumSq_":236.682120,"size_":1,"maxDepth_":0,"value_":15.384477,"nodeType_":"REGRESSION_LEAF","prediction_":0.015499},"rightChild_":{"id_":7,"sum_":-5.796530,"sumSq_":63.778336,"size_":3,"maxDepth_":1,"nodeType_":"REGRESSION_NODE","split_":{"splitValue_":11.500000,"attr_":"col_1","type_":"REGRESSION_NUMERIC_SPLIT","score_":48.078415,"scoreImprove_":48.078415,"leftNodeSize_":2,"rightNodeSize_":1},"leftChild_":{"id_":14,"sum_":1.797120,"sumSq_":6.114820,"size_":2,"maxDepth_":0,"value_":0.898560,"nodeType_":"REGRESSION_LEAF","prediction_":0.001438},"rightChild_":{"id_":15,"sum_":-7.593650,"sumSq_":57.663516,"size_":1,"maxDepth_":0,"value_":-7.593650,"nodeType_":"REGRESSION_LEAF","prediction_":-0.006344}}}} |
6 | 1 | 1 | 3 | 0 | {"id_":1,"sum_":0.000000,"sumSq_":0.000000,"size_":1,"maxDepth_":0,"value_":0.000000,"nodeType_":"REGRESSION_LEAF","prediction_":0.000000} |
7 | 0 | 1 | 1 | 0 | {"id_":1,"sum_":-0.000000,"sumSq_":393.200000,"size_":5,"maxDepth_":3,"nodeType_":"REGRESSION_NODE","split_":{"splitValue_":2.000000,"attr_":"col_2","type_":"REGRESSION_NUMERIC_SPLIT","score_":115.200000,"scoreImprove_":115.200000,"leftNodeSize_":1,"rightNodeSize_":4},"leftChild_":{"id_":2,"sum_":-9.600000,"sumSq_":92.160000,"size_":1,"maxDepth_":0,"value_":-9.600000,"nodeType_":"REGRESSION_LEAF","prediction_":-0.008284},"rightChild_":{"id_":3,"sum_":9.600000,"sumSq_":301.040000,"size_":4,"maxDepth_":2,"nodeType_":"REGRESSION_NODE","split_":{"splitValue_":3.500000,"attr_":"col_2","type_":"REGRESSION_NUMERIC_SPLIT","score_":225.333333,"scoreImprove_":225.333333,"leftNodeSize_":1,"rightNodeSize_":3},"leftChild_":{"id_":6,"sum_":15.400000,"sumSq_":237.160000,"size_":1,"maxDepth_":0,"value_":15.400000,"nodeType_":"REGRESSION_LEAF","prediction_":0.015523},"rightChild_":{"id_":7,"sum_":-5.800000,"sumSq_":63.880000,"size_":3,"maxDepth_":1,"nodeType_":"REGRESSION_NODE","split_":{"splitValue_":11.500000,"attr_":"col_1","type_":"REGRESSION_NUMERIC_SPLIT","score_":48.166667,"scoreImprove_":48.166667,"leftNodeSize_":2,"rightNodeSize_":1},"leftChild_":{"id_":14,"sum_":1.800000,"sumSq_":6.120000,"size_":2,"maxDepth_":0,"value_":0.900000,"nodeType_":"REGRESSION_LEAF","prediction_":0.001440},"rightChild_":{"id_":15,"sum_":-7.600000,"sumSq_":57.760000,"size_":1,"maxDepth_":0,"value_":-7.600000,"nodeType_":"REGRESSION_LEAF","prediction_":-0.006350}}}} |
8 | 1 | 1 | 2 | 0 | {"id_":1,"sum_":0.000000,"sumSq_":0.000000,"size_":1,"maxDepth_":0,"value_":0.000000,"nodeType_":"REGRESSION_LEAF","prediction_":0.000000} |
9 | 1 | 1 | 1 | 0 | {"id_":1,"sum_":0.000000,"sumSq_":0.000000,"size_":1,"maxDepth_":0,"value_":0.000000,"nodeType_":"REGRESSION_LEAF","prediction_":0.000000} |
10 | 2 | 1 | 1 | 0 | {"id_":1,"sum_":0.000000,"sumSq_":220.500000,"size_":2,"maxDepth_":3,"nodeType_":"REGRESSION_NODE","split_":{"splitValue_":17.500000,"attr_":"col_1","type_":"REGRESSION_NUMERIC_SPLIT","score_":220.500000,"scoreImprove_":220.500000,"leftNodeSize_":1,"rightNodeSize_":1},"leftChild_":{"id_":2,"sum_":-10.500000,"sumSq_":110.250000,"size_":1,"maxDepth_":0,"value_":-10.500000,"nodeType_":"REGRESSION_LEAF","prediction_":-0.009223},"rightChild_":{"id_":3,"sum_":10.500000,"sumSq_":110.250000,"size_":1,"maxDepth_":0,"value_":10.500000,"nodeType_":"REGRESSION_LEAF","prediction_":0.009223}} |
11 | 2 | 1 | 3 | 0 | {"id_":1,"sum_":0.000000,"sumSq_":219.726348,"size_":2,"maxDepth_":3,"nodeType_":"REGRESSION_NODE","split_":{"splitValue_":17.500000,"attr_":"col_1","type_":"REGRESSION_NUMERIC_SPLIT","score_":219.726348,"scoreImprove_":219.726348,"leftNodeSize_":1,"rightNodeSize_":1},"leftChild_":{"id_":2,"sum_":-10.481564,"sumSq_":109.863174,"size_":1,"maxDepth_":0,"value_":-10.481564,"nodeType_":"REGRESSION_LEAF","prediction_":-0.009203},"rightChild_":{"id_":3,"sum_":10.481564,"sumSq_":109.863174,"size_":1,"maxDepth_":0,"value_":10.481564,"nodeType_":"REGRESSION_LEAF","prediction_":0.009203}} |
12 | 2 | 1 | 2 | 0 | {"id_":1,"sum_":0.000000,"sumSq_":220.112797,"size_":2,"maxDepth_":3,"nodeType_":"REGRESSION_NODE","split_":{"splitValue_":17.500000,"attr_":"col_1","type_":"REGRESSION_NUMERIC_SPLIT","score_":220.112797,"scoreImprove_":220.112797,"leftNodeSize_":1,"rightNodeSize_":1},"leftChild_":{"id_":2,"sum_":-10.490777,"sumSq_":110.056399,"size_":1,"maxDepth_":0,"value_":-10.490777,"nodeType_":"REGRESSION_LEAF","prediction_":-0.009213},"rightChild_":{"id_":3,"sum_":10.490777,"sumSq_":110.056399,"size_":1,"maxDepth_":0,"value_":10.490777,"nodeType_":"REGRESSION_LEAF","prediction_":0.009213}} |
The following sections show the input table, SQL query and output table when testing the regression model using TD_XGBoostPredict.
Input Table: xgboost_reg_test
id | medv | col_1 | col_2 | col_3 | |
---|---|---|---|---|---|
0 | 3 | 17.9 | 4 | 5.45 | 0.97022 |
1 | 1 | 23.9 | 18 | 4.14 | 0.54054 |
2 | 2 | 14 | 29 | 9.97 | 0.26351 |
SQL Function Call: TD_XGBoostPredict Query
SELECT * FROM TD_XGBoostPredict( ON xgboost_reg_test AS inputtable partition by ANY ON xgboost_res AS modeltable dimension USING IdColumn('id') ModelType('Regression') ) AS dt;
Output Table
id | Prediction | Confidence_Lower | Confidence_upper | |
---|---|---|---|---|
0 | 3 | 20.592225 | 20.58509094229030 | 20.599359057709700 |
1 | 1 | 20.602868666666700 | 20.58788390759070 | 20.61785342574260 |
2 | 2 | 20.602868666666700 | 20.58788390759070 | 20.61785342574260 |