XGBoostPredict for Regression - Analytics Database

Database Analytic Functions

Deployment
VantageCloud
VantageCore
Edition
Enterprise
IntelliFlex
VMware
Product
Analytics Database
Release Number
17.20
Published
June 2022
Language
English (United States)
Last Update
2024-04-06
dita:mapPath
gjn1627595495337.ditamap
dita:ditavalPath
ayr1485454803741.ditaval
dita:id
jmh1512506877710
Product Category
Teradata Vantageā„¢

This example shows how to use XGBoost for regression: first use TD_XGBoost to train a regression model, and then use TD_XGBoostPredict to test the model.

The following section shows the Output (xgboost_results) table that was generated as described in TD_XGBoost for Regression. It serves as the input regression model table for TD_XGBoostPredict.

Input Model Table: xgboost_res

This is the same model that was generated using TD_XGBoost (regression option) on some input data (with the same columns).

  task_index tree_num iter tree_order regression_tree
0 3 1 2 0 {"id_":1,"sum_":0.000000,"sumSq_":0.000000,"size_":1,"maxDepth_":0,"value_":0.000000,"nodeType_":"REGRESSION_LEAF","prediction_":0.000000}
1 0 1 3 0 {"id_":1,"sum_":-0.007523,"sumSq_":391.724642,"size_":5,"maxDepth_":3,"nodeType_":"REGRESSION_NODE","split_":{"splitValue_":2.000000,"attr_":"col_2","type_":"REGRESSION_NUMERIC_SPLIT","score_":114.766845,"scoreImprove_":114.766845,"leftNodeSize_":1,"rightNodeSize_":4},"leftChild_":{"id_":2,"sum_":-9.583440,"sumSq_":91.842314,"size_":1,"maxDepth_":0,"value_":-9.583440,"nodeType_":"REGRESSION_LEAF","prediction_":-0.008268},"rightChild_":{"id_":3,"sum_":9.575916,"sumSq_":299.882329,"size_":4,"maxDepth_":2,"nodeType_":"REGRESSION_NODE","split_":{"splitValue_":3.500000,"attr_":"col_2","type_":"REGRESSION_NUMERIC_SPLIT","score_":224.467449,"scoreImprove_":224.467449,"leftNodeSize_":1,"rightNodeSize_":3},"leftChild_":{"id_":6,"sum_":15.368978,"sumSq_":236.205471,"size_":1,"maxDepth_":0,"value_":15.368978,"nodeType_":"REGRESSION_LEAF","prediction_":0.015475},"rightChild_":{"id_":7,"sum_":-5.793061,"sumSq_":63.676858,"size_":3,"maxDepth_":1,"nodeType_":"REGRESSION_NODE","split_":{"splitValue_":11.500000,"attr_":"col_1","type_":"REGRESSION_NUMERIC_SPLIT","score_":47.990338,"scoreImprove_":47.990338,"leftNodeSize_":2,"rightNodeSize_":1},"leftChild_":{"id_":14,"sum_":1.794244,"sumSq_":6.109656,"size_":2,"maxDepth_":0,"value_":0.897122,"nodeType_":"REGRESSION_LEAF","prediction_":0.001436},"rightChild_":{"id_":15,"sum_":-7.587305,"sumSq_":57.567202,"size_":1,"maxDepth_":0,"value_":-7.587305,"nodeType_":"REGRESSION_LEAF","prediction_":-0.006339}}}}
2 3 1 3 0 {"id_":1,"sum_":0.000000,"sumSq_":0.000000,"size_":1,"maxDepth_":0,"value_":0.000000,"nodeType_":"REGRESSION_LEAF","prediction_":0.000000}
3 0 -1 -1 -1 {"lossType":"MSE","numBoostedTrees":4,"iterNum":3,"avgResponses":20.600000}
4 3 1 1 0 {"id_":1,"sum_":0.000000,"sumSq_":0.000000,"size_":1,"maxDepth_":0,"value_":0.000000,"nodeType_":"REGRESSION_LEAF","prediction_":0.000000}
5 0 1 2 0 {"id_":1,"sum_":-0.003769,"sumSq_":392.461463,"size_":5,"maxDepth_":3,"nodeType_":"REGRESSION_NODE","split_":{"splitValue_":2.000000,"attr_":"col_2","type_":"REGRESSION_NUMERIC_SPLIT","score_":114.983184,"scoreImprove_":114.983184,"leftNodeSize_":1,"rightNodeSize_":4},"leftChild_":{"id_":2,"sum_":-9.591716,"sumSq_":92.001008,"size_":1,"maxDepth_":0,"value_":-9.591716,"nodeType_":"REGRESSION_LEAF","prediction_":-0.008276},"rightChild_":{"id_":3,"sum_":9.587947,"sumSq_":300.460456,"size_":4,"maxDepth_":2,"nodeType_":"REGRESSION_NODE","split_":{"splitValue_":3.500000,"attr_":"col_2","type_":"REGRESSION_NUMERIC_SPLIT","score_":224.899861,"scoreImprove_":224.899861,"leftNodeSize_":1,"rightNodeSize_":3},"leftChild_":{"id_":6,"sum_":15.384477,"sumSq_":236.682120,"size_":1,"maxDepth_":0,"value_":15.384477,"nodeType_":"REGRESSION_LEAF","prediction_":0.015499},"rightChild_":{"id_":7,"sum_":-5.796530,"sumSq_":63.778336,"size_":3,"maxDepth_":1,"nodeType_":"REGRESSION_NODE","split_":{"splitValue_":11.500000,"attr_":"col_1","type_":"REGRESSION_NUMERIC_SPLIT","score_":48.078415,"scoreImprove_":48.078415,"leftNodeSize_":2,"rightNodeSize_":1},"leftChild_":{"id_":14,"sum_":1.797120,"sumSq_":6.114820,"size_":2,"maxDepth_":0,"value_":0.898560,"nodeType_":"REGRESSION_LEAF","prediction_":0.001438},"rightChild_":{"id_":15,"sum_":-7.593650,"sumSq_":57.663516,"size_":1,"maxDepth_":0,"value_":-7.593650,"nodeType_":"REGRESSION_LEAF","prediction_":-0.006344}}}}
6 1 1 3 0 {"id_":1,"sum_":0.000000,"sumSq_":0.000000,"size_":1,"maxDepth_":0,"value_":0.000000,"nodeType_":"REGRESSION_LEAF","prediction_":0.000000}
7 0 1 1 0 {"id_":1,"sum_":-0.000000,"sumSq_":393.200000,"size_":5,"maxDepth_":3,"nodeType_":"REGRESSION_NODE","split_":{"splitValue_":2.000000,"attr_":"col_2","type_":"REGRESSION_NUMERIC_SPLIT","score_":115.200000,"scoreImprove_":115.200000,"leftNodeSize_":1,"rightNodeSize_":4},"leftChild_":{"id_":2,"sum_":-9.600000,"sumSq_":92.160000,"size_":1,"maxDepth_":0,"value_":-9.600000,"nodeType_":"REGRESSION_LEAF","prediction_":-0.008284},"rightChild_":{"id_":3,"sum_":9.600000,"sumSq_":301.040000,"size_":4,"maxDepth_":2,"nodeType_":"REGRESSION_NODE","split_":{"splitValue_":3.500000,"attr_":"col_2","type_":"REGRESSION_NUMERIC_SPLIT","score_":225.333333,"scoreImprove_":225.333333,"leftNodeSize_":1,"rightNodeSize_":3},"leftChild_":{"id_":6,"sum_":15.400000,"sumSq_":237.160000,"size_":1,"maxDepth_":0,"value_":15.400000,"nodeType_":"REGRESSION_LEAF","prediction_":0.015523},"rightChild_":{"id_":7,"sum_":-5.800000,"sumSq_":63.880000,"size_":3,"maxDepth_":1,"nodeType_":"REGRESSION_NODE","split_":{"splitValue_":11.500000,"attr_":"col_1","type_":"REGRESSION_NUMERIC_SPLIT","score_":48.166667,"scoreImprove_":48.166667,"leftNodeSize_":2,"rightNodeSize_":1},"leftChild_":{"id_":14,"sum_":1.800000,"sumSq_":6.120000,"size_":2,"maxDepth_":0,"value_":0.900000,"nodeType_":"REGRESSION_LEAF","prediction_":0.001440},"rightChild_":{"id_":15,"sum_":-7.600000,"sumSq_":57.760000,"size_":1,"maxDepth_":0,"value_":-7.600000,"nodeType_":"REGRESSION_LEAF","prediction_":-0.006350}}}}
8 1 1 2 0 {"id_":1,"sum_":0.000000,"sumSq_":0.000000,"size_":1,"maxDepth_":0,"value_":0.000000,"nodeType_":"REGRESSION_LEAF","prediction_":0.000000}
9 1 1 1 0 {"id_":1,"sum_":0.000000,"sumSq_":0.000000,"size_":1,"maxDepth_":0,"value_":0.000000,"nodeType_":"REGRESSION_LEAF","prediction_":0.000000}
10 2 1 1 0 {"id_":1,"sum_":0.000000,"sumSq_":220.500000,"size_":2,"maxDepth_":3,"nodeType_":"REGRESSION_NODE","split_":{"splitValue_":17.500000,"attr_":"col_1","type_":"REGRESSION_NUMERIC_SPLIT","score_":220.500000,"scoreImprove_":220.500000,"leftNodeSize_":1,"rightNodeSize_":1},"leftChild_":{"id_":2,"sum_":-10.500000,"sumSq_":110.250000,"size_":1,"maxDepth_":0,"value_":-10.500000,"nodeType_":"REGRESSION_LEAF","prediction_":-0.009223},"rightChild_":{"id_":3,"sum_":10.500000,"sumSq_":110.250000,"size_":1,"maxDepth_":0,"value_":10.500000,"nodeType_":"REGRESSION_LEAF","prediction_":0.009223}}
11 2 1 3 0 {"id_":1,"sum_":0.000000,"sumSq_":219.726348,"size_":2,"maxDepth_":3,"nodeType_":"REGRESSION_NODE","split_":{"splitValue_":17.500000,"attr_":"col_1","type_":"REGRESSION_NUMERIC_SPLIT","score_":219.726348,"scoreImprove_":219.726348,"leftNodeSize_":1,"rightNodeSize_":1},"leftChild_":{"id_":2,"sum_":-10.481564,"sumSq_":109.863174,"size_":1,"maxDepth_":0,"value_":-10.481564,"nodeType_":"REGRESSION_LEAF","prediction_":-0.009203},"rightChild_":{"id_":3,"sum_":10.481564,"sumSq_":109.863174,"size_":1,"maxDepth_":0,"value_":10.481564,"nodeType_":"REGRESSION_LEAF","prediction_":0.009203}}
12 2 1 2 0 {"id_":1,"sum_":0.000000,"sumSq_":220.112797,"size_":2,"maxDepth_":3,"nodeType_":"REGRESSION_NODE","split_":{"splitValue_":17.500000,"attr_":"col_1","type_":"REGRESSION_NUMERIC_SPLIT","score_":220.112797,"scoreImprove_":220.112797,"leftNodeSize_":1,"rightNodeSize_":1},"leftChild_":{"id_":2,"sum_":-10.490777,"sumSq_":110.056399,"size_":1,"maxDepth_":0,"value_":-10.490777,"nodeType_":"REGRESSION_LEAF","prediction_":-0.009213},"rightChild_":{"id_":3,"sum_":10.490777,"sumSq_":110.056399,"size_":1,"maxDepth_":0,"value_":10.490777,"nodeType_":"REGRESSION_LEAF","prediction_":0.009213}}

The following sections show the input table, SQL query and output table when testing the regression model using TD_XGBoostPredict.

Input Table: xgboost_reg_test

  id medv col_1 col_2 col_3
0 3 17.9 4 5.45 0.97022
1 1 23.9 18 4.14 0.54054
2 2 14 29 9.97 0.26351

SQL Function Call: TD_XGBoostPredict Query

SELECT * FROM TD_XGBoostPredict(
    ON xgboost_reg_test AS inputtable partition by ANY
    ON xgboost_res AS modeltable dimension
    USING
      IdColumn('id')
      ModelType('Regression')
) AS dt;

Output Table

  id Prediction Confidence_Lower Confidence_upper
0 3 20.592225 20.58509094229030 20.599359057709700
1 1 20.602868666666700 20.58788390759070 20.61785342574260
2 2 20.602868666666700 20.58788390759070 20.61785342574260