Example: TD_XGBoost for Classification - Analytics Database

Database Analytic Functions

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Analytics Database
Release Number: 17.20
Published: June 2022
Language: English (United States)
Last Update: 2024-04-06
Product Category: Teradata Vantage™

This section shows the input table, SQL call, and output tables of an example that uses TD_XGBoost for classification.

InputTable

The input is a diabetes dataset sample with three feature columns (col_1, col_2, and col_3) and a target column named response. It is a binary classification problem with two classes, 1 and 2.

ID response col_1 col_2 col_3
-- -------- ----- ----- -------
1  1        23    8.14  0.84054
1  1        20    3.97  0.66351
1  1        0     4.05  0.07022
1  1        15    10.59 0.21719
1  1        0     6.91  0.18836
1  2        20    3.97  0.66351
1  2        20    3.97  0.5405
1  2        16    1.95  2.77974
1  2        0     4.49  0.05735
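
To reproduce the example, the sample can be loaded into a table named diabetes_sample, as in the following sketch. The column types are assumptions based on the values shown and are not taken from the reference page; adjust them as needed.

CREATE TABLE diabetes_sample (
    ID       INTEGER,
    response INTEGER,
    col_1    FLOAT,
    col_2    FLOAT,
    col_3    FLOAT
);

-- Load the sample rows shown above (the remaining rows follow the same pattern).
INSERT INTO diabetes_sample VALUES (1, 1, 23, 8.14, 0.84054);
INSERT INTO diabetes_sample VALUES (1, 1, 20, 3.97, 0.66351);
INSERT INTO diabetes_sample VALUES (1, 2, 16, 1.95, 2.77974);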

SQL Call

SELECT * FROM TD_XGBoost (
    ON diabetes_sample PARTITION BY ANY
    OUT TABLE MetaInformationTable(xgb_out)
    USING
    ResponseColumn('response')
    InputColumns('[2:4]')
    MaxDepth(3)
    MinNodeSize(1)
    NumParallelTrees(2)
    ModelType('CLASSIFICATION')
    Seed(1)
    RegularizationLambda(1)
    LearningRate(0.5)
    NumBoostRounds(2)
    MinImpurity(0)
    ColumnSampling(1.0)
) AS dt;
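
The SELECT above returns the trained model (the tree rows shown under Output) to the session, while the OUT TABLE clause persists the per-round metrics in xgb_out. If the model itself is needed later, for example as input to the companion TD_XGBoostPredict function, one option is to materialize the result with CREATE TABLE ... AS. The table name xgb_model below is chosen for this sketch, and the elided syntax elements are the same as in the call above.

CREATE TABLE xgb_model AS (
    SELECT * FROM TD_XGBoost (
        ON diabetes_sample PARTITION BY ANY
        USING
        ResponseColumn('response')
        InputColumns('[2:4]')
        ...
    ) AS dt
) WITH DATA;

See the TD_XGBoostPredict reference page for the exact scoring syntax.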

Output

task_index tree_num iter class_num tree_order tree
---------- -------- ---- --------- ---------- ----
0           1        1    0         0          {"id_":1,"sum_":-2.000000,"sumSq_":1.000000,"size_":4,"maxDepth_":0,"value_":-0.500000,"nodeType_":"REGRESSION_LEAF","prediction_":-0.500000}
0           1        2    0         0          {"id_":1,"sum_":-1.510163,"sumSq_":0.570148,"size_":4,"maxDepth_":0,"value_":-0.377541,"nodeType_":"REGRESSION_LEAF","prediction_":-0.389214}
0           2        1    0         0          {"id_":1,"sum_":1.000000,"sumSq_":1.000000,"size_":4,"maxDepth_":3,"nodeType_":"REGRESSION_NODE","split_":
{"splitValue_":8.000000,"attr_":"col_1","type_":"REGRESSION_NUMERIC_SPLIT","score_":0.750000,"scoreImprove_":0.750000,"leftNodeSize_":1,"rightNod
{"id_":2,"sum_":-0.500000,"sumSq_":0.250000,"size_":1,"maxDepth_":0,"value_":-0.500000,"nodeType_":"REGRESSION_LEAF","prediction_":-0.200000},"r
{"id_":3,"sum_":1.500000,"sumSq_":0.750000,"size_":3,"maxDepth_":0,"value_":0.500000,"nodeType_":"REGRESSION_LEAF","prediction_":0.428571}}
0           2        2    0         0          {"id_":1,"sum_":0.733237,"sumSq_":0.669463,"size_":4,"maxDepth_":3,"nodeType_":"REGRESSION_NODE","split_":
{"splitValue_":8.000000,"attr_":"col_1","type_":"REGRESSION_NUMERIC_SPLIT","score_":0.535054,"scoreImprove_":0.535054,"leftNodeSize_":1,"rightNod
{"id_":2,"sum_":-0.450166,"sumSq_":0.202649,"size_":1,"maxDepth_":0,"value_":-0.450166,"nodeType_":"REGRESSION_LEAF","prediction_":-0.180425},"r
{"id_":3,"sum_":1.183403,"sumSq_":0.466814,"size_":3,"maxDepth_":0,"value_":0.394468,"nodeType_":"REGRESSION_LEAF","prediction_":0.344696}}
0          -1       -1   -1        -1          {"lossType":"LOGISTIC","numBoostedTrees":2,"iterNum":2,"avgResponses":0.000000,"classMapping":{"1":0,"2":1}}
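
The value_ and prediction_ fields in the leaf nodes are consistent with the standard XGBoost leaf-weight update for logistic loss; the reference page does not state this formula, so treat the following as an interpretation. Starting from an initial probability p = 0.5 for every row, a leaf's weight is Σ(y − p) / (Σ p(1 − p) + λ), and prediction_ is that weight multiplied by the LearningRate. For example, the right leaf of tree 2 in the first boosting round has sum_ = 1.5 over size_ = 3 rows, giving 1.5 / (3 × 0.25 + 1) ≈ 0.857143, and 0.5 × 0.857143 ≈ 0.428571, which matches the prediction_ shown. The final row (tree_num = -1) holds model-level metadata: the loss type (LOGISTIC), the number of boosting rounds, and the classMapping that maps the original labels 1 and 2 to the internal classes 0 and 1.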

Output MetaInformationTable (xgb_out)

task_index tree_num iter accuracy deviance
---------- -------- ---- -------- --------
0          1        1    1        0.378
0          1        2    1        0.291
0          2        1    1        0.408
0          2        2    1        0.338
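
Each row of the MetaInformationTable (materialized here as xgb_out) reports training metrics for one parallel tree after one boosting round: task_index and tree_num identify the tree, iter is the boosting round, and accuracy and deviance describe how well that tree fits its partition of the training data. Deviance drops from round 1 to round 2 for both trees, the expected sign that boosting is reducing training loss. Because the OUT TABLE clause persists this table, it can be queried afterward like any other table, for example:

SELECT * FROM xgb_out
ORDER BY tree_num, iter;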