XGBoost Functions - Aster Analytics

Teradata Aster® Analytics Foundation User Guide, Update 2

Product: Aster Analytics
Release Number: 7.00.02
Published: September 2017
Language: English (United States)
Last Update: 2018-04-17

The XGBoost_Drive function trains a classification model using gradient boosting, with decision trees as the base classifiers. The corresponding prediction function is XGBoost_Predict.

In gradient boosting, each iteration fits a model to the residuals (errors) left by the previous iteration. Gradient boosting also provides a general framework for plugging in a loss function and a regularization term.
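To make the residual-fitting loop concrete, here is a minimal Python sketch, assuming squared-error loss and scikit-learn regression trees as the base learners. The data, tree depth, round count, and shrinkage value are all illustrative choices, not Aster defaults.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

n_rounds, shrinkage = 50, 0.1
prediction = np.zeros_like(y)
trees = []
for _ in range(n_rounds):
    residuals = y - prediction                 # errors of the ensemble so far
    tree = DecisionTreeRegressor(max_depth=2)
    tree.fit(X, residuals)                     # next tree models the residuals
    prediction += shrinkage * tree.predict(X)  # shrinkage damps each correction
    trees.append(tree)
```

For squared-error loss, the negative gradient is exactly the residual, which is why the loop above fits each new tree to `y - prediction`.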

The Aster Analytics implementation of the XGBoost algorithm includes the following (illustrated in the sketch after this list):
  • Loss functions:
    • Binomial (for binary classification)
    • Softmax (for multiple-class classification)
  • L2 regularization
  • Shrinkage
  • Column subsampling
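The sketch below illustrates the loss functions and the L2 term numerically. It assumes the standard logistic and cross-entropy forms of the binomial and softmax losses and the usual XGBoost leaf-weight formula with an L2 penalty; the concrete values are invented for illustration and are not Aster defaults.

```python
import numpy as np

def binomial_loss(y, margin):
    """Negative log-likelihood of a logistic model; y is 0 or 1."""
    p = 1.0 / (1.0 + np.exp(-margin))
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def softmax_loss(y, margins):
    """Cross-entropy over K class margins; y is the true class index."""
    e = np.exp(margins - margins.max())        # shift for numerical stability
    return -np.log(e[y] / e.sum())

print(binomial_loss(1, 2.0))                      # confident, correct: small loss
print(softmax_loss(2, np.array([0.1, 0.3, 2.0])))

# With an L2 penalty lam on leaf weights, the optimal weight of a leaf
# is -G / (H + lam), where G and H are the summed gradients and Hessians
# of the examples falling in that leaf (values here are invented):
G, H, lam = -4.2, 7.5, 1.0
leaf_weight = -G / (H + lam)
```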

Row subsampling is implemented by randomly partitioning the input dataset among the available vworkers. Each vworker trains its own gradient-boosted trees in parallel on its partition, and the per-vworker results are combined into a final prediction by majority vote, as sketched below.
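This is a conceptual Python sketch of that scheme, with scikit-learn's GradientBoostingClassifier standing in for the per-vworker trees; the real implementation runs inside Aster vworkers, not scikit-learn, and the data and partition count here are invented.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Randomly partition the rows; each partition plays the role of a vworker.
n_partitions = 3
models = []
for part in np.array_split(rng.permutation(len(X)), n_partitions):
    model = GradientBoostingClassifier(n_estimators=30)
    model.fit(X[part], y[part])        # each model sees only its partition
    models.append(model)

# Combine the per-partition predictions by majority vote.
votes = np.stack([m.predict(X) for m in models])      # (n_partitions, n_rows)
final = (votes.sum(axis=0) > n_partitions / 2).astype(int)
```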

You can use the XGBoost functions to generate prediction input for the Receiver Operating Characteristic (ROC) function.
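As a rough illustration of that hand-off, the sketch below feeds observed labels and predicted class probabilities to scikit-learn's roc_curve, a stand-in for the Aster ROC function; the labels and scores are invented.

```python
import numpy as np
from sklearn.metrics import roc_curve

y_true = np.array([0, 0, 1, 1, 1, 0])                # observed labels
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.7, 0.2])  # predicted P(class = 1)
fpr, tpr, thresholds = roc_curve(y_true, y_score)
```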

For a general description of gradient boosting, see https://statweb.stanford.edu/~jhf/ftp/trebst.pdf. For more details about the XGBoost algorithm, see http://www.kdd.org/kdd2016/papers/files/rfp0697-chenAemb.pdf.