In gradient boosting, each iteration fits a model to the residuals (errors) of the previous iteration. Gradient boosting also provides a general framework for adding a loss function and a regularization term; the supported options are listed below (a sketch follows the list).
- Loss functions:
  - Binomial (for binary classification)
  - Softmax (for multiclass classification)
- L2 regularization
- Column subsampling
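The same options map onto the open-source xgboost Python package. The following is a minimal sketch using that package's parameter names (`objective`, `reg_lambda`, `colsample_bytree`), which are assumptions drawn from that package, not necessarily the names used by the function this document describes:

```python
# Illustrative sketch using the open-source xgboost package; parameter
# names belong to that package, not to the function documented here.
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # synthetic binary labels

model = xgb.XGBClassifier(
    objective="binary:logistic",  # binomial loss for binary classification
    # objective="multi:softmax" selects the softmax loss for multiclass
    reg_lambda=1.0,               # L2 regularization on leaf weights
    colsample_bytree=0.8,         # column subsampling: 80% of columns per tree
    n_estimators=50,
)
model.fit(X, y)
print(model.predict_proba(X[:5]))
```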
Row subsampling is implemented by randomly partitioning the input data set among the available vworkers. Distributing the data across vworkers trains multiple gradient boosting models in parallel, each on a subset of the data; their results are combined into a final prediction by majority vote.
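A minimal single-machine sketch of this partition-and-vote scheme, with scikit-learn's `GradientBoostingClassifier` standing in for the per-vworker training (the partitioning and voting logic here is illustrative, not the actual distributed implementation):

```python
# Emulate partition-per-worker training and majority-vote combination.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 5))
y = (X[:, 0] - X[:, 2] > 0).astype(int)

n_workers = 3  # stands in for the vworkers that each hold one partition
partitions = np.array_split(rng.permutation(len(X)), n_workers)

# Each "worker" trains an independent gradient boosting model on its partition.
models = [
    GradientBoostingClassifier(n_estimators=30).fit(X[idx], y[idx])
    for idx in partitions
]

# Combine per-worker predictions into a final prediction by majority vote.
votes = np.stack([m.predict(X) for m in models])  # shape: (n_workers, n_rows)
majority = (votes.sum(axis=0) > n_workers / 2).astype(int)
print("training accuracy:", (majority == y).mean())
```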
You can use the XGBoost functions to create prediction input for the Receiver Operating Characteristic (ROC) function.
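For illustration, a hedged sketch of this pattern using scikit-learn's `roc_curve` in place of the ROC function this document refers to:

```python
# Illustrative only: turning prediction scores into ROC input, with
# scikit-learn's roc_curve standing in for the documented ROC function.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_curve, auc

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

model = GradientBoostingClassifier(n_estimators=30).fit(X, y)
scores = model.predict_proba(X)[:, 1]  # predicted P(y = 1)

fpr, tpr, _ = roc_curve(y, scores)     # ROC input: observed labels + scores
print("AUC:", auc(fpr, tpr))
```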
For a general description of gradient boosting, see https://statweb.stanford.edu/~jhf/ftp/trebst.pdf. For more details about the XGBoost algorithm, see http://www.kdd.org/kdd2016/papers/files/rfp0697-chenAemb.pdf.