Background - Aster Analytics

Teradata Aster® Analytics Foundation User Guide, Update 2

Product: Aster Analytics
Release Number: 7.00.02
Published: September 2017
Language: English (United States)
Last Update: 2018-04-17

The AdaBoost algorithm (described by J. Zhu, H. Zou, S. Rosset, and T. Hastie (2009) in https://web.stanford.edu/~hastie/Papers/samme.pdf) is iterative. It starts with a weak classifying algorithm, and each iteration gives higher weights to the data points that the previous iteration classified incorrectly. This technique, called Adaptive Boosting, gives the AdaBoost algorithm its name. AdaBoost constructs a strong classifier as a linear combination of weak classifiers.

The AdaBoost_Drive function uses a single decision tree as the initial weak classifying algorithm.

The boosting process is:

  1. Train on a data set, using a weak classifier. (For the first iteration, all data points have equal weight.)
  2. Calculate the weighted training error.
  3. Calculate the weight of the current classifier to use in the final calculation (step 6).
  4. Update the weights for the next iteration by decreasing the weights of the correctly classified data points and increasing the weights of the incorrectly classified data points.
  5. Repeat steps 1 through 4 for each weak classifier.
  6. Calculate the strong classifier as a weighted vote of the weak classifiers, using the weights calculated in step 3.
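The following is a minimal Python sketch of this loop, following the multi-class (SAMME) formulas given under "Mathematically" below and using small scikit-learn decision trees as the weak classifiers. The names samme_fit and samme_predict and the parameter T are illustrative only; they are not the AdaBoost_Drive interface.

  import numpy as np
  from sklearn.tree import DecisionTreeClassifier

  def samme_fit(X, y, T=10):
      """Steps 1-5: train T weak classifiers and their voting weights.
      Assumes class labels are the integers 0, 1, ..., K-1."""
      n = len(y)
      K = len(np.unique(y))                    # number of classes
      w = np.full(n, 1.0 / n)                  # first iteration: all points weighted equally
      classifiers, alphas = [], []
      for _ in range(T):
          h = DecisionTreeClassifier(max_depth=1)          # a single small decision tree
          h.fit(X, y, sample_weight=w)
          miss = (h.predict(X) != y)                       # incorrectly classified points
          err = (w @ miss) / w.sum()                       # step 2: weighted training error
          err = min(max(err, 1e-10), 1 - 1e-10)            # guard against log(0)
          alpha = np.log((1 - err) / err) + np.log(K - 1)  # step 3: classifier weight
          w = w * np.exp(alpha * miss)                     # step 4: raise weights of misses...
          w = w / w.sum()                                  # ...and renormalize
          classifiers.append(h)
          alphas.append(alpha)
      return classifiers, alphas

  def samme_predict(X, classifiers, alphas, K):
      """Step 6: the strong classifier is a weighted vote of the weak classifiers."""
      votes = np.zeros((len(X), K))
      for h, alpha in zip(classifiers, alphas):
          votes[np.arange(len(X)), h.predict(X)] += alpha
      return votes.argmax(axis=1)

Because the weights concentrate on the hard examples, each later tree focuses on the points that the earlier trees classified incorrectly; increasing T simply adds more weak classifiers to the final vote.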

Mathematically:

  1. Assume that the training set has n data points classified into K classes:

    (x_1, y_1), (x_2, y_2), ..., (x_n, y_n), where y_i ∈ {1, 2, ..., K}

  2. In the first iteration, assign the same weight to each data point:

    w_1(i) = 1/n for all i = 1, 2, ..., n

  3. For each of the T weak classifiers:
    1. Fit classifier h_t(x) to the training data, using weights w_t.
    2. Calculate the weighted training error of the classifier, where I(·) equals 1 if its argument is true and 0 otherwise:

      err_t = \frac{\sum_{i=1}^{n} w_t(i) \, I(y_i \neq h_t(x_i))}{\sum_{i=1}^{n} w_t(i)}
    3. Calculate the weight of the classifier h_t:

      α_t = log((1 - err_t) / err_t) + log(K - 1)

    4. Update the weights w_{t+1} for the next iteration, increasing the weights of the data points that h_t classified incorrectly:

      w_{t+1}(i) = \frac{w_t(i) \, \exp\left(\alpha_t \, I(y_i \neq h_t(x_i))\right)}{Z_t} \quad \text{for all } i = 1, 2, \ldots, n

      where Z_t is a normalizing constant chosen so that the updated weights sum to 1.

  4. Calculate the strong classifier as a weighted vote of the weak classifiers:

      H(x) = \arg\max_k \sum_{t=1}^{T} \alpha_t \, I(h_t(x) = k)
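As a worked example with illustrative numbers (not taken from this guide): with K = 3 classes and a weighted training error of err_t = 0.25, the classifier weight is

    α_t = log(0.75 / 0.25) + log(3 - 1) = log 3 + log 2 ≈ 1.099 + 0.693 = 1.792

The log(K - 1) term keeps α_t positive whenever err_t < (K - 1)/K, the multi-class analogue of requiring each weak classifier to be better than random guessing; for K = 2 this reduces to the familiar two-class AdaBoost condition err_t < 1/2.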