Background

Teradata Aster Analytics Foundation User Guide

Product: Aster Analytics
Release Number: 6.21
Published: November 2016
Language: English (United States)
Last Update: 2018-04-14

Boosting is a technique that develops a strong classifying algorithm from a collection of weak classifying algorithms. A classifying algorithm is weak if its correct classification rate is slightly better than random guessing (which is 50% for binary classification). The intuition behind boosting is that combining a set of predictions, each of which has more than 50% probability of being correct, can produce an arbitrarily accurate predictor function.

The AdaBoost algorithm, described by J. Zhu, H. Zou, S. Rosset, and T. Hastie (2009) in https://web.stanford.edu/~hastie/Papers/samme.pdf, is iterative. It starts with a weak classifying algorithm, and each iteration gives higher weights to the data points that the previous iteration classified incorrectly, a technique called Adaptive Boosting, for which the AdaBoost algorithm is named. AdaBoost constructs a strong classifier as a linear combination of weak classifiers.

The AdaBoost_Drive function uses a single decision tree as the initial weak classifying algorithm.

Boosting can be very sensitive to noise in the data. Because weak classifiers are likely to incorrectly classify outliers, the algorithm weights outliers more heavily with each iteration, thereby increasing their influence on the final result.

The boosting process is as follows (a code sketch appears after the list):

  1. Train on a data set, using a weak classifier. (For the first iteration, all data points have equal weight.)
  2. Calculate the weighted training error.
  3. Calculate the weight of the current classifier to use in the final calculation (step 6).
  4. Update the weights for the next iteration by decreasing the weights of the correctly classified data points and increasing the weights of the incorrectly classified data points.
  5. Repeat steps 1 through 4 for each weak classifier.
  6. Calculate the strong classifier as a weighted vote of the weak classifiers, using the weights calculated in step 3.
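The following Python sketch is illustrative only; it is not the Aster AdaBoost_Drive implementation, and the function and variable names are assumptions made for this example. It shows one way to carry out the six steps for a multi-class problem, using a depth-1 scikit-learn decision tree as the weak classifier. The classifier-weight and data-point-weight formulas it uses are the multi-class versions spelled out in the mathematical description that follows.

  import numpy as np
  from sklearn.tree import DecisionTreeClassifier

  def boost(X, y, T=10):
      """Train T weak classifiers (decision stumps) and their vote weights."""
      n = len(y)
      K = len(np.unique(y))
      w = np.full(n, 1.0 / n)                    # step 1: equal weights in the first iteration
      classifiers, alphas = [], []
      for t in range(T):                         # step 5: repeat steps 1 through 4
          stump = DecisionTreeClassifier(max_depth=1)
          stump.fit(X, y, sample_weight=w)       # step 1: train a weak classifier on weighted data
          miss = stump.predict(X) != y
          err = np.sum(w * miss) / np.sum(w)     # step 2: weighted training error
          if err <= 0 or err >= 1 - 1.0 / K:     # stop if perfect, or no better than random guessing
              break
          alpha = np.log((1 - err) / err) + np.log(K - 1)   # step 3: weight of this classifier
          w = w * np.exp(alpha * miss)           # step 4: increase weights of misclassified points
          w = w / np.sum(w)                      # re-normalize the weights
          classifiers.append(stump)
          alphas.append(alpha)
      return classifiers, alphas

  def predict(classifiers, alphas, X, classes):
      """Step 6: strong classifier = weighted vote of the weak classifiers."""
      votes = np.zeros((len(X), len(classes)))
      for clf, alpha in zip(classifiers, alphas):
          pred = clf.predict(X)
          for k, c in enumerate(classes):
              votes[:, k] += alpha * (pred == c)
      return np.asarray(classes)[np.argmax(votes, axis=1)]

The stopping test err >= 1 - 1/K is the multi-class analog of the 50% threshold mentioned earlier: a weak classifier that does no better than random guessing among K classes contributes nothing useful to the weighted vote.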

Mathematically:

  1. Assume that the training set has n data points classified into K classes:

    (x_1, y_1), (x_2, y_2), ..., (x_n, y_n), where y_i is an element of {1, 2, ..., K}

  2. In the first iteration, assign the same weight to each data point:

    w_1(i) = 1/n, for i = 1, 2, ..., n

  3. For each of the T weak classifiers (t = 1, 2, ..., T):
    1. Fit classifier h_t(x) to the training data, using weights w_t.
    2. Calculate the weighted error rate of the classifier:

      err_t = Σ_i w_t(i) · I(y_i ≠ h_t(x_i)) / Σ_i w_t(i)

      where I(c) is 1 if condition c is true and 0 otherwise.
    3. Calculate the weight of the classifier h_t (a numeric example appears after these steps):

      α_t = log((1 - err_t) / err_t) + log(K - 1)

    4. Update the weights w_{t+1}, increasing the weights of the data points that h_t classified incorrectly:

      w_{t+1}(i) = w_t(i) · exp(α_t · I(y_i ≠ h_t(x_i))), for all i = 1, 2, ..., n

      and then re-normalize the weights so that they sum to 1.
  4. Calculate the strong classifier as the weighted vote of the weak classifiers:

    C(x) = argmax_k Σ_t α_t · I(h_t(x) = k)
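As a quick numeric check of the classifier-weight formula in step 3 (the numbers are illustrative, not from the guide): with K = 3 classes and a weighted error rate err_t = 0.25, α_t = log(0.75 / 0.25) + log(3 - 1) = log 3 + log 2 ≈ 1.79 (natural logarithms). A classifier with err_t = 0.5, which is still better than the random-guessing error rate of 1 - 1/K ≈ 0.67 for three classes, gets only α_t = log 1 + log 2 ≈ 0.69, so the more accurate classifier carries more weight in the final vote.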