Decision trees are conceptually simple models. Suppose, for example, that you want to predict the value of a variable y from two predictor variables, x1 and x2; that is, you want to model y as a function of x1 and x2 (y = f(x1, x2)).
You can visualize x1 and x2 as forming a plane, with the value of y at each coordinate (x1, x2) rising out of the plane in the third dimension. A decision tree partitions the plane into rectangles and assigns each rectangle a constant predicted value of y, usually the average of the y values for the training observations that fall in that region. This two-dimensional picture extends to arbitrarily many dimensions, so the same idea fits models with large numbers of predictors.
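To make the partitioning concrete, here is a minimal sketch using scikit-learn's DecisionTreeRegressor on synthetic data; the data-generating rule, the depth limit, and the test point are assumptions chosen to keep the partition coarse, not part of the example above:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic data: y depends on two predictors, x1 and x2.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))      # columns are x1 and x2
y = np.where(X[:, 0] > 5, 3.0, 1.0) + 0.1 * rng.standard_normal(200)

# Restricting the depth keeps the partition coarse: each leaf corresponds
# to one rectangle in the x1-x2 plane, and its prediction is the mean y
# of the training points that fall inside that rectangle.
tree = DecisionTreeRegressor(max_depth=2).fit(X, y)
print(tree.predict([[7.0, 2.0]]))          # constant value for that rectangle
```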
In this example, the x1-x2 plane is partitioned into four regions, R1, R2, R3, and R4. The predicted value of y for any test observation that falls in R1 is the average value of y over all training observations in R1, and likewise for the other regions.
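The region-mean rule amounts to nothing more than masking the training points that fall inside a rectangle and averaging their y values. In this short sketch, the data and the boundaries chosen for R1 are hypothetical:

```python
import numpy as np

# Toy training data (illustrative); columns of X are x1 and x2.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))
y = X[:, 0] + X[:, 1] + rng.standard_normal(200)

# Hypothetical boundaries for R1: x1 <= 5 and x2 <= 4.
in_R1 = (X[:, 0] <= 5) & (X[:, 1] <= 4)

# The tree's prediction for any test point landing in R1 is simply the
# average y of the training observations that fell in R1.
prediction_R1 = y[in_R1].mean()
print(prediction_R1)
```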
The same partition can be represented as a decision tree:
Prediction starts at the root node. If a data point's x1 value is greater than 5, the algorithm follows the right branch; if x1 is less than or equal to 5, it follows the left branch. At each subsequent node it applies the same kind of test to choose a branch, until it reaches a leaf node, whose stored constant becomes the predicted value.
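The traversal itself is easy to express directly. Below is a minimal sketch of a hand-built regression tree and its prediction loop; the node layout, thresholds, and leaf values are illustrative assumptions matching the splits described above:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    """One node of a regression tree: internal nodes split, leaves predict."""
    feature: int = 0                  # index of the predictor tested (0 -> x1, 1 -> x2)
    threshold: float = 0.0            # split point for that predictor
    left: Optional["Node"] = None     # branch taken when x[feature] <= threshold
    right: Optional["Node"] = None    # branch taken when x[feature] > threshold
    value: float = 0.0                # constant prediction stored at a leaf

def predict(node: Node, x: list[float]) -> float:
    # Walk from the root to a leaf, choosing a branch at each split.
    while node.left is not None and node.right is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.value

# Illustrative tree: split on x1 at 5, then on x2 at 4 in the left subtree.
root = Node(feature=0, threshold=5.0,
            left=Node(feature=1, threshold=4.0,
                      left=Node(value=1.0), right=Node(value=2.0)),
            right=Node(value=3.0))
print(predict(root, [3.0, 6.0]))      # goes left, then right -> 2.0
```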