Difference between the Decision Tree Approaches - Aster Analytics

Teradata Aster Analytics Foundation User Guide

Product: Aster Analytics
Release Number: 6.21
Published: November 2016
Language: English (United States)
Last Update: 2018-04-14
dita:id: B700-1021
Product Category: Software

In a random forest, each vworker operates on its own data and builds one or more decision trees. During the forest-building process, vworkers need not communicate with each other.
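The independence of the workers can be illustrated with a minimal Python sketch. It is not the Aster implementation: the helper names (`train_stump`, `build_forest`), the one-split "stump" used in place of a full decision tree, and the sample data are all assumptions made for illustration. Each partition stands in for the data held by one vworker.

```python
from collections import Counter

def majority(labels):
    """Return (most common label, its count)."""
    return Counter(labels).most_common(1)[0]

def train_stump(rows):
    """Fit a one-split tree on (feature_value, label) pairs.

    Returns (threshold, left_label, right_label). A stump is a
    deliberately tiny stand-in for a full decision tree.
    """
    best = None
    for threshold in sorted({x for x, _ in rows}):
        left = [y for x, y in rows if x <= threshold]
        right = [y for x, y in rows if x > threshold]
        if not left or not right:
            continue
        left_label, left_hits = majority(left)
        right_label, right_hits = majority(right)
        # Rows misclassified if each side predicts its majority label.
        errors = (len(left) - left_hits) + (len(right) - right_hits)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    if best is None:
        # Every feature value is identical: predict one label everywhere.
        label, _ = majority([y for _, y in rows])
        return (float("inf"), label, label)
    return best[1:]

def build_forest(partitions):
    # Each partition is processed in isolation, mirroring one vworker:
    # no state is shared between the calls to train_stump.
    return [train_stump(part) for part in partitions]

# Hypothetical data: two partitions, one per simulated vworker.
partitions = [
    [(1.0, "no"), (2.0, "no"), (8.0, "yes"), (9.0, "yes")],
    [(1.5, "no"), (3.0, "no"), (7.0, "yes"), (9.5, "yes")],
]
forest = build_forest(partitions)
```

Because no call depends on another partition's rows, the trees could be built in any order or in parallel, which is the property the paragraph above describes.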

A forest is built from the training data set. After the forest is built, each new data point is scored against every tree in the forest, and the function aggregates the per-tree predictions into a single global prediction. Because many trees are involved, it is difficult to tell which variables matter most at each level of the trees.
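The aggregation step can be sketched in Python as follows. The text above does not say which aggregate the function computes, so this sketch assumes a majority vote, a common choice for classification forests. The tree representation `(threshold, left_label, right_label)` and the names `predict_tree` and `predict_forest` are hypothetical.

```python
from collections import Counter

def predict_tree(tree, x):
    """Score one data point against one (threshold, left, right) tree."""
    threshold, left_label, right_label = tree
    return left_label if x <= threshold else right_label

def predict_forest(forest, x):
    """Score x against every tree, then take the majority vote.

    Majority voting is an assumption; a regression forest would more
    typically average the per-tree values instead.
    """
    votes = Counter(predict_tree(tree, x) for tree in forest)
    return votes.most_common(1)[0][0]

# Hypothetical forest of three one-split trees.
forest = [(2.0, "no", "yes"), (3.0, "no", "yes"), (6.0, "no", "yes")]
prediction = predict_forest(forest, 2.5)
```

For the point 2.5, the first tree votes "yes" and the other two vote "no", so the global prediction is "no". The vote alone does not reveal which tree or which variable drove the outcome, which reflects the interpretability limit noted above.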

The single-tree approach requires vworkers to communicate with each other during the tree-building process. This communication can be very expensive, depending on the number of variables and the number of possible splits; therefore, the single-tree algorithm uses a sampling approach to reduce the number of candidate splits that must be evaluated.
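To see why the cost grows with the number of splits, consider one plausible form this communication could take. The source does not describe the actual protocol, so everything in this Python sketch is an assumption: each worker summarizes its local rows as label counts per candidate split, and a coordinator merges the summaries to choose the split. The names `local_split_counts`, `merge_counts`, and `best_split` are hypothetical.

```python
from collections import Counter

def local_split_counts(rows, candidate_splits):
    """Per-worker summary: (threshold, side, label) -> row count."""
    counts = Counter()
    for x, label in rows:
        for t in candidate_splits:
            side = "left" if x <= t else "right"
            counts[(t, side, label)] += 1
    return counts

def merge_counts(per_worker_counts):
    """Coordinator step: add up the summaries sent by all workers."""
    total = Counter()
    for counts in per_worker_counts:
        total.update(counts)
    return total

def best_split(total, candidate_splits):
    """Pick the threshold with the fewest misclassified rows."""
    def misclassified(t):
        errors = 0
        for side in ("left", "right"):
            side_counts = [n for (tt, s, _), n in total.items()
                           if tt == t and s == side]
            if side_counts:
                # Rows on this side that disagree with its majority label.
                errors += sum(side_counts) - max(side_counts)
        return errors
    return min(candidate_splits, key=misclassified)

# Hypothetical data held by two workers.
worker_rows = [
    [(1.0, "no"), (8.0, "yes")],
    [(2.0, "no"), (9.0, "yes")],
]
candidate_splits = [1.0, 5.0, 8.0]
messages = [local_split_counts(rows, candidate_splits) for rows in worker_rows]
total = merge_counts(messages)
chosen = best_split(total, candidate_splits)
```

In this scheme each message holds at most (number of candidate splits) x 2 sides x (number of labels) counters per variable, so trimming the candidate list shrinks every message exchanged at every tree node.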

The single-tree approach implements a classification tree that handles both numeric and categorical variables.

The Single_Tree_Drive function uses the Approximate Percentile and Percentile functions to sample the split values. The split table contains every candidate split for every numeric attribute considered when building the single decision tree.
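The idea of sampling split values by percentile can be sketched in Python. This is an illustration only, not the Aster functions themselves: `percentile_splits` is a hypothetical helper that computes exact nearest-rank percentiles, whereas an approximate percentile function would trade some accuracy for speed on large data. The attribute name `age`, its values, and the dictionary layout of the split table are invented for the example.

```python
import math

def percentile_splits(values, percentiles):
    """Return the nearest-rank percentiles of values, deduplicated and sorted.

    Each returned value is a candidate split threshold for the attribute.
    """
    ordered = sorted(values)
    splits = set()
    for p in percentiles:
        # Nearest-rank method: the smallest value with at least p percent
        # of the data at or below it.
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        splits.add(ordered[rank - 1])
    return sorted(splits)

# Hypothetical numeric attribute with 12 distinct values.
ages = [18, 21, 22, 25, 29, 31, 35, 40, 47, 52, 58, 63]

# Hypothetical split table: attribute name -> candidate split values.
split_table = {"age": percentile_splits(ages, [25, 50, 75])}
```

Here the 12 distinct values are reduced to three candidate thresholds (22, 31, and 47), so the tree builder evaluates three splits for this attribute instead of one per distinct value.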