Difference between the Decision Tree Approaches

Teradata Aster Analytics Foundation User Guide

Product: Aster Analytics
Release Number: 6.21
Published: November 2016
Language: English (United States)
Last Update: 2018-04-14

In a random forest, each vworker operates on its own data and builds one or more decision trees. During the forest-building process, vworkers need not communicate with each other.

The forest is built from the training data set. After the forest is built, each new data point is run through every tree in the forest, and the function aggregates the per-tree predictions into a single global prediction. Because many trees are involved, it is not clear which variables are the most important at the different levels of the trees.
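The forest workflow described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the Aster implementation: the function names are hypothetical, each "vworker" is modeled as a plain list of rows, each tree is reduced to a single split (a stump), and majority voting is assumed as the aggregate, since the text does not name the aggregation rule.

```python
from collections import Counter

def build_stump(rows):
    """Fit a one-split tree on (features, label) rows.

    Hypothetical stand-in for a full decision tree: it tries every
    observed value of every feature as a threshold and keeps the split
    with the fewest misclassified rows.
    """
    best = None
    n_features = len(rows[0][0])
    for f in range(n_features):
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= threshold]
            right = [y for x, y in rows if x[f] > threshold]
            if not left or not right:
                continue
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    if best is None:
        # No usable split: always predict the majority label.
        label = Counter(y for _, y in rows).most_common(1)[0][0]
        return (0, float("inf"), label, label)
    return best[1:]

def predict_stump(stump, x):
    f, threshold, left_label, right_label = stump
    return left_label if x[f] <= threshold else right_label

def build_forest(partitions):
    """Each partition is processed on its own; no data is shared."""
    return [build_stump(rows) for rows in partitions]

def predict_forest(forest, x):
    """Run x through every tree and return the majority vote (assumed rule)."""
    votes = Counter(predict_stump(tree, x) for tree in forest)
    return votes.most_common(1)[0][0]
```

Because `build_forest` touches each partition separately, the per-partition calls could run in parallel without any exchange of data, which mirrors the point that vworkers need not communicate while the forest is built.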

The single-tree approach requires vworkers to communicate with each other during the tree-building process. This communication can be very expensive, depending on the number of variables and the number of possible splits; therefore, the single-tree algorithm uses a sampling approach to reduce the number of splits.
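To illustrate why the number of candidate splits drives the communication cost, the following Python sketch models one plausible exchange: each worker counts class labels on either side of every candidate split, and a coordinator merges those counts to choose a split. The function names, the row layout, and the misclassification criterion are assumptions made for the example and are not taken from the Aster implementation.

```python
from collections import Counter

def local_counts(rows, split_table):
    """Per-worker class counts for each (attribute, threshold) candidate."""
    counts = Counter()
    for features, label in rows:
        for attr, threshold in split_table:
            side = "left" if features[attr] <= threshold else "right"
            counts[(attr, threshold, side, label)] += 1
    return counts

def merge_counts(per_worker):
    """Coordinator step: sum the counts received from every worker."""
    total = Counter()
    for counts in per_worker:
        total.update(counts)
    return total

def best_split(total, split_table):
    """Pick the candidate with the fewest misclassified rows, assuming
    each side of the split predicts its majority class."""
    def errors(attr, threshold):
        err = 0
        for side in ("left", "right"):
            side_counts = [n for (a, t, s, _), n in total.items()
                           if (a, t, s) == (attr, threshold, side)]
            if side_counts:
                err += sum(side_counts) - max(side_counts)
        return err
    return min(split_table, key=lambda split: errors(*split))
```

In this model the size of each worker's message grows with the number of attributes times the number of candidate splits, so a shorter list of candidates directly reduces the data exchanged at every level of the tree.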

The single-tree approach implements a classification tree that handles both numeric and categorical variables.

The Single_Tree_Drive function uses the Approximate Percentile and Percentile functions to sample the split values. The resulting split table contains every candidate split for each numerical attribute considered when building the single decision tree.
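The idea of sampling split values at percentiles can be sketched as follows. This is a minimal illustration under stated assumptions: it uses the exact nearest-rank percentile rather than the algorithm behind Approximate Percentile, and the function names, the default percentiles, and the (attribute, split value) row layout are hypothetical rather than the actual split table schema.

```python
import math

def percentile_splits(values, percentiles):
    """Return candidate split values at the given percentiles.

    Uses the nearest-rank definition: the p-th percentile is the value
    at rank ceil(p * n / 100) in the sorted data. Duplicate split
    values are emitted only once.
    """
    ordered = sorted(values)
    n = len(ordered)
    splits = []
    for p in percentiles:
        rank = max(1, math.ceil(p * n / 100))
        value = ordered[rank - 1]
        if value not in splits:
            splits.append(value)
    return splits

def build_split_table(columns, percentiles=(25, 50, 75)):
    """Build (attribute, split_value) rows for every numeric attribute.

    `columns` maps an attribute name to its list of numeric values.
    """
    return [(name, split)
            for name, values in columns.items()
            for split in percentile_splits(values, percentiles)]
```

With three percentiles, an attribute that has thousands of distinct values contributes at most three candidate splits, which is the kind of reduction the sampling approach is meant to achieve.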