The decision forest functions create a predictive model based on the algorithm for decision-tree training and prediction described in Classification and Regression Trees by Breiman, Friedman, Olshen, and Stone (1984).
Original Random Forests Algorithm
- If the number of cases in the training set is N, sample N cases at random, with replacement, from the original data. This sample becomes the training set for growing the tree.
- If there are M input variables, a number m << M is specified such that at each node, m variables are selected at random from the M, and the best split on those m variables is used to split the node. The value of m is held constant while the forest is grown.
- Each tree is grown to the largest extent possible. There is no pruning.
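The two sampling steps above can be illustrated with a minimal Python sketch. It is not ML Engine code: the function names, the toy data, and the values M = 8 and m = 3 are assumptions chosen only for illustration, and the tree-growing step itself is omitted.

```python
import random


def bootstrap_sample(cases, rng):
    """Draw N cases at random, with replacement, from N original cases."""
    n = len(cases)
    return [cases[rng.randrange(n)] for _ in range(n)]


def candidate_variables(num_variables, m, rng):
    """Pick m distinct variable indexes out of M to consider at one node."""
    return rng.sample(range(num_variables), m)


rng = random.Random(0)  # fixed seed so repeated runs draw the same sample

# Hypothetical toy training set: (case_id, label) pairs.
training_set = [(i, i % 2) for i in range(10)]

# Step 1: the bootstrap sample becomes the training set for one tree.
tree_training_set = bootstrap_sample(training_set, rng)

# Step 2: at each node, only m of the M variables compete for the best split.
node_variables = candidate_variables(num_variables=8, m=3, rng=rng)
```

Because sampling is with replacement, some original cases usually appear more than once in `tree_training_set` and others not at all.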
Random Forests® and RandomForests® are registered trademarks in the United States, owned by Minitab, Inc.
ML Engine Implementation
- The DecisionForest function lets you specify m using the optional syntax element Mtry. If you do not specify Mtry, the function uses all variables to train each decision tree, which is equivalent to bootstrap aggregating (bagging).
- The DecisionForest function randomly assigns rows to individual vworkers. Each vworker creates trees with a bootstrapping technique, using only its local data.
- Each tree grows until any stopping criterion is met.
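The row assignment and local bootstrapping described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the ML Engine implementation: the function names, the worker count, and the toy rows are hypothetical, and the sketch does not model tree growth or stopping criteria.

```python
import random


def assign_rows_to_workers(rows, num_workers, rng):
    """Randomly assign each input row to exactly one worker."""
    partitions = [[] for _ in range(num_workers)]
    for row in rows:
        partitions[rng.randrange(num_workers)].append(row)
    return partitions


def local_bootstrap(local_rows, rng):
    """Bootstrap a tree's training set from one worker's local rows only."""
    n = len(local_rows)
    return [local_rows[rng.randrange(n)] for _ in range(n)]


def variables_per_split(num_variables, mtry=None):
    """Return m: Mtry if given, otherwise all variables (bagging)."""
    return num_variables if mtry is None else mtry


rng = random.Random(1)
rows = list(range(20))  # hypothetical row identifiers

partitions = assign_rows_to_workers(rows, num_workers=4, rng=rng)
samples = [local_bootstrap(part, rng) for part in partitions]
```

One consequence of this design is that no single tree sees the whole input: each tree is trained on a resample of the subset of rows held by its worker.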
ML Engine Decision Forest functions support regression, binary classification, and multiple-class classification problems.
For more detailed information about ML Engine implementation of functionality like that of the Random Forests algorithm, including detailed examples, see Bagging and Random Forest in Teradata® Aster Analytics, TDN0009013.
Function | Description |
---|---|
DecisionForest (ML Engine) | Builds predictive model based on training data. |
DecisionForestPredict_MLE (ML Engine) | Uses model output by DecisionForest function to analyze input data and make predictions. |
DecisionForestEvaluator (ML Engine) | Analyzes model output by DecisionForest function and gives weights to variables used in model. Weights help you understand basis by which DecisionForestPredict_MLE function makes predictions. |
You can use the DecisionForest and DecisionForestPredict_MLE functions to create the prediction input for the Receiver Operating Characteristic (ROC) (ML Engine) function.