Teradata Warehouse Miner provides decision trees for both classification and regression models. They are built largely on the techniques described in [Breiman, Friedman, Olshen and Stone] and [Quinlan]. Accordingly, splits can be made using the Gini diversity index, regression, or information gain ratio. Pruning is also provided, using either the Gini diversity index or the gain ratio technique. When a model is built, a summary report is produced along with a graphical tree browser that displays the model either as a tree or as a set of rules. Finally, a scoring function is provided to score a decision tree model, evaluate it, or both. The scoring function can also be used simply to generate the scoring SQL for later use.
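To make the two classification split criteria concrete, the following is a minimal sketch of the textbook definitions of the Gini diversity index and the information gain ratio, written in Python. It is not Teradata Warehouse Miner code, and the function names and the toy data are assumptions chosen only for illustration; the product's own calculations may differ in detail.

```python
from collections import Counter
from math import log2

def gini(labels):
    """Gini diversity index of a node: 1 - sum(p_i^2) over class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy (in bits) of the class distribution at a node."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini_reduction(parent, children):
    """Decrease in Gini impurity obtained by splitting parent into children."""
    n = len(parent)
    return gini(parent) - sum(len(ch) / n * gini(ch) for ch in children)

def gain_ratio(parent, children):
    """Information gain divided by the split information of the partition."""
    n = len(parent)
    gain = entropy(parent) - sum(len(ch) / n * entropy(ch) for ch in children)
    split_info = -sum((len(ch) / n) * log2(len(ch) / n) for ch in children if ch)
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical toy node: 10 rows, split into two children of 5 rows each.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"] * 1
right = ["yes"] * 1 + ["no"] * 4

print(round(gini(parent), 4))                         # 0.5
print(round(gini_reduction(parent, [left, right]), 4))  # 0.18
print(round(gain_ratio(parent, [left, right]), 4))      # 0.2781
```

In both cases a larger value indicates a split that separates the classes more cleanly. The gain ratio additionally divides by the split information, which penalizes splits that fragment the data into many small branches.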
A number of additional options are provided when building or scoring a decision tree model. One is whether to bin numeric variables during the tree-building process. Another is to include, at each leaf node, confidence measures recalculated from a validation table, supplementing the confidence measures based on the training data used to build the tree. Finally, at scoring time, a table profiling the leaf nodes in the tree can be requested, while each scored row is linked with its leaf node and corresponding rule set.
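The idea of supplementing training-based leaf confidence with a validation-based figure can be sketched as follows. This is an illustrative Python sketch only: it assumes that "confidence" means the fraction of rows in a leaf that carry the leaf's predicted class, and the tree, the column names (`income`, `age`, `label`) and the data are all hypothetical. It does not reproduce the product's actual computation or its scoring SQL.

```python
from collections import Counter, defaultdict

def leaf_id(row):
    """Hypothetical two-split tree; returns the leaf a row falls into."""
    if row["income"] < 50:
        return 1
    return 2 if row["age"] < 40 else 3

def labels_by_leaf(rows, target="label"):
    """Link each row to its leaf and collect the target values per leaf."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[leaf_id(row)].append(row[target])
    return grouped

def leaf_confidences(train_rows, valid_rows):
    """Per-leaf predicted class with training and validation confidence."""
    train = labels_by_leaf(train_rows)
    valid = labels_by_leaf(valid_rows)
    profile = {}
    for leaf, labels in train.items():
        # The leaf predicts the majority class seen in the training data.
        predicted, hits = Counter(labels).most_common(1)[0]
        v = valid.get(leaf, [])
        profile[leaf] = {
            "predicted": predicted,
            "train_confidence": hits / len(labels),
            # None when no validation row reaches this leaf.
            "valid_confidence": v.count(predicted) / len(v) if v else None,
        }
    return profile

train_rows = [
    {"income": 30, "age": 25, "label": "no"},
    {"income": 40, "age": 52, "label": "no"},
    {"income": 45, "age": 33, "label": "yes"},
    {"income": 20, "age": 61, "label": "no"},
    {"income": 70, "age": 30, "label": "yes"},
    {"income": 90, "age": 35, "label": "yes"},
    {"income": 80, "age": 55, "label": "no"},
]
valid_rows = [
    {"income": 35, "age": 40, "label": "no"},
    {"income": 25, "age": 29, "label": "yes"},
    {"income": 75, "age": 31, "label": "yes"},
]

profile = leaf_confidences(train_rows, valid_rows)
for leaf in sorted(profile):
    print(leaf, profile[leaf])
```

Here leaf 1 has a training confidence of 0.75 but a validation confidence of 0.5, which illustrates why a figure recalculated on held-out data can be a useful check on the training-based one. Leaf 3 receives no validation rows in this toy data, so its validation confidence is left undefined.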