DecisionForest - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 8.00, 1.0
Published: May 2019
Language: English (United States)
Last Update: 2019-11-22

The DecisionForest function uses a training data set to create a predictive model. You can input the model to the Forest_Predict function, which uses it to make predictions.

The size of each individual decision tree output by the DecisionForest function must be less than 32 MB. The factors that affect the size of a decision tree are the depth of the tree, the number of categorical inputs, the number of numerical inputs, and the number of surrogates. If the size of a decision tree exceeds 32 MB, the function issues an error message. Therefore, control the factors in the input data that increase the size of decision trees.

The ML Engine provides a tree_size_estimator function that you can use to estimate maximum values for the arguments TreeSize and NumTrees, based on the cluster configuration and the number of predictor variables. This is the syntax of the tree_size_estimator function:

SELECT * FROM tree_size_estimator
  (ON inputtable NumericInputs ('predictor' [,…]));

The query result includes a row_count column. The average value of this column is the recommended maximum value for the NumTrees argument.
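
As a rough illustration of the syntax above, the following sketch fills in the placeholders with hypothetical names: housing_train as the input table, and lotsize, bedrooms, and bathrms as its numeric predictor columns. None of these names come from this reference. The second statement shows one possible way to compute the average of row_count directly. Applying AVG to the function output in this form, without a table alias, is an assumption that may need adjusting on your system.

```sql
-- Hypothetical table and column names, used for illustration only.
-- Step 1: run the estimator and inspect its output, including row_count.
SELECT * FROM tree_size_estimator
  (ON housing_train NumericInputs ('lotsize', 'bedrooms', 'bathrms'));

-- Step 2 (assumed form): average row_count to get the recommended
-- maximum value for the NumTrees argument.
SELECT AVG(row_count) AS recommended_max_numtrees
FROM tree_size_estimator
  (ON housing_train NumericInputs ('lotsize', 'bedrooms', 'bathrms'));
```

If the aggregate form is not accepted, the same value can be obtained by averaging the row_count values returned by the first statement.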