DecisionForest (ML Engine) - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 8.10, 1.1
Published: October 2019
Language: English (United States)
Last Update: 2019-12-31

The DecisionForest function uses a training data set to create a predictive model. You can input the model to the DecisionForestPredict_MLE (ML Engine) function, which uses it to make predictions.

Each individual decision tree output by the DecisionForest function must be smaller than 32 MB. The size of a decision tree depends on the depth of the tree, the number of categorical inputs, and the number of numerical inputs. If a decision tree exceeds 32 MB, the function issues an error message, so limit these factors when you prepare the input data.

ML Engine provides a tree_size_estimator function that you can use to estimate maximum values for the syntax elements TreeSize and NumTrees, based on the cluster configuration and the number of predictor variables. This is the syntax of the tree_size_estimator function:

SELECT * FROM tree_size_estimator@coprocessor
  (ON inputtable NumericInputs ('predictor' [,…]));

The query results include a row_count column. The average value of this column is the recommended maximum value for the syntax element NumTrees.
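
As a rough illustration of the averaging step, the estimator call can be wrapped in an aggregate so that the query returns the average of row_count directly. This is only a sketch under stated assumptions: the table housing_train, its predictor columns, the output column name recommended_max_numtrees, and the alias dt are hypothetical and do not come from this reference, and the syntax above does not confirm whether an alias is required or accepted.

```sql
-- Hypothetical example: housing_train and its columns are assumed names.
-- The alias (AS dt) is an assumption; the syntax shown above has no alias.
SELECT AVG(row_count) AS recommended_max_numtrees
FROM tree_size_estimator@coprocessor
  (ON housing_train NumericInputs ('lotsize', 'bedrooms', 'bathrms')) AS dt;
```

If the aggregate form is not accepted in your environment, run the SELECT * form shown in the syntax and average the row_count column from its results instead.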