Optional Syntax Elements for TD_DecisionForest - Analytics Database

Database Analytic Functions

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Analytics Database
Release Number: 17.20
Published: June 2022
Language: English (United States)
Last Update: 2024-10-04
Product Category: Teradata Vantage™

The following elements are optional when using TD_DecisionForest:

MaxDepth
Specify the maximum depth of a tree. The algorithm stops splitting a node beyond this depth. Decision trees can grow to 2^(max_depth+1) - 1 nodes. You must specify a non-negative integer value.
Default value: 5.
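The node bound quoted for MaxDepth, 2^(max_depth+1) - 1, can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper name max_tree_nodes is hypothetical and not part of TD_DecisionForest, and it assumes a binary tree whose root sits at depth 0.

```python
def max_tree_nodes(max_depth: int) -> int:
    """Upper bound on the node count of a binary tree with its root at depth 0."""
    if max_depth < 0:
        raise ValueError("MaxDepth must be a non-negative integer")
    return 2 ** (max_depth + 1) - 1


if __name__ == "__main__":
    for depth in (0, 1, 5, 10):
        print(depth, max_tree_nodes(depth))
```

With the default MaxDepth of 5, the bound is 63 nodes per tree; at depth 10 it is 2047, so each extra level roughly doubles the potential tree size.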
MinNodeSize
Specify the minimum number of observations in a tree node. The algorithm stops splitting a node if the number of observations in the node is equal to or smaller than this value. You must specify a non-negative integer value.
Default value: 1.
NumTrees
Specify the number of trees for the forest model. You must specify a value greater than or equal to the number of data AMPs. By default, the function builds the minimum number of trees that provides the coverage level specified in the CoverageFactor argument for the input dataset.
The maximum number of supported trees is 65536.
Default value: -1.
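The constraints listed for NumTrees can be expressed as a small pre-flight check before submitting a query. This is a sketch under stated assumptions: check_num_trees is a hypothetical helper, the AMP count is supplied by the caller, and -1 is treated as "let the function choose", as the default implies.

```python
MAX_TREES = 65536  # maximum number of supported trees stated on this page


def check_num_trees(num_trees: int, num_amps: int) -> bool:
    """Return True if num_trees satisfies the rules listed for NumTrees."""
    if num_trees == -1:
        # Default: the function derives the tree count from CoverageFactor.
        return True
    # Otherwise the value must be at least the number of data AMPs
    # and no larger than the supported maximum.
    return num_amps <= num_trees <= MAX_TREES


if __name__ == "__main__":
    print(check_num_trees(-1, 24))     # default is always acceptable
    print(check_num_trees(10, 24))     # fewer trees than AMPs
    print(check_num_trees(100, 24))    # within both limits
```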
ModelType
Specify whether the analysis is a regression (continuous response variable) or a multiple-class classification (predicting one of a number of classes).
Allowed values: Regression or Classification.
When using classification, the TD_DecisionForest function builds a tree model on an AMP only when rows from more than one class are distributed to that AMP. If an AMP holds only one class, no tree is built on it.
A maximum of 500 classes is supported for classification.
Default value: Regression.
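The per-AMP rule for classification can be illustrated by checking how class labels are spread across AMPs. The sketch below is not how the database decides internally; amps_that_build_trees and the sample dictionary are hypothetical, and the mapping from AMP to labels is assumed to be known to the caller.

```python
def amps_that_build_trees(labels_by_amp: dict) -> list:
    """Return the AMP ids that hold more than one distinct class label."""
    return sorted(
        amp for amp, labels in labels_by_amp.items() if len(set(labels)) > 1
    )


if __name__ == "__main__":
    # Hypothetical distribution of response values over three AMPs.
    sample = {
        0: ["yes", "no", "yes"],
        1: ["yes", "yes"],       # single class: no tree under the stated rule
        2: ["no", "maybe"],
    }
    print(amps_that_build_trees(sample))  # [0, 2]
```

A strongly skewed response column can therefore leave some AMPs without a tree, which is worth checking when a model contains fewer trees than expected.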
TreeSize
Specify the number of rows that each tree uses as its input dataset. The function builds a tree using either the number of rows on an AMP or the number of rows that fit into the AMP’s memory (whichever is less), or the number of rows given by the TreeSize argument. By default, this value is the smaller of the number of rows on an AMP and the number of rows that fit into the AMP’s memory.
Default value: -1.
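The TreeSize rule can be read as a simple selection between two cases. The sketch below assumes that an explicit TreeSize is used as given; this page does not say whether an explicit value is capped by the AMP's row count or memory, so that behavior is an assumption. The function and parameter names are hypothetical.

```python
def rows_per_tree(rows_on_amp: int, rows_fitting_memory: int, tree_size: int = -1) -> int:
    """Number of rows a tree would use under the TreeSize description."""
    if tree_size == -1:
        # Default: the smaller of the rows on the AMP and the rows that fit in memory.
        return min(rows_on_amp, rows_fitting_memory)
    # Assumption: an explicit TreeSize is taken as-is (no capping modeled here).
    return tree_size


if __name__ == "__main__":
    print(rows_per_tree(1000, 800))        # 800
    print(rows_per_tree(1000, 800, 500))   # 500
```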
CoverageFactor
Specify the level of coverage for the dataset in the forest. Specify the value as a decimal fraction of the dataset, where 1.0 corresponds to 100% coverage.
Default value: 1.0 (100%).
Seed
Specify the random seed the algorithm uses for repeatable results.
Default value: 1.
Mtry
Specify the number of features from the input columns to consider when evaluating the best split of a node. A higher value improves the splitting and performance of a tree. A smaller value improves the robustness of the forest and helps prevent overfitting. When the value is -1, all variables are used for each split.
Default value: -1.
MtrySeed
Specify the random seed that the algorithm uses for the Mtry argument.
Default value: 1.
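The interaction between Mtry and MtrySeed can be pictured as seeded random sampling of candidate columns. The sketch below uses Python's standard random module purely for illustration; it is not the generator the database uses, and features_for_split and the column names are hypothetical.

```python
import random


def features_for_split(columns, mtry: int = -1, mtry_seed: int = 1) -> list:
    """Pick the candidate features for one split under the Mtry description."""
    columns = list(columns)
    if mtry == -1:
        # Default: every input column is a candidate for each split.
        return columns
    # A fixed seed makes the selection repeatable. A real forest would keep one
    # generator alive across splits rather than reseeding for every node.
    rng = random.Random(mtry_seed)
    return rng.sample(columns, mtry)


if __name__ == "__main__":
    cols = ["age", "income", "tenure", "region", "score"]
    print(features_for_split(cols))                       # all five columns
    print(features_for_split(cols, mtry=2, mtry_seed=7))  # two columns, repeatable
```

Calling the helper twice with the same seed returns the same subset, which is the repeatability that a seed argument is meant to provide.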
MinImpurity
Specify the minimum impurity of a tree node. The algorithm stops splitting a node if the node's impurity is equal to or smaller than the specified value.
Default value: 0.0.
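MaxDepth, MinNodeSize, and MinImpurity each describe a condition under which a node is no longer split. The sketch below combines the three as they are worded on this page, using the documented defaults. It is an illustration rather than the function's implementation: stops_splitting is a hypothetical name, and depth is assumed to be counted from 0 at the root, consistent with the node bound given for MaxDepth.

```python
def stops_splitting(
    depth: int,
    n_obs: int,
    impurity: float,
    max_depth: int = 5,
    min_node_size: int = 1,
    min_impurity: float = 0.0,
) -> bool:
    """Return True if any documented stopping condition applies to the node."""
    return (
        depth >= max_depth            # MaxDepth: no splits beyond this depth
        or n_obs <= min_node_size     # MinNodeSize: too few observations
        or impurity <= min_impurity   # MinImpurity: node is already pure enough
    )


if __name__ == "__main__":
    print(stops_splitting(depth=2, n_obs=100, impurity=0.4))  # False
    print(stops_splitting(depth=5, n_obs=100, impurity=0.4))  # True
    print(stops_splitting(depth=2, n_obs=1, impurity=0.4))    # True
    print(stops_splitting(depth=2, n_obs=100, impurity=0.0))  # True
```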