The following elements are optional when using TD_DecisionForest:
- MaxDepth
- Specify the maximum depth of a tree. The algorithm stops splitting a node beyond this depth. Decision trees can grow to 2^(max_depth+1) - 1 nodes. You must specify a non-negative integer value.
- MinNodeSize
- Specify the minimum number of observations in a tree node. The algorithm stops splitting a node if the number of observations in the node is equal to or smaller than this value. You must specify a non-negative integer value.
- NumTrees
- Specify the number of trees for the forest model. You must specify a value greater than or equal to the number of data AMPs. By default, the function builds the minimum number of trees that provides the coverage level specified in the CoverageFactor argument for the input dataset. The maximum number of supported trees is 65536.
- ModelType
- Specify whether the analysis is a regression (continuous response variable) or a multiple-class classification (predicting the result from a number of classes).
- TreeSize
- Specify the number of rows that each tree uses as its input dataset. The function builds a tree using either the smaller of the number of rows on an AMP and the number of rows that fit into the AMP’s memory, or the number of rows given by the TreeSize argument. By default, this value is the smaller of the number of rows on an AMP and the number of rows that fit into the AMP’s memory.
- CoverageFactor
- Specify the level of coverage for the dataset in the forest. The value is specified as a percentage. The default coverage value is 1.0 (100%).
- Seed
- Specify the random seed the algorithm uses for repeatable results.
- Mtry
- Specify the number of features from input columns for evaluating the best split of a node. A higher value improves the splitting and performance of a tree. A smaller value improves the robustness of the forest and prevents it from overfitting. When the value is -1, all variables are used for each split.
- MtrySeed
- Specify the random seed that the algorithm uses for the Mtry argument.
- MinImpurity
- Specify the minimum impurity of a tree node. The algorithm stops splitting a node if the node's impurity is equal to or smaller than the specified value.
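
The node bound quoted for MaxDepth, 2^(max_depth+1) - 1, can be checked with a short calculation. The following is a minimal Python sketch for illustration only. It is not part of TD_DecisionForest, the function name `max_tree_nodes` is hypothetical, and it assumes binary splits with the root node at depth 0.

```python
def max_tree_nodes(max_depth: int) -> int:
    """Upper bound on the node count of a binary tree of the given depth.

    Assumption: every split is binary and the root sits at depth 0, so a
    full tree has 2**d nodes at depth d and 2**(max_depth + 1) - 1 in total.
    """
    if max_depth < 0:
        # MaxDepth must be a non-negative integer.
        raise ValueError("max_depth must be a non-negative integer")
    return 2 ** (max_depth + 1) - 1


if __name__ == "__main__":
    # Print the bound for a few depths to show how quickly it grows.
    for depth in (0, 1, 5, 10):
        print(depth, max_tree_nodes(depth))
```

Because the bound roughly doubles with each extra level, a modest increase in MaxDepth can greatly increase the size of each tree in the forest.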