Argument | Category | Description |
---|---|---|
InputTable | Optional* | Specifies the name of the table that contains the input data set. *Required if you omit AttributeTableName and ResponseTableName. |
AttributeTableName | Optional* | Specifies the name of the table that contains the attribute names and the values. *Required if you omit InputTable. |
ResponseTableName | Optional* | Specifies the name of the table that contains the response values. *Required if you omit InputTable. |
OutputTable | Required | Specifies the name for the output table that is to contain the final decision tree (the model table). The name must not exceed 64 characters. |
AttributeNameColumns | Required | Specifies the names of the attribute table columns that define the attribute. |
AttributeValueColumn | Required | Specifies the name of the attribute table column that defines the value. |
ResponseColumn | Required | Specifies the name of the response table column that contains the response variable. |
IDColumns | Required | Specifies the names of the columns in the response and attribute tables that specify the ID of the instance. |
CategoricalAttributeTableName | Optional | Specifies the name of the input table that contains the categorical attributes. |
SaveFinalResponseTableTo | Optional | Specifies the name for the output table that is to contain the final PID and response pair from the response table and the node_id from the final single decision tree. |
SplitsTable | Optional | Specifies the name of the input table that contains the user-specified splits. By default, the function creates new splits. |
SplitsValueColumn | Optional | If you specify SplitsTable, this argument specifies the name of the column that contains the split value. If ApproxSplits is 'true', then the default value is splits_valcol; if not, then the default value is the AttributeValueColumn argument, node_column. |
NumSplits | Optional | Specifies the number of splits to consider for each variable. The default value is 10. The function does not consider all possible splits for all attributes. |
ApproxSplits | Optional | Specifies whether to use approximate percentiles (true) or exact percentiles (false). The default value is true. Internally, the function uses percentile values as split values. |
IntermediateSplitsTable | Optional | Specifies the name for the intermediate splits table, if it is to be saved. By default, the function does not save the intermediate splits table. |
DropTable | Optional | Specifies whether to drop the output table (specified by OutputTable) if it already exists. The default value is 'false'. |
MinNodeSize | Optional | Specifies a decision tree stopping criterion: the minimum size of any node within each decision tree. The default value is 100. |
MaxDepth | Optional | Specifies a decision tree stopping criterion. If the tree reaches a depth past this value, the algorithm stops looking for splits. Decision trees can grow up to 2^(max_depth+1) - 1 nodes. This stopping criterion has the greatest effect on function performance. The maximum value is 60. The default value is 5. |
Weighted | Optional | Specifies whether to build a weighted decision tree. The default value is 'false'. If you specify 'true', then you must also specify the WeightColumn argument. |
WeightColumn | Optional | Specifies the name of the response table column that contains the weights of the attribute values. |
SplitMeasure | Optional | Specifies the impurity measurement to use while constructing the decision tree. The default value is 'gini'. If the tree is weighted, this value cannot be 'chisquare'. |
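
The arithmetic behind MaxDepth, NumSplits, and the 'gini' SplitMeasure can be illustrated with a short Python sketch. This is not the function's implementation: the helper names `max_tree_nodes`, `candidate_splits`, and `gini_impurity` are hypothetical, and using evenly spaced exact percentiles as split candidates is an assumption based only on the table's statement that percentile values serve as split values.

```python
import statistics


def max_tree_nodes(max_depth: int) -> int:
    """Upper bound on the node count of a binary tree: 2^(max_depth+1) - 1."""
    if max_depth < 0:
        raise ValueError("max_depth must be non-negative")
    return 2 ** (max_depth + 1) - 1


def candidate_splits(values, num_splits: int = 10):
    """Return num_splits percentile cut points of the values.

    Assumption: candidates are evenly spaced exact percentiles; the real
    function may choose or approximate them differently.
    """
    # quantiles() with n intervals yields n - 1 cut points.
    return statistics.quantiles(values, n=num_splits + 1)


def gini_impurity(class_counts) -> float:
    """Gini impurity of a node, given the count of rows in each class."""
    total = sum(class_counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in class_counts)


if __name__ == "__main__":
    print(max_tree_nodes(5))                       # bound at the default MaxDepth
    print(len(candidate_splits(range(100), 10)))   # one cut point per split
    print(gini_impurity([5, 5]))                   # evenly mixed two-class node
```

The node bound doubles with each extra level of depth, which is consistent with the table's note that MaxDepth has the greatest effect on performance.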