TD_DecisionForestPredict Input - Teradata Vantage

Teradata® VantageCloud Lake

Deployment: VantageCloud
Edition: Lake
Product: Teradata Vantage
Published: January 2023
Language: English (United States)
Last Update: 2024-04-03
TD_DecisionForestPredict uses the following input tables.
Table | Description
InputTable | Contains the test data for which to predict outcomes. The input table can have either no PARTITION BY clause or a PARTITION BY ANY clause.
ModelTable | Has the same schema as the output table of the TD_DecisionForest function. The model table must be a DIMENSION table and must be produced by the TD_DecisionForest function.

InputTable Schema

Column | Data Type | Description
ID_Column | Varies | Unique identifier of the test point. Cannot be NULL.
target_columns | INTEGER, BIGINT, SMALLINT, BYTEINT, FLOAT, DECIMAL, NUMBER | Predictor variable. One column appears for each specified target_column. Cannot be NULL.
accumulate_columns | Varies | Column to copy to the output table. One column appears for each specified accumulate_column.
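
The InputTable constraints above (unique, non-NULL identifier; numeric, non-NULL predictors) can be checked on the client side before the data is loaded. The following is a minimal sketch only: the function name find_invalid_rows, the sample column names, and the representation of rows as Python dictionaries with None standing for NULL are all assumptions made for illustration, not part of any Teradata API.

```python
from decimal import Decimal

# Python types assumed to correspond to the numeric SQL types listed above.
NUMERIC_TYPES = (int, float, Decimal)

def find_invalid_rows(rows, id_column, target_columns):
    """Return (row_index, column_name, reason) for every constraint violation."""
    problems = []
    seen_ids = set()
    for index, row in enumerate(rows):
        row_id = row.get(id_column)
        if row_id is None:
            problems.append((index, id_column, "NULL id"))
        elif row_id in seen_ids:
            problems.append((index, id_column, "duplicate id"))
        else:
            seen_ids.add(row_id)
        for column in target_columns:
            value = row.get(column)
            if value is None:
                problems.append((index, column, "NULL predictor"))
            elif isinstance(value, bool) or not isinstance(value, NUMERIC_TYPES):
                problems.append((index, column, "non-numeric predictor"))
    return problems

# Hypothetical sample data; None stands in for SQL NULL.
rows = [
    {"id": 1, "age": 34, "income": Decimal("52000.50")},
    {"id": 2, "age": None, "income": 61000.0},
    {"id": None, "age": 45, "income": "high"},
    {"id": 1, "age": 29, "income": 48000.0},
]
problems = find_invalid_rows(rows, "id", ["age", "income"])
```

Columns listed only as accumulate_columns are not checked, because the schema places no NULL or type restriction on them.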

ModelTable Schema

For Classification:
Name | Data Type | Description
task_index | SMALLINT | Identifier of the AMP that produced the decision tree.
tree_num | INTEGER | Decision tree identifier.
tree_order | INTEGER | Sequence number of this substring of the tree.
classification_tree | VARCHAR(16000) | JSON representation of the decision tree. For the JSON types that can appear in the representation, see JSON Types in JSON Representation of Decision Tree.
For Regression:
Name | Data Type | Description
task_index | SMALLINT | Identifier of the AMP that produced the decision tree.
tree_num | INTEGER | Decision tree identifier.
tree_order | INTEGER | Sequence number of this substring of the tree.
regression_tree | VARCHAR(16000) | JSON representation of the decision tree. For the JSON types that can appear in the representation, see JSON Types in JSON Representation of Decision Tree.
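
Because each tree column holds at most 16,000 characters and tree_order numbers the substrings of a tree, a tree that has been fetched to a client may need to be reassembled before it can be parsed. The sketch below shows one way to do that. It assumes that concatenating the substrings in ascending tree_order yields the complete JSON text and that a tree is identified by the pair (task_index, tree_num); the schema above does not state either point explicitly. The function name reassemble_trees and the sample rows are hypothetical.

```python
import json
from collections import defaultdict

def reassemble_trees(model_rows, tree_column):
    """Group model rows by tree, join substrings by tree_order, parse the JSON."""
    pieces = defaultdict(list)
    for row in model_rows:
        key = (row["task_index"], row["tree_num"])
        pieces[key].append((row["tree_order"], row[tree_column]))

    trees = {}
    for key, parts in pieces.items():
        # Assumption: ascending tree_order gives the original substring order.
        parts.sort(key=lambda part: part[0])
        trees[key] = json.loads("".join(text for _, text in parts))
    return trees

# Hypothetical model rows, deliberately out of order.
model_rows = [
    {"task_index": 0, "tree_num": 0, "tree_order": 1,
     "classification_tree": '"nodeType_": "CLASSIFICATION_LEAF"}'},
    {"task_index": 0, "tree_num": 0, "tree_order": 0,
     "classification_tree": '{"id_": 1, "size_": 10, '},
]
trees = reassemble_trees(model_rows, "classification_tree")
```

For a regression model, pass "regression_tree" as tree_column instead.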
JSON Types in JSON Representation of Decision Tree
JSON Type | Description
id_ | Node identifier.
sum_ | Sum of the response variable values at the node identified by id. Appears only in regression trees.
sumSq_ | Sum of the squared response variable values at the node identified by id. Appears only in regression trees.
responseCounts_ | Number of observations in each class at the node identified by id. Appears only in classification trees.
size_ | Total number of observations at the node identified by id.
maxDepth_ | Maximum possible depth of the tree, starting from the node identified by id. For the root node, the value is max_depth. For a leaf node, the value is 0.
split_ | Start of the JSON item describing the split at the node identified by id.
score_ | Gini score of the node identified by id.
attr_ | Attribute (predictor) on which the algorithm split at the node identified by id.
type_ | Type of tree and split. Options are:
  • CLASSIFICATION_NUMERIC_SPLIT
  • REGRESSION_NUMERIC_SPLIT
leftNodeSize_ | Number of observations assigned to the left node of the split.
rightNodeSize_ | Number of observations assigned to the right node of the split.
leftChild_ | Start of the JSON item describing the left child of the node identified by id.
rightChild_ | Start of the JSON item describing the right child of the node identified by id.
nodeType_ | Type of the node identified by id. Options are:
  • CLASSIFICATION_NODE
  • CLASSIFICATION_LEAF
  • REGRESSION_NODE
  • REGRESSION_LEAF
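
The fields listed above are enough to walk a parsed tree and summarize its leaves. The sketch below is illustrative only and makes several assumptions: that the tree has already been parsed into nested Python dictionaries, that responseCounts_ maps each class label to a count, and that attr_, type_, leftNodeSize_, and rightNodeSize_ are nested inside split_. The helper names iter_leaves and leaf_summary are hypothetical. The mean and variance are derived with the standard identities mean = sum_ / size_ and variance = sumSq_ / size_ - mean², which may not be how the function itself scores a leaf.

```python
def iter_leaves(node):
    """Yield every leaf at or below `node`, left subtree first."""
    if node["nodeType_"].endswith("_LEAF"):
        yield node
        return
    for child_key in ("leftChild_", "rightChild_"):
        child = node.get(child_key)
        if child is not None:
            yield from iter_leaves(child)

def leaf_summary(leaf):
    """Summarize a leaf using only the documented fields."""
    if leaf["nodeType_"] == "REGRESSION_LEAF":
        mean = leaf["sum_"] / leaf["size_"]
        # Population variance from the sum of squares (standard identity).
        variance = leaf["sumSq_"] / leaf["size_"] - mean ** 2
        return {"id": leaf["id_"], "mean": mean, "variance": variance}
    # Assumption: responseCounts_ is a mapping of class label to count.
    counts = leaf["responseCounts_"]
    return {"id": leaf["id_"], "majority_class": max(counts, key=counts.get)}

# Hypothetical regression tree: left leaf holds responses 2 and 4,
# right leaf holds responses 6 and 8.
tree = {
    "id_": 1, "size_": 4, "sum_": 20.0, "sumSq_": 120.0, "maxDepth_": 1,
    "nodeType_": "REGRESSION_NODE",
    "split_": {"attr_": "age", "type_": "REGRESSION_NUMERIC_SPLIT",
               "leftNodeSize_": 2, "rightNodeSize_": 2},  # nesting is assumed
    "leftChild_": {"id_": 2, "size_": 2, "sum_": 6.0, "sumSq_": 20.0,
                   "maxDepth_": 0, "nodeType_": "REGRESSION_LEAF"},
    "rightChild_": {"id_": 3, "size_": 2, "sum_": 14.0, "sumSq_": 100.0,
                    "maxDepth_": 0, "nodeType_": "REGRESSION_LEAF"},
}
summaries = [leaf_summary(leaf) for leaf in iter_leaves(tree)]
```

With the sample tree, the two leaves have means 3.0 and 7.0, each with variance 1.0.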