TD_DecisionForestPredict Input

Teradata® VantageCloud Lake
Deployment: VantageCloud
Edition: Lake
Product: Teradata Vantage
Published: January 2023
Language: English (United States)
Last Update: 2024-04-03

TD_DecisionForestPredict uses the following input tables.

Table	Description
InputTable	Contains the test data for which to predict outcomes. The input table can have either no partition clause or a PARTITION BY ANY clause.
ModelTable	Has the same schema as the output table of the TD_DecisionForest function. The model table must be a DIMENSION table and must be produced by the TD_DecisionForest function.

InputTable Schema
Column	Data Type	Description
ID_Column	Varies	Unique test point identifier. Cannot be NULL.
target_columns	INTEGER, BIGINT, SMALLINT, BYTEINT, FLOAT, DECIMAL, NUMBER	Predictor variable. Column appears once for each specified target_column. Cannot be NULL.
accumulate_columns	Varies	Column to copy to output table. Column appears once for each specified accumulate_column.
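
The NULL and type constraints in the table above can be checked on the client before rows are loaded. The following is a minimal sketch, assuming each row is available as a Python dictionary; the function name `validate_input_row` and the sample column names (`id`, `age`, `income`, `city`) are hypothetical and not part of the Teradata API.

```python
from decimal import Decimal

# Hypothetical helper: checks one InputTable row against the documented constraints.
def validate_input_row(row, id_column, target_columns):
    """Return a list of problems found in one row (a dict of column -> value)."""
    problems = []
    # ID_Column cannot be NULL.
    if row.get(id_column) is None:
        problems.append(f"{id_column} is NULL")
    # Each target column must be numeric and cannot be NULL.
    for col in target_columns:
        value = row.get(col)
        if value is None:
            problems.append(f"{col} is NULL")
        elif isinstance(value, bool) or not isinstance(value, (int, float, Decimal)):
            problems.append(f"{col} is not numeric")
    return problems

# Sample rows; "city" stands in for an accumulate column and is not checked.
rows = [
    {"id": 1, "age": 42, "income": 5.5e4, "city": "Austin"},
    {"id": 2, "age": None, "income": 6.1e4, "city": "Boise"},
    {"id": None, "age": 30, "income": "high", "city": "Reno"},
]
for row in rows:
    print(row["id"], validate_input_row(row, "id", ["age", "income"]))
```

Accumulate columns are left unchecked here because the table places no type or NULL restriction on them.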

ModelTable Schema
For Classification:
Name	Data Type	Description
task_index	SMALLINT	Identifier of AMP that produced decision tree.
tree_num	INTEGER	Decision tree identifier.
tree_order	INTEGER	Sequence of substring of tree.
classification_tree	VARCHAR(16000)	JSON representation of decision tree. For the JSON types that can appear in the representation, see JSON Types in JSON Representation of Decision Tree.

For Regression:
Name	Data Type	Description
task_index	SMALLINT	Identifier of AMP that produced decision tree.
tree_num	INTEGER	Decision tree identifier.
tree_order	INTEGER	Sequence of substring of tree.
regression_tree	VARCHAR(16000)	JSON representation of decision tree. For the JSON types that can appear in the representation, see JSON Types in JSON Representation of Decision Tree.
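
Because tree_order gives the sequence of each substring, a tree that spans several model rows has to be reassembled before its JSON can be parsed. The sketch below assumes that concatenating the substrings of one tree in ascending tree_order yields the complete JSON document, and that a tree is identified by the pair (task_index, tree_num); both are assumptions, as are the function name `assemble_trees` and the sample rows.

```python
import json
from collections import defaultdict

# Hypothetical helper: rebuilds whole trees from ModelTable rows given as dicts.
def assemble_trees(model_rows, tree_column):
    """Group rows per tree, join the substrings in tree_order, and parse the JSON."""
    chunks = defaultdict(list)
    for row in model_rows:
        key = (row["task_index"], row["tree_num"])  # assumed tree identity
        chunks[key].append((row["tree_order"], row[tree_column]))
    trees = {}
    for key, parts in chunks.items():
        parts.sort(key=lambda part: part[0])  # ascending tree_order
        trees[key] = json.loads("".join(text for _, text in parts))
    return trees

# Made-up rows: one tree split into two substrings, delivered out of order.
model_rows = [
    {"task_index": 0, "tree_num": 0, "tree_order": 1,
     "regression_tree": '"size_":10,"nodeType_":"REGRESSION_LEAF"}'},
    {"task_index": 0, "tree_num": 0, "tree_order": 0,
     "regression_tree": '{"id_":1,"sum_":25.0,'},
]
trees = assemble_trees(model_rows, "regression_tree")
print(trees[(0, 0)]["size_"])  # prints 10
```

Pass "classification_tree" as `tree_column` for a classification model. Sorting on tree_order works whether the sequence starts at 0 or 1.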

JSON Types in JSON Representation of Decision Tree
JSON Type	Description
id_	Node identifier.
sum_	Sum of values of response variable at node identified by id. Only appears for regression trees.
sumSq_	Sum of squared values of response variable at node identified by id. Only appears for regression trees.
responseCounts_	Number of observations in each class at node identified by id. Only appears for classification trees.
size_	Total number of observations at node identified by id.
maxDepth_	Maximum possible depth of the subtree that starts from the node identified by id. For the root node, the value is max_depth. For leaf nodes, the value is 0.
split_	Start of JSON item describing a split at node identified by id.
score_	Gini score of node identified by id.
attr_	Attribute (predictor) that the algorithm split at node identified by id.
type_	Type of tree and split. Options are CLASSIFICATION_NUMERIC_SPLIT and REGRESSION_NUMERIC_SPLIT.
leftNodeSize_	Number of observations assigned to left node of split.
rightNodeSize_	Number of observations assigned to right node of split.
leftChild_	Start of JSON item describing left child of node identified by id.
rightChild_	Start of JSON item describing right child of node identified by id.
nodeType_	Type of node identified by id. Options are CLASSIFICATION_NODE, CLASSIFICATION_LEAF, REGRESSION_NODE, and REGRESSION_LEAF.
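
The fields above are enough to walk a parsed tree and summarize its leaves. The following is a minimal sketch under stated assumptions: leftChild_ and rightChild_ are nested objects on the node itself, responseCounts_ is an object mapping class label to count, and the sample tree is made up. The helper names `iter_nodes` and `leaf_summary` are hypothetical. Treating the leaf mean or the majority class as the tree's output is a common decision-tree convention, not a statement of how TD_DecisionForestPredict computes predictions.

```python
def iter_nodes(node):
    """Yield every node of a parsed tree, depth first, left child before right."""
    yield node
    for child_key in ("leftChild_", "rightChild_"):
        child = node.get(child_key)
        if child:
            yield from iter_nodes(child)

def leaf_summary(node):
    """Summarize a leaf from its documented statistics; return None for inner nodes."""
    node_type = node["nodeType_"]
    if node_type == "REGRESSION_LEAF":
        mean = node["sum_"] / node["size_"]
        # Population variance from the sum and sum of squares: E[y^2] - (E[y])^2.
        variance = node["sumSq_"] / node["size_"] - mean ** 2
        return {"id_": node["id_"], "mean": mean, "variance": variance}
    if node_type == "CLASSIFICATION_LEAF":
        counts = node["responseCounts_"]  # assumed shape: {class label: count}
        majority = max(counts, key=counts.get)
        return {"id_": node["id_"], "majority": majority,
                "share": counts[majority] / node["size_"]}
    return None

# Made-up regression tree: responses 4 and 6 fall left, 8 and 12 fall right.
tree = {
    "id_": 1, "sum_": 30.0, "sumSq_": 260.0, "size_": 4, "maxDepth_": 1,
    "nodeType_": "REGRESSION_NODE",
    "split_": {"attr_": "age", "type_": "REGRESSION_NUMERIC_SPLIT",
               "leftNodeSize_": 2, "rightNodeSize_": 2},
    "leftChild_": {"id_": 2, "sum_": 10.0, "sumSq_": 52.0, "size_": 2,
                   "maxDepth_": 0, "nodeType_": "REGRESSION_LEAF"},
    "rightChild_": {"id_": 3, "sum_": 20.0, "sumSq_": 208.0, "size_": 2,
                    "maxDepth_": 0, "nodeType_": "REGRESSION_LEAF"},
}
leaves = [leaf_summary(n) for n in iter_nodes(tree)
          if n["nodeType_"].endswith("_LEAF")]
print(leaves)
# [{'id_': 2, 'mean': 5.0, 'variance': 1.0}, {'id_': 3, 'mean': 10.0, 'variance': 4.0}]
```

The same traversal works for classification trees, where `leaf_summary` reads responseCounts_ instead of sum_ and sumSq_.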