TD_XGBoostPredict Input - Teradata Vantage

Teradata® VantageCloud Lake

Deployment: VantageCloud
Edition: Lake
Product: Teradata Vantage
Published: January 2023
Language: English (United States)
Last Update: 2024-04-03

InputTable Schema

| Column Name | Data Type | Description |
|---|---|---|
| ID_Column | Any | Unique test point identifier. Cannot be NULL. |
| target_column(s) | INTEGER, BIGINT, SMALLINT, BYTEINT, FLOAT, DECIMAL, NUMBER | Predictor variable. Column appears once for each specified target_column. Cannot be NULL. |
| accumulate_column(s) | Any | Column to copy to output table. Column appears once for each specified accumulate_column. |
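The constraints in the InputTable schema can be checked on the client side before rows are loaded. The following is a minimal sketch, not part of the Teradata API: the function name `validate_input_row`, the dictionary row format, and the sample column names (`id`, `x1`, `x2`, `note`) are all hypothetical and chosen only for illustration.

```python
from numbers import Number


def validate_input_row(row, id_column, target_columns):
    """Return a list of schema violations for one input row (a dict).

    Hypothetical helper: mirrors the rules in the schema table
    (ID and target columns cannot be NULL; targets must be numeric).
    Accumulate columns accept any type, so they are not checked.
    """
    problems = []
    if row.get(id_column) is None:
        problems.append(f"{id_column} is NULL")
    for col in target_columns:
        value = row.get(col)
        if value is None:
            problems.append(f"{col} is NULL")
        elif isinstance(value, bool) or not isinstance(value, Number):
            problems.append(f"{col} is not numeric")
    return problems


# Hypothetical sample rows; "note" plays the role of an accumulate column.
rows = [
    {"id": 1, "x1": 0.5, "x2": 3, "note": "kept as-is"},
    {"id": None, "x1": 1.2, "x2": None, "note": None},
]
for r in rows:
    print(validate_input_row(r, "id", ["x1", "x2"]))
```

The first row produces no violations; the second is flagged for its NULL identifier and NULL target value.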

Model Table Schema

| Column Name | Data Type | Description |
|---|---|---|
| task_index | SMALLINT | Identifier of AMP that produced a boosting tree. |
| tree_num | SMALLINT | Identifier of boosted tree. Number of unique tree_num values depends on NumBoostedTrees syntax element value and number of AMPs. |
| Iter | SMALLINT | Iteration (boosting round) number. |
| class_num | SMALLINT | Index of class column to predict. Appears only in classification. For LossFunction('softmax'), the default, the number of unique class_num values is the number of class labels in the data set; for K class labels, class_num values are the integers in the range [0, K-1]. For LossFunction('binomial'), there is only one class_num value. |
| tree_order | SMALLINT | Identifier of a complete JSON order of the regression_tree/classification_tree column. |
| regression_tree / classification_tree | VARCHAR(32000) | JSON representation of decision tree. For JSON types that can appear in the representation, see the following table. |
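The class_num rules above can be expressed as a small consistency check on the class_num values read from a model table. This is an illustrative sketch only: the function name `check_class_nums` and its arguments are hypothetical, and the loss function is passed as a plain lowercase string rather than read from the model.

```python
def check_class_nums(class_nums, loss_function, num_labels=None):
    """Check class_num values against the rules in the model schema.

    Hypothetical helper. For 'softmax' with K labels, the distinct
    values must be exactly 0..K-1. For 'binomial', there must be
    exactly one distinct value (the schema does not say which one).
    """
    distinct = sorted(set(class_nums))
    if loss_function == "binomial":
        return len(distinct) == 1
    if loss_function == "softmax":
        if num_labels is None:
            raise ValueError("num_labels is required for softmax")
        return distinct == list(range(num_labels))
    raise ValueError(f"unsupported loss function: {loss_function}")


# Three class labels, two boosting rounds: class_num cycles through 0, 1, 2.
print(check_class_nums([0, 1, 2, 0, 1, 2], "softmax", num_labels=3))
# A missing class index indicates an incomplete model for K = 3.
print(check_class_nums([0, 2, 0, 2], "softmax", num_labels=3))
```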

JSON Types in JSON Representation of Decision Tree

| JSON Type | Description |
|---|---|
| id_ | Node identifier. |
| sum_ | Appears only for regression trees. Sum of values of response variable at node identified by id_. |
| sumSq_ | Appears only for regression trees. Sum of squared values of response variable at node identified by id_. |
| responseCounts_ | Appears only for classification trees. Number of observations in each class at node identified by id_. |
| size_ | Total number of observations at node identified by id_. |
| maxDepth_ | Maximum possible depth of the tree, starting from node identified by id_. For the root node, the value is max_depth; for leaf nodes, 0; for other nodes, the maximum possible depth of the tree starting from that node. |
| split_ | Start of JSON item describing a split at node identified by id_. |
| splitValue_ | Attribute value used for splitting a tree node. |
| score_ | Gini score of node identified by id_. |
| attr_ | Attribute (predictor) on which algorithm split at node identified by id_. |
| type_ | Type of tree and split. Possible values: REGRESSION_NUMERIC_SPLIT. |
| leftNodeSize_ | Number of observations assigned to the left node in the split. |
| rightNodeSize_ | Number of observations assigned to the right node in the split. |
| leftChild_ | Start of JSON item describing left child of node identified by id_. |
| rightChild_ | Start of JSON item describing right child of node identified by id_. |
| nodeType_ | Type of node identified by id_. Possible values: REGRESSION_NODE, REGRESSION_LEAF. |
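Because sum_, sumSq_, and size_ are stored per node, the mean and variance of the response variable at any node of a regression tree follow directly: mean = sum_ / size_, and variance = sumSq_ / size_ - mean². The sketch below walks a tree and reports these statistics for each leaf. The sample JSON is hand-written for illustration and is not output from the function; in particular, the nesting of the split fields inside split_ and all numeric values are assumptions, and `leaf_stats` is a hypothetical helper name.

```python
import json


def leaf_stats(node):
    """Yield (id_, mean, variance) for every leaf at or below node.

    Uses only the documented keys: nodeType_, size_, sum_, sumSq_,
    leftChild_, rightChild_.
    """
    if node.get("nodeType_") == "REGRESSION_LEAF":
        n = node["size_"]
        mean = node["sum_"] / n
        variance = node["sumSq_"] / n - mean ** 2
        yield node["id_"], mean, variance
        return
    for key in ("leftChild_", "rightChild_"):
        child = node.get(key)
        if child is not None:
            yield from leaf_stats(child)


# Hypothetical tree: four observations with responses 1, 2, 3, 4,
# split on an assumed predictor "x" at 2.5.
tree_json = """
{
  "id_": 1, "sum_": 10.0, "sumSq_": 30.0, "size_": 4, "maxDepth_": 1,
  "nodeType_": "REGRESSION_NODE",
  "split_": {
    "splitValue_": 2.5, "score_": 1.25, "attr_": "x",
    "type_": "REGRESSION_NUMERIC_SPLIT",
    "leftNodeSize_": 2, "rightNodeSize_": 2
  },
  "leftChild_": {
    "id_": 2, "sum_": 3.0, "sumSq_": 5.0, "size_": 2, "maxDepth_": 0,
    "nodeType_": "REGRESSION_LEAF"
  },
  "rightChild_": {
    "id_": 3, "sum_": 7.0, "sumSq_": 25.0, "size_": 2, "maxDepth_": 0,
    "nodeType_": "REGRESSION_LEAF"
  }
}
"""

for node_id, mean, variance in leaf_stats(json.loads(tree_json)):
    print(node_id, mean, variance)
```

The same traversal pattern applies to any per-node field, for example collecting attr_ values to see which predictors a tree splits on.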

JSON Types in JSON Representation of Region Prediction

| JSON Type | Description |
|---|---|
| id | Region identifier. |
| value | The value in the region. |