DecisionTree Output - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 8.00, 1.0
Published: May 2019
Language: English (United States)
Last Update: 2019-11-22
Product Category: Teradata Vantage™
| Table | Description |
|---|---|
| OutputTable | Contains final decision tree (model table). |
| SaveFinalResponseTableTo | [Optional] Contains final PID and response pair from response table and node_id from final single drive tree. |
| IntermediateSplitsTable | [Disallowed with SplitsTable, optional otherwise] Contains intermediate splits. |

Output Message Schema

| Column | Data Type | Description |
|---|---|---|
| message | VARCHAR | Reports that the model table was stored in the table specified by the OutputTable argument, and reports the depth of the tree. |

OutputTable Schema

This model table has a row for each node in the model.

| Column | Data Type | Description |
|---|---|---|
| node_id | INTEGER | Node identifier. |
| node_size | INTEGER | Number of objects in node. |
| node_gini[_p] | DOUBLE PRECISION | GINI impurity value for information in node. For ImpurityMeasurement ('gini'), column name is node_gini_p; otherwise, it is node_gini. |
| node_entropy[_p] | DOUBLE PRECISION | Entropy impurity value for information in node. For ImpurityMeasurement ('entropy'), column name is node_entropy_p; otherwise, it is node_entropy. |
| node_chisq_pv[_p] | DOUBLE PRECISION | Chi-square impurity value for information in node. For ImpurityMeasurement ('chisquare'), column name is node_chisq_pv_p; otherwise, it is node_chisq_pv. |
| node_label | VARCHAR | Output category for node. |
| node_majorvotes | INTEGER | Number of objects that belong to category identified by node_label. |
| split_value | DOUBLE PRECISION | Numeric split value. |
| split_gini[_p] | DOUBLE PRECISION | GINI impurity measurement for information in node after splitting. For ImpurityMeasurement ('gini'), column name is split_gini_p; otherwise, it is split_gini. |
| split_entropy[_p] | DOUBLE PRECISION | Entropy impurity measurement for information in node after splitting. For ImpurityMeasurement ('entropy'), column name is split_entropy_p; otherwise, it is split_entropy. |
| split_chisq_pv[_p] | DOUBLE PRECISION | Chi-square impurity measurement for information in node after splitting. For ImpurityMeasurement ('chisquare'), column name is split_chisq_pv_p; otherwise, it is split_chisq_pv. |
| left_id | INTEGER | Identifier of left child of node. |
| left_size | INTEGER | Number of objects in left child of node. |
| left_label | VARCHAR | Output category for left child of node. |
| left_majorvotes | INTEGER | Number of objects that belong to category identified by left_label. |
| right_id | INTEGER | Identifier of right child of node. |
| right_size | INTEGER | Number of objects in right child of node. |
| right_label | VARCHAR | Output category for right child of node. |
| right_majorvotes | INTEGER | Number of objects that belong to category identified by right_label. |
| left_bucket | VARCHAR | When split value is categorical attribute, value in left child of node. |
| right_bucket | VARCHAR | When split value is categorical attribute, value in right child of node. |
| attribute | VARCHAR | Split attribute. |
| node_majorfreq | DOUBLE PRECISION | [Column appears only with Weighted ('true').] Weighted objects that belong to category identified by node_label. |
| left_majorfreq | DOUBLE PRECISION | [Column appears only with Weighted ('true').] Weighted objects that belong to category identified by left_label. |
| right_majorfreq | DOUBLE PRECISION | [Column appears only with Weighted ('true').] Weighted objects that belong to category identified by right_label. |
| left_label_probdist | VARCHAR | [Column appears only with OutputResponseProbDist ('true').] Probability of each label for left child of node. |
| right_label_probdist | VARCHAR | [Column appears only with OutputResponseProbDist ('true').] Probability of each label for right child of node. |
| prob_label_order | VARCHAR | [Column appears only with OutputResponseProbDist ('true').] Order of probability of labels for left and right children of node. |
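
Because several column names depend on the ImpurityMeasurement, Weighted, and OutputResponseProbDist arguments, it can help to compute the expected column list before querying the model table. The following Python sketch is illustrative only: the function name `output_table_columns` and its parameters are hypothetical, and it assumes that all three impurity columns are always present and that the columns appear in the order listed above.

```python
def output_table_columns(impurity="gini", weighted=False, prob_dist=False):
    """Return the column names expected in the model table (hypothetical helper).

    Assumption: all three impurity columns are always present, and only the
    one matching ImpurityMeasurement carries the "_p" suffix.
    """
    # ImpurityMeasurement value -> column-name stem used in the schema.
    stems = {"gini": "gini", "entropy": "entropy", "chisquare": "chisq_pv"}
    if impurity not in stems:
        raise ValueError(f"unknown ImpurityMeasurement: {impurity!r}")

    def impurity_columns(prefix):
        # Append "_p" only to the column of the selected impurity measure.
        return [
            f"{prefix}_{stem}" + ("_p" if name == impurity else "")
            for name, stem in stems.items()
        ]

    columns = ["node_id", "node_size"]
    columns += impurity_columns("node")
    columns += ["node_label", "node_majorvotes", "split_value"]
    columns += impurity_columns("split")
    columns += [
        "left_id", "left_size", "left_label", "left_majorvotes",
        "right_id", "right_size", "right_label", "right_majorvotes",
        "left_bucket", "right_bucket", "attribute",
    ]
    if weighted:  # Weighted ('true')
        columns += ["node_majorfreq", "left_majorfreq", "right_majorfreq"]
    if prob_dist:  # OutputResponseProbDist ('true')
        columns += ["left_label_probdist", "right_label_probdist", "prob_label_order"]
    return columns


if __name__ == "__main__":
    print(output_table_columns("entropy", weighted=True))
```

With `impurity="entropy"`, the list contains node_entropy_p and split_entropy_p, while the GINI and chi-square columns keep their unsuffixed names.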

IntermediateSplitsTable Schema

| Column | Data Type | Description |
|---|---|---|
| attribute | VARCHAR | Attribute name (from the attribute table in DecisionTree Input). For each attribute, the table has the number of rows specified by the MaxDepth argument. |
| percentile | INTEGER | Percentage of values in the split. For example, if attribute A has 100 different values, then percentile = 10 and value = 1 mean that the value at position 100 × 10% = 10 (the 10th value) of attribute A is 1, and that 1 is the split value. |
| value | NUMERIC, INTEGER, BIGINT, or DOUBLE PRECISION | Split value (from the attribute table in DecisionTree Input). |
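
The percentile arithmetic in the example above can be sketched in Python. This is a minimal illustration, not the function's actual implementation: the helper name `split_value_at` is hypothetical, and it assumes that the attribute values are distinct, sorted in ascending order, counted from position 1, and that a fractional position is rounded up.

```python
def split_value_at(values, percentile):
    """Return the value found at the given percentile position (hypothetical helper).

    Assumptions: values are distinct, ordered ascending, positions are
    1-based, and a fractional position is rounded up.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < percentile <= 100:
        raise ValueError("percentile must be in the range 1-100")
    ordered = sorted(values)
    # Integer ceiling of len(ordered) * percentile / 100.
    position = -(-len(ordered) * percentile // 100)
    return ordered[position - 1]


if __name__ == "__main__":
    # 100 distinct values whose 10th smallest value is 1.
    attribute_a = list(range(-8, 92))
    print(split_value_at(attribute_a, 10))  # prints 1
```

Here 100 × 10% = 10, so the 10th value of the attribute (1) is reported as the split value for percentile 10.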

SaveFinalResponseTableTo Schema

| Column | Data Type | Description |
|---|---|---|
| node_id | INTEGER | Node identifier. |
| pid | Any | Data point identifier. |
| response | NUMERIC, INTEGER, BIGINT, or DOUBLE PRECISION | Response value for the data point. |
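
One possible use of this table is to summarize the responses that fall into each node. The Python sketch below groups rows by node_id and reports the most frequent response and its count. It assumes the rows have already been fetched as (node_id, pid, response) tuples; the function name `majority_response_by_node` is hypothetical. Whether the result matches node_label and node_majorvotes in the model table is an assumption to verify against your own data.

```python
from collections import Counter, defaultdict


def majority_response_by_node(rows):
    """Map each node_id to (most frequent response, count) (hypothetical helper).

    rows: iterable of (node_id, pid, response) tuples, in the column order
    of the SaveFinalResponseTableTo schema.
    """
    counts = defaultdict(Counter)
    for node_id, _pid, response in rows:
        counts[node_id][response] += 1
    # most_common(1) returns a one-element list of (response, count).
    return {node_id: c.most_common(1)[0] for node_id, c in counts.items()}


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample = [(3, "a", 0), (3, "b", 1), (3, "c", 1), (4, "d", 0)]
    print(majority_response_by_node(sample))
```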