Output - Aster Analytics

Teradata Aster® Analytics Foundation User Guide Update 2

Product: Aster Analytics
Release Number: 7.00.02
Published: September 2017
Language: English (United States)
Last Update: 2018-04-17
Product Category: Software

The Single_Tree_Drive function outputs console messages, a model table, and, optionally, an intermediate splits table and a final response table. The following table shows the schema of the console message table.

Single_Tree_Drive Console Message Table Schema

| Column | Data Type | Description |
|---|---|---|
| message | VARCHAR | Console message. |

The model table has a row for each node in the model (the single decision tree that the function creates). The name of the model table is specified by the OutputTableName argument. The following table shows the schema of the model table.

Single_Tree_Drive Model Table Schema

| Column | Data Type | Description |
|---|---|---|
| node_id | INTEGER | Node identifier. |
| node_size | INTEGER | Number of objects in the node. |
| node_gini[(p)] | DOUBLE PRECISION | GINI impurity value for the information in the node. If you specify ImpurityMeasurement('gini'), the column name is node_gini(p); otherwise, it is node_gini. |
| node_entropy[(p)] | DOUBLE PRECISION | Entropy impurity value for the information in the node. If you specify ImpurityMeasurement('entropy'), the column name is node_entropy(p); otherwise, it is node_entropy. |
| node_chisq_pv[(p)] | DOUBLE PRECISION | Chi-square impurity value for the information in the node. If you specify ImpurityMeasurement('chisquare'), the column name is node_chisq_pv(p); otherwise, it is node_chisq_pv. |
| node_label | VARCHAR | Output category for the node. |
| node_majorvotes | INTEGER | Number of objects that belong to the category identified by node_label. |
| split_value | DOUBLE PRECISION | Numerical split value. |
| split_gini[(p)] | DOUBLE PRECISION | GINI impurity measurement for the information in the node after splitting. If you specify ImpurityMeasurement('gini'), the column name is split_gini(p); otherwise, it is split_gini. |
| split_entropy[(p)] | DOUBLE PRECISION | Entropy impurity measurement for the information in the node after splitting. If you specify ImpurityMeasurement('entropy'), the column name is split_entropy(p); otherwise, it is split_entropy. |
| split_chisq_pv[(p)] | DOUBLE PRECISION | Chi-square impurity measurement for the information in the node after splitting. If you specify ImpurityMeasurement('chisquare'), the column name is split_chisq_pv(p); otherwise, it is split_chisq_pv. |
| left_id | INTEGER | Identifier of the left child of the node. |
| left_size | INTEGER | Number of objects in the left child of the node. |
| left_label | VARCHAR | Output category for the left child of the node. |
| left_majorvotes | INTEGER | Number of objects that belong to the category identified by left_label. |
| right_id | INTEGER | Identifier of the right child of the node. |
| right_size | INTEGER | Number of objects in the right child of the node. |
| right_label | VARCHAR | Output category for the right child of the node. |
| right_majorvotes | INTEGER | Number of objects that belong to the category identified by right_label. |
| left_bucket | VARCHAR | When the split attribute is categorical, the value in the left child of the node. |
| right_bucket | VARCHAR | When the split attribute is categorical, the value in the right child of the node. |
| left_label_probdist | VARCHAR | Output probability of each label for the left child of the node. This column appears only if OutputResponseProbDist has the value 'true'. |
| right_label_probdist | VARCHAR | Output probability of each label for the right child of the node. This column appears only if OutputResponseProbDist has the value 'true'. |
| prob_label_order | VARCHAR | Order of the labels in the probability distributions for the left and right children of the node. This column appears only if OutputResponseProbDist has the value 'true'. |
| attribute | VARCHAR | Split attribute. |
| node_majorfreq | DOUBLE PRECISION | Weighted number of objects that belong to the category identified by node_label. This column appears only if the Weighted argument is 'true'. |
| left_majorfreq | DOUBLE PRECISION | Weighted number of objects that belong to the category identified by left_label. This column appears only if the Weighted argument is 'true'. |
| right_majorfreq | DOUBLE PRECISION | Weighted number of objects that belong to the category identified by right_label. This column appears only if the Weighted argument is 'true'. |
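
Because several model table column names depend on the function arguments, client code that reads the model table may need to work out the expected names first. The following Python sketch encodes only the naming rules stated in the schema above. The function name model_table_columns and its parameters are hypothetical stand-ins for the ImpurityMeasurement, OutputResponseProbDist, and Weighted arguments, and the returned order simply follows the schema listing, which is an assumption about the physical column order.

```python
def model_table_columns(impurity, output_response_prob_dist=False, weighted=False):
    """List the model table column names implied by the schema description.

    Hypothetical helper: the parameters mirror the ImpurityMeasurement,
    OutputResponseProbDist, and Weighted arguments of Single_Tree_Drive.
    """
    # ImpurityMeasurement value -> stem used in the column names.
    stems = {"gini": "gini", "entropy": "entropy", "chisquare": "chisq_pv"}
    if impurity not in stems:
        raise ValueError(f"unknown impurity measurement: {impurity!r}")

    def measure_columns(prefix):
        # Only the selected measurement's column carries the "(p)" suffix.
        return [
            f"{prefix}_{stem}" + ("(p)" if name == impurity else "")
            for name, stem in stems.items()
        ]

    columns = ["node_id", "node_size"]
    columns += measure_columns("node")
    columns += ["node_label", "node_majorvotes", "split_value"]
    columns += measure_columns("split")
    columns += [
        "left_id", "left_size", "left_label", "left_majorvotes",
        "right_id", "right_size", "right_label", "right_majorvotes",
        "left_bucket", "right_bucket",
    ]
    # These three columns appear only if OutputResponseProbDist is 'true'.
    if output_response_prob_dist:
        columns += ["left_label_probdist", "right_label_probdist", "prob_label_order"]
    columns.append("attribute")
    # These three columns appear only if Weighted is 'true'.
    if weighted:
        columns += ["node_majorfreq", "left_majorfreq", "right_majorfreq"]
    return columns
```

For example, model_table_columns("entropy") includes node_entropy(p) and split_entropy(p), while the GINI and chi-square columns keep their unsuffixed names.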

The following table describes the intermediate splits table. The name of the intermediate splits table is specified by the MaterializedSplitsTableWithName argument.

Single_Tree_Drive Intermediate Splits Table Schema

| Column | Data Type | Description |
|---|---|---|
| attribute | VARCHAR | Attribute name (from the attribute table, Input). For each attribute, the table has the number of rows specified by the MaxDepth argument. |
| percentile | INTEGER | Percentile of the split value. For example, if attribute A has 100 different values, then percentile = 10 and value = 1 mean that the 10th value of attribute A (100 * 10% = 10) is 1, and 1 is the split value. |
| value | NUMERIC, INTEGER, BIGINT, or DOUBLE PRECISION | Split value (from the attribute table, Input). |
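
The percentile arithmetic in the example above can be sketched in Python. This is an illustration only: the function name split_value_at_percentile is hypothetical, and ranking the sorted distinct values with integer arithmetic is an assumption, since the guide does not state how Single_Tree_Drive handles ties or rounding.

```python
def split_value_at_percentile(values, percentile):
    """Return the value at the given percentile among the sorted distinct values.

    Assumption: values are ranked in ascending order, starting at position 1.
    """
    ordered = sorted(set(values))
    # With 100 distinct values, percentile 10 gives 100 * 10% = 10,
    # which selects the 10th value.
    position = len(ordered) * percentile // 100
    if not 1 <= position <= len(ordered):
        raise ValueError("percentile does not select a value")
    return ordered[position - 1]


# Illustrative data: 100 distinct values whose 10th smallest value is 1.
attribute_a = list(range(-8, 92))
split_value = split_value_at_percentile(attribute_a, 10)
```

With this illustrative data, split_value is 1, which matches a splits table row with percentile = 10 and value = 1.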

The following table describes the output response table. The name of the output response table is specified by the SaveFinalResponseTableTo argument.

Single_Tree_Drive Output Response Table Schema

| Column | Data Type | Description |
|---|---|---|
| node_id | INTEGER | Node identifier. |
| pid | Any | Data point identifier. |
| response | NUMERIC, INTEGER, BIGINT, or DOUBLE PRECISION | Response value for the data point. |
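
One possible use of the output response table is to summarize the data points per node. The Python sketch below groups (node_id, pid, response) rows by node and reports each node's size and most frequent response. It assumes that node_id names the node to which the data point was assigned, which the schema does not state explicitly. The function name summarize_by_node and the result keys are hypothetical.

```python
from collections import Counter, defaultdict


def summarize_by_node(response_rows):
    """Group (node_id, pid, response) rows by node_id.

    Returns a dict that maps each node_id to its row count, its most
    frequent response, and the number of rows with that response.
    """
    responses_by_node = defaultdict(list)
    for node_id, _pid, response in response_rows:
        responses_by_node[node_id].append(response)

    summary = {}
    for node_id, responses in responses_by_node.items():
        # most_common(1) yields the (response, count) pair with the highest count.
        label, votes = Counter(responses).most_common(1)[0]
        summary[node_id] = {"size": len(responses), "label": label, "majorvotes": votes}
    return summary
```

If the assumption holds, the size and majorvotes values computed this way would be comparable to the node_size and node_majorvotes columns of the model table for the same node_id.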