1.1 - 8.10 - DecisionTree Input - Teradata Vantage

Teradata Vantage™ - Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 1.1, 8.10
Release Date: October 2019
Content Type: Programming Reference
Publication ID: B700-4003-079K
Language: English (United States)

Single decision trees support millions of attributes. Because the database cannot have millions of columns, you must spread the attributes across rows in the form of key-value pairs, where key is the name of the attribute and value is the value of the attribute. The Unpivoting function is useful for this purpose (see Unpivoting Example: Specified Target Columns, Default Optional Values).
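
To make the key-value layout concrete, the following is a minimal Python sketch of the reshaping described above. It is not the Unpivoting function itself; the helper name `unpivot` and the sample columns `pid`, `age`, and `income` are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, Iterator, List, Tuple


def unpivot(
    rows: List[Dict[str, Any]],
    id_column: str,
    target_columns: List[str],
) -> Iterator[Tuple[Any, str, Any]]:
    """Yield one (id, attribute_name, attribute_value) triple per attribute."""
    for row in rows:
        for name in target_columns:
            yield (row[id_column], name, row[name])


# Hypothetical wide-format data: one column per attribute.
wide_rows = [
    {"pid": 1, "age": 34, "income": 52000.0},
    {"pid": 2, "age": 51, "income": 87500.0},
]

# Long format: one row per (data point, attribute) pair.
long_rows = list(unpivot(wide_rows, "pid", ["age", "income"]))
for triple in long_rows:
    print(triple)
```

Each wide row with n attributes becomes n rows, so the number of attributes is limited by row count rather than by column count.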

| Table | Description |
|---|---|
| InputTable | [Required if you omit AttributeTable and ResponseTable, disallowed otherwise.] Contains attribute names and values and response values. |
| AttributeTable | [Required if you omit InputTable, disallowed otherwise.] Contains attribute names and values. |
| ResponseTable | [Required if you omit InputTable, disallowed otherwise.] Contains response values. |
| SplitsTable | [Optional] Contains user-specified splits. Every attribute in AttributeTable must appear in this table. |
| CategoricalAttributeTable | [Optional] Contains categorical attribute names. Every categorical attribute in AttributeTable must appear in this table. |

The function ignores input rows with NULL values.
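
The required/disallowed rules for InputTable, AttributeTable, and ResponseTable amount to an either-or choice. A minimal sketch of that check in Python, assuming a hypothetical helper `check_table_arguments` that receives table names (or `None` when a table is omitted):

```python
from typing import Optional


def check_table_arguments(
    input_table: Optional[str] = None,
    attribute_table: Optional[str] = None,
    response_table: Optional[str] = None,
) -> str:
    """Return which input form is in use, or raise ValueError if the mix is invalid."""
    if input_table is not None:
        # InputTable excludes both AttributeTable and ResponseTable.
        if attribute_table is not None or response_table is not None:
            raise ValueError(
                "InputTable cannot be combined with AttributeTable or ResponseTable"
            )
        return "InputTable"
    # Without InputTable, both AttributeTable and ResponseTable are required.
    if attribute_table is None or response_table is None:
        raise ValueError(
            "Specify either InputTable, or both AttributeTable and ResponseTable"
        )
    return "AttributeTable+ResponseTable"


print(check_table_arguments(input_table="training_data"))
print(check_table_arguments(attribute_table="attrs", response_table="resp"))
```

The table names passed here are placeholders; the function only inspects which arguments are present.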

InputTable Schema

| Column | Data Type | Description |
|---|---|---|
| id_column | Any | Data point identifier. Cannot be NULL. |
| attribute_name_column | VARCHAR | Attribute name. Cannot be NULL. |
| attribute_value_column | Numeric attribute: NUMERIC, INTEGER, BIGINT, or DOUBLE PRECISION. Categorical attribute: Any. | Attribute value. If NULL, function estimates value by arithmetic means on an attribute basis. If estimate is out of range, function cannot use it to partition training data, so it is useless. |
| response_column | NUMERIC, INTEGER, BIGINT, or DOUBLE PRECISION | Response value for data point. Can be NULL. |
| weight_column | DOUBLE PRECISION | [Column appears only with Weighted ('true').] Weight of data point. Cannot be NULL. |
| actual_label | VARCHAR | [Optional] Actual label of data point. |
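
The description of attribute_value_column says that a NULL value is estimated by arithmetic means on an attribute basis. The function's internal procedure is not specified here, so the following Python sketch shows only one plausible reading for numeric attributes: replace each missing value with the mean of the non-NULL values of the same attribute. The helper name `impute_attribute_means` and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Any, List, Optional, Tuple

Row = Tuple[Any, str, Optional[float]]  # (id, attribute_name, attribute_value)


def impute_attribute_means(rows: List[Row]) -> List[Row]:
    """Fill None values with the per-attribute mean of the non-None values."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _, name, value in rows:
        if value is not None:
            totals[name] += value
            counts[name] += 1

    filled = []
    for row_id, name, value in rows:
        # Assumption: an attribute with no observed values stays None.
        if value is None and counts[name] > 0:
            value = totals[name] / counts[name]
        filled.append((row_id, name, value))
    return filled


rows = [
    (1, "age", 30.0),
    (2, "age", None),
    (3, "age", 50.0),
    (1, "income", None),
    (2, "income", 100.0),
]
filled = impute_attribute_means(rows)
for row in filled:
    print(row)
```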

AttributeTable Schema

| Column | Data Type | Description |
|---|---|---|
| id_column | Any | Data point identifier. Cannot be NULL. |
| attribute_name_column | VARCHAR | Attribute name. Cannot be NULL. |
| attribute_value_column | Numeric attribute: NUMERIC, INTEGER, BIGINT, or DOUBLE PRECISION. Categorical attribute: Any. | Attribute value. If NULL, function estimates value by arithmetic means on an attribute basis. If estimate is out of range, function cannot use it to partition training data, so it is useless. |
| actual_label | VARCHAR | [Optional] Actual label of data point. |

ResponseTable Schema

| Column | Data Type | Description |
|---|---|---|
| id_column | Any | Data point identifier. Cannot be NULL. |
| response_column | NUMERIC, INTEGER, BIGINT, or DOUBLE PRECISION | Response value for data point. Can be NULL. |
| weight_column | DOUBLE PRECISION | [Column appears only with Weighted ('true').] Weight of data point. Cannot be NULL. |

The response table must not have a column named node_id.
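
The ResponseTable constraints can be checked before the function is called. The following Python sketch is one way to do that on rows already fetched into memory; the helper name `validate_response_table`, the sample column names, and the case-insensitive comparison of `node_id` are assumptions, not part of the function's documented behavior.

```python
from typing import Any, Dict, List, Optional, Sequence


def validate_response_table(
    column_names: Sequence[str],
    rows: List[Dict[str, Any]],
    id_column: str,
    weight_column: Optional[str] = None,
) -> List[str]:
    """Return a list of constraint violations (empty if none were found)."""
    problems = []
    # Assumption: column names are compared case-insensitively.
    if any(name.lower() == "node_id" for name in column_names):
        problems.append("response table must not have a column named node_id")
    for index, row in enumerate(rows):
        if row.get(id_column) is None:
            problems.append(f"row {index}: {id_column} is NULL")
        # weight_column is checked only when weighting is in use.
        if weight_column is not None and row.get(weight_column) is None:
            problems.append(f"row {index}: {weight_column} is NULL")
    return problems


columns = ["pid", "response", "weight"]
response_rows = [
    {"pid": 1, "response": 0, "weight": 1.0},
    {"pid": 2, "response": None, "weight": None},  # NULL response is allowed
]
problems = validate_response_table(columns, response_rows, "pid", "weight")
print(problems)
```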

SplitsTable Schema

| Column | Data Type | Description |
|---|---|---|
| attribute_name_column | VARCHAR | Attribute name. Cannot be NULL. Every attribute in AttributeTable must appear in this table. |
| split_id | INTEGER | Split identifier. Cannot be NULL. |
| splits_valcol | NUMERIC, INTEGER, BIGINT, or DOUBLE PRECISION | Split value. Cannot be NULL. |
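
The requirement that every attribute in AttributeTable also appears in SplitsTable is a set-coverage condition. A minimal Python sketch, assuming rows are available as tuples in the column order of the schemas above; the helper name `missing_from_splits` and the sample values are hypothetical:

```python
from typing import Any, List, Tuple

AttributeRow = Tuple[Any, str, Any]   # (id, attribute_name, attribute_value)
SplitRow = Tuple[str, int, float]     # (attribute_name, split_id, split_value)


def missing_from_splits(
    attribute_rows: List[AttributeRow],
    splits_rows: List[SplitRow],
) -> List[str]:
    """Return attribute names used in AttributeTable but absent from SplitsTable."""
    attribute_names = {name for _, name, _ in attribute_rows}
    split_names = {name for name, _, _ in splits_rows}
    return sorted(attribute_names - split_names)


attribute_rows = [(1, "age", 34), (1, "income", 52000.0), (2, "age", 51)]
splits_rows = [("age", 1, 30.0), ("age", 2, 45.0)]

missing = missing_from_splits(attribute_rows, splits_rows)
print(missing)
```

An empty result means the splits table covers every attribute. The same set difference applies to the categorical attribute table, restricted to the categorical attribute names.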

CategoricalAttributeTable Schema

| Column | Data Type | Description |
|---|---|---|
| categorical_attribute_name_column | VARCHAR | Categorical attribute name. Every categorical attribute in AttributeTable must appear in this table. |