Single decision trees support millions of attributes. Because the database cannot have millions of columns, you must spread the attributes across rows as key-value pairs, where the key is the attribute name and the value is the attribute value. The Unpivot function is useful for this purpose (see Example 1).
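The key-value layout described above can be illustrated with a minimal Python sketch. This is not the Unpivot function itself; the helper name `unpivot`, the sample column names (`age`, `income`), and the output keys (`id`, `attribute`, `value`) are assumptions chosen only for illustration.

```python
def unpivot(rows, id_key, attribute_keys):
    """Turn wide rows (one column per attribute) into key-value rows.

    Each output row holds one (id, attribute name, attribute value) triple.
    Hypothetical helper for illustration only.
    """
    long_rows = []
    for row in rows:
        for name in attribute_keys:
            long_rows.append(
                {"id": row[id_key], "attribute": name, "value": row[name]}
            )
    return long_rows


# Two data points with two attributes each (made-up sample data).
wide_rows = [
    {"id": 1, "age": 34, "income": 52000.0},
    {"id": 2, "age": 51, "income": None},
]

# Four key-value rows result: one per (data point, attribute) pair.
long_rows = unpivot(wide_rows, "id", ["age", "income"])
```

Each wide row produces one output row per attribute, so the number of columns stays fixed no matter how many attributes there are.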
The Single_Tree_Drive function requires either an input table or both an attribute table and a response table. The function has two optional input tables, the splits table and the categorical splits table.

The input table has the following schema:

Column Name | Data Type | Description |
---|---|---|
id_column | Any | Data point identifier. Cannot be NULL. |
attribute_column | VARCHAR | Attribute name. Cannot be NULL. Every attribute in the attribute table must be given a non-empty partition in the splits table. |
node_column | Numeric attribute: NUMERIC, INTEGER, BIGINT, or DOUBLE PRECISION. Categorical attribute: Any. | Attribute value. Can be NULL, in which case the function estimates the value as the arithmetic mean for that attribute. If the value is out of range, the function cannot use it to partition the training data, so the value is of no use. |
response_column | NUMERIC, INTEGER, BIGINT, or DOUBLE PRECISION | Response value for the data point. Can be NULL. |
weight_column | DOUBLE PRECISION | Weight of the data point. Cannot be NULL. This column appears only if the decision tree is weighted. |
actual_label | VARCHAR | Actual label of data point. |
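
The table above states that a NULL attribute value is estimated by arithmetic means on an attribute basis. A minimal Python sketch of that idea follows. It assumes numeric attributes, rows shaped as `id`/`attribute`/`value` dictionaries, and a hypothetical helper name `impute_attribute_means`; how the function itself treats an attribute with no observed values is not stated in the source, so the sketch simply leaves such values as `None`.

```python
from collections import defaultdict


def impute_attribute_means(rows):
    """Replace missing (None) values with the mean of the same attribute.

    Hypothetical helper for illustration; not the function's internal code.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)

    # First pass: accumulate sum and count of observed values per attribute.
    for row in rows:
        if row["value"] is not None:
            totals[row["attribute"]] += row["value"]
            counts[row["attribute"]] += 1

    # Second pass: fill each missing value with its attribute's mean.
    filled = []
    for row in rows:
        value = row["value"]
        if value is None and counts[row["attribute"]] > 0:
            value = totals[row["attribute"]] / counts[row["attribute"]]
        filled.append({**row, "value": value})
    return filled


# Made-up sample data in key-value form.
rows = [
    {"id": 1, "attribute": "age", "value": 30.0},
    {"id": 2, "attribute": "age", "value": None},
    {"id": 3, "attribute": "age", "value": 50.0},
    {"id": 1, "attribute": "income", "value": None},
]
filled_rows = impute_attribute_means(rows)
```

In this sample, the missing `age` value becomes 40.0, the mean of 30.0 and 50.0, while the `income` value stays missing because no observed value exists for that attribute.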

The attribute table has the following schema:

Column Name | Data Type | Description |
---|---|---|
id_column | Any | Data point identifier. Cannot be NULL. |
attribute_column | VARCHAR | Attribute name. Cannot be NULL. Every attribute in the attribute table must be given a non-empty partition in the splits table. |
node_column | Numeric attribute: NUMERIC, INTEGER, BIGINT, or DOUBLE PRECISION. Categorical attribute: Any. | Attribute value. Can be NULL, in which case the function estimates the value as the arithmetic mean for that attribute. If the value is out of range, the function cannot use it to partition the training data, so the value is of no use. |
actual_label | VARCHAR | Actual label of data point. |

The response table has the following schema:

Column Name | Data Type | Description |
---|---|---|
id_column | Any | Data point identifier. Cannot be NULL. |
response_column | NUMERIC, INTEGER, BIGINT, or DOUBLE PRECISION | Response value for the data point. Can be NULL. |
weight_column | DOUBLE PRECISION | Weight of the data point. Cannot be NULL. This column appears only if the decision tree is weighted. |

The splits table has the following schema:

Column Name | Data Type | Description |
---|---|---|
attribute_column | VARCHAR | Attribute name. Cannot be NULL. Every attribute in the attribute table must be given a non-empty partition in the splits table. |
split_id | INTEGER | Split identifier. Cannot be NULL. |
splits_valcol | NUMERIC, INTEGER, BIGINT, or DOUBLE PRECISION | Split value. Cannot be NULL. |
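
The requirement that every attribute have a non-empty partition in the splits table can be checked before the function is called. The Python sketch below reads "non-empty partition" as "at least one row for that attribute in the splits table", which is an assumption; the helper name `attributes_missing_splits` and the dictionary row shape are also hypothetical.

```python
def attributes_missing_splits(attribute_rows, split_rows):
    """Return attribute names that have no row in the splits table.

    Assumes each row is a dict with an "attribute" key (illustrative shape).
    """
    attributes = {row["attribute"] for row in attribute_rows}
    with_splits = {row["attribute"] for row in split_rows}
    return sorted(attributes - with_splits)


# Made-up sample data: "income" has no split defined.
attribute_rows = [
    {"id": 1, "attribute": "age", "value": 34},
    {"id": 1, "attribute": "income", "value": 52000.0},
]
split_rows = [
    {"attribute": "age", "split_id": 1, "value": 40},
]
missing = attributes_missing_splits(attribute_rows, split_rows)
```

An empty result means every attribute in the attribute table appears at least once in the splits table.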

The categorical splits table has the following schema:

Column Name | Data Type | Description |
---|---|---|
attribute | VARCHAR | Categorical attribute name. |