1.1 - 8.10 - GMMPredict Output - Teradata Vantage

Teradata Vantage™ - Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 1.1, 8.10
Release Date: October 2019
Content Type: Programming Reference
Publication ID: B700-4003-079K
Language: English (United States)

The OutputTable schema depends on the OutputFormat syntax element.

OutputTable Schema, OutputFormat ('sparse') (Default)

The table has D+3 columns, where D is the number of dimensions of the input data.

| Column | Data Type | Description |
| --- | --- | --- |
| accumulate_column | Same as in input table | [Column appears once for each specified accumulate_column.] Column copied from InputTable. Typically, one accumulate_column contains unique data point identifier. |
| data_point | Any numeric SQL data type | [Column appears D times.] Input data point, copied from InputTable. |
| cluster_rank | INTEGER | Cluster rank, from most probable to least probable. Most probable cluster is identified as "1", next most probable as "2", and so on. |
| cluster_id | INTEGER | Cluster identification number. |
| prob | DOUBLE PRECISION | Probability that the data point belongs to the cluster. |
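
To make the sparse layout concrete, the following Python sketch builds the rows that one data point would contribute: one row per cluster, ordered by cluster_rank. It is an illustration of the schema only, not the function's implementation. The helper name `sparse_rows`, the single accumulate column, and the probability values are all hypothetical.

```python
def sparse_rows(point_id, point, cluster_probs):
    """Build sparse-format rows for one data point (illustrative only).

    point_id      -- value of a single, hypothetical accumulate column
    point         -- list of D numeric coordinates (the data_point columns)
    cluster_probs -- dict mapping cluster_id -> probability (made-up values)
    """
    # Order clusters from most probable to least probable.
    ranked = sorted(cluster_probs.items(), key=lambda kv: kv[1], reverse=True)
    rows = []
    for cluster_rank, (cluster_id, prob) in enumerate(ranked, start=1):
        # Accumulate column, D data_point columns, then the three fixed
        # columns: cluster_rank, cluster_id, prob.
        rows.append((point_id, *point, cluster_rank, cluster_id, prob))
    return rows


rows = sparse_rows(7, [1.5, 2.0], {0: 0.2, 1: 0.7, 2: 0.1})
for row in rows:
    print(row)
```

With D = 2 and one accumulate column, each row holds 1 + D + 3 = 6 values, and the row with cluster_rank 1 carries the most probable cluster.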

OutputTable Schema, OutputFormat ('dense')

The table has D+2n columns, where D is the number of dimensions of the input data and n is the number of cluster weights that the function outputs (the value of the TopK syntax element).

| Column | Data Type | Description |
| --- | --- | --- |
| accumulate_column | Same as in input table | [Column appears once for each specified accumulate_column.] Column copied from InputTable. Typically, one accumulate_column contains unique data point identifier. |
| id | Any | Data point identifier. |
| data_point | Any numeric SQL data type | [Column appears D times.] Input data point, copied from InputTable. |
| cluster_id_i | INTEGER | [Column appears n times.] cluster_id_1 is most probable cluster for observation, cluster_id_2 is next most probable, and so on. |
| prob_i | DOUBLE PRECISION | [Column appears n times.] Probability that observation belongs to cluster_id_i. |
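
The relationship between the two formats can be sketched in Python: the dense row for a data point is the pivot of its sparse rows, keeping the n most probable clusters. This is a minimal sketch under stated assumptions, not the function's implementation. The helper name `dense_row`, the `data_point_1`, `data_point_2` column names, the grouping of all cluster_id_i columns before all prob_i columns, and the sample values are hypothetical.

```python
def dense_row(sparse, n):
    """Pivot sparse-style rows for ONE data point into one dense-style row.

    sparse -- non-empty list of dicts with keys 'id', 'data_point',
              'cluster_rank', 'cluster_id', 'prob' (illustrative layout)
    n      -- number of most probable clusters to keep
    """
    if not sparse:
        raise ValueError("need at least one sparse row")
    # Keep the n best-ranked clusters (rank 1 is the most probable).
    top = sorted(sparse, key=lambda r: r["cluster_rank"])[:n]
    row = {"id": top[0]["id"]}
    # D data_point columns; the numbered names are an assumption.
    for d, value in enumerate(top[0]["data_point"], start=1):
        row[f"data_point_{d}"] = value
    # n cluster_id_i columns followed by n prob_i columns (assumed order).
    for i, r in enumerate(top, start=1):
        row[f"cluster_id_{i}"] = r["cluster_id"]
    for i, r in enumerate(top, start=1):
        row[f"prob_{i}"] = r["prob"]
    return row


# Made-up sparse rows for a single two-dimensional data point.
sparse = [
    {"id": 7, "data_point": [1.5, 2.0], "cluster_rank": 2, "cluster_id": 0, "prob": 0.2},
    {"id": 7, "data_point": [1.5, 2.0], "cluster_rank": 1, "cluster_id": 1, "prob": 0.7},
    {"id": 7, "data_point": [1.5, 2.0], "cluster_rank": 3, "cluster_id": 2, "prob": 0.1},
]
dense = dense_row(sparse, n=2)
print(dense)
```

Here D = 2 and n = 2, so the row has D + 2n = 6 data and cluster columns in addition to the identifier.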