The GMMPredict function has one output table, whose format depends on the OutputFormat argument.
The following table describes the output table for OutputFormat('sparse'), the default. The table has D+3 columns, where D is the number of dimensions of the input data.
Column Name | Data Type | Description |
---|---|---|
accumulate_column | Same as in input table | Column copied from the testdata table. Typically, one accumulate_column contains the unique identifier of a data point. |
data_point | NUMERIC | Input data point. The table has D such columns, which are copied from input_table to the output table. Their names are the same in input_table and the output table. |
cluster_rank | INTEGER | Rank of the cluster for the data point, ordered by the probability that the data point belongs to the cluster. |
cluster_id | INTEGER | Identification number of the cluster. |
prob | DOUBLE PRECISION | Probability that the data point belongs to the cluster identified by cluster_id. |
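
As a rough illustration of how sparse-format rows might be consumed, the following Python sketch picks the most probable cluster for each data point. It assumes the sparse table holds one row per (data point, cluster) pair. The sample rows, the accumulate column name `id`, the data columns `x1` and `x2`, and the helper `most_likely_cluster` are all hypothetical and are not produced by GMMPredict itself.

```python
# Hypothetical sparse-format rows (assumption: one row per data point and cluster).
# "id" stands in for an accumulate_column; x1 and x2 are the D = 2 data_point columns.
sparse_rows = [
    {"id": 1, "x1": 0.5, "x2": 1.2, "cluster_rank": 1, "cluster_id": 0, "prob": 0.9},
    {"id": 1, "x1": 0.5, "x2": 1.2, "cluster_rank": 2, "cluster_id": 1, "prob": 0.1},
    {"id": 2, "x1": 3.1, "x2": 0.4, "cluster_rank": 1, "cluster_id": 1, "prob": 0.7},
    {"id": 2, "x1": 3.1, "x2": 0.4, "cluster_rank": 2, "cluster_id": 0, "prob": 0.3},
]


def most_likely_cluster(rows):
    """Return {id: cluster_id}, keeping the row with the largest prob for each id."""
    best = {}
    for row in rows:
        current = best.get(row["id"])
        if current is None or row["prob"] > current["prob"]:
            best[row["id"]] = row
    return {point_id: row["cluster_id"] for point_id, row in best.items()}


assignments = most_likely_cluster(sparse_rows)
print(assignments)  # {1: 0, 2: 1}
```

The sketch compares `prob` values directly rather than relying on `cluster_rank`, so it does not depend on how ranks are numbered.
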
The following table describes the output table for OutputFormat('dense'). The table has D+2n columns, where D is the number of dimensions of the input data and n is the number of cluster weights that the function outputs (the value of the TopNClusters argument).
Column Name | Data Type | Description |
---|---|---|
accumulate_column | Same as in input table | Column copied from the testdata table. Typically, one accumulate_column contains the unique identifier of a data point. |
id | Any | Identification of the data point |
data_point | NUMERIC | Input data point. The table has D such columns, which are copied from input_table to the output table. Their names are the same in input_table and the output table. |
cluster_rank_i | INTEGER | Rank of the cluster named in cluster_id_i for the data point, ordered by the probability that the data point belongs to the cluster. |
cluster_id_i | INTEGER | Identification number of the cluster with the i-th greatest probability for the data point. The table has n such columns (cluster_id_1, ..., cluster_id_n). |
prob_i | DOUBLE PRECISION | Probability that the data point belongs to the cluster named in cluster_id_i. The table has n such columns (prob_1, ..., prob_n). For each i, the column prob_i immediately follows the column cluster_id_i. |
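
To make the relationship between the two layouts concrete, the following Python sketch pivots hypothetical sparse-format rows into a dense-style layout with one row per data point. It is an illustration only, not how GMMPredict builds its output. It assumes that cluster_id_1 holds the most probable cluster, and it emits only the cluster_id_i and prob_i pairs, leaving out cluster_rank_i for brevity. The sample rows, the column names `id`, `x1`, and `x2`, and the helper `sparse_to_dense` are hypothetical.

```python
from collections import defaultdict

# Hypothetical sparse-format rows (assumption: one row per data point and cluster).
sparse_rows = [
    {"id": 1, "x1": 0.5, "x2": 1.2, "cluster_rank": 1, "cluster_id": 0, "prob": 0.9},
    {"id": 1, "x1": 0.5, "x2": 1.2, "cluster_rank": 2, "cluster_id": 1, "prob": 0.1},
    {"id": 2, "x1": 3.1, "x2": 0.4, "cluster_rank": 1, "cluster_id": 1, "prob": 0.7},
    {"id": 2, "x1": 3.1, "x2": 0.4, "cluster_rank": 2, "cluster_id": 0, "prob": 0.3},
]

PER_CLUSTER_COLUMNS = ("cluster_rank", "cluster_id", "prob")


def sparse_to_dense(rows, key="id", n=2):
    """Build one dense-style row per data point with n (cluster_id_i, prob_i) pairs."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key]].append(row)

    dense = []
    for group in grouped.values():
        # Assumption: position 1 is the cluster with the greatest probability.
        ordered = sorted(group, key=lambda r: r["prob"], reverse=True)
        # Copy the accumulate and data_point columns once per data point.
        out = {k: v for k, v in ordered[0].items() if k not in PER_CLUSTER_COLUMNS}
        for i, r in enumerate(ordered[:n], start=1):
            out[f"cluster_id_{i}"] = r["cluster_id"]
            out[f"prob_{i}"] = r["prob"]  # prob_i immediately follows cluster_id_i
        dense.append(out)
    return dense


dense_rows = sparse_to_dense(sparse_rows, n=2)
for row in dense_rows:
    print(row)
```

With D = 2 and n = 2, each resulting row has the `id` column, the two data columns, and 2n = 4 cluster columns.
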