5.4.5 - Clustering Solution - Teradata Warehouse Miner

Teradata Warehouse Miner User Guide - Volume 3 Analytic Functions

Product: Teradata Warehouse Miner
Release Number: 5.4.5
Published: February 2018
Language: English (United States)
Last Update: 2018-05-04
  • Col — The position of this column in the order in which the input columns were requested.
  • Table_Name — The name of the table associated with this input column.
  • Column_Name — The name of the input column used in performing the cluster analysis.
  • Cluster_Id — The cluster number that this data applies to, from 1 to the number of clusters requested.
  • Weight — The “prior probability” that an observation belongs to this cluster, based on the percentage of observations belonging to this cluster at this stage.
  • Mean — When the Gaussian Mixture Model algorithm is selected, Mean is the weighted average of this column or variable amongst all the observations, where the weight used is the probability of inclusion in this cluster. When the K-Means algorithm is selected, Mean is the average value of this column or variable amongst the observations assigned to this cluster at this iteration of the algorithm.
  • Variance — When the Gaussian Mixture Model algorithm is selected, Variance is the weighted variance of this variable amongst all the observations, where the weight used is the probability of inclusion in this cluster. When the K-Means algorithm is selected, Variance is the variance of this variable amongst the observations assigned to this cluster at this iteration. (Variance is the square of a variable’s standard deviation and measures how widely its value varies from one observation to the next.)
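
The Weight, Mean, and Variance definitions above can be illustrated with a small Python sketch. This is not Teradata Warehouse Miner code: the function name `cluster_stats`, the membership matrices, and the sample values are all hypothetical. It also assumes a population-style variance (dividing by the total membership weight); the guide does not state which divisor the product uses. K-Means is treated here as the special case in which every membership probability is 0 or 1.

```python
def cluster_stats(values, memberships):
    """Return a (weight, mean, variance) tuple per cluster for one column.

    values      -- one number per observation (a single input column)
    memberships -- memberships[i][k] is the probability that observation i
                   belongs to cluster k; use 0/1 entries for K-Means
    Assumption: variance is divided by the total membership weight.
    """
    n = len(values)
    num_clusters = len(memberships[0])
    stats = []
    for k in range(num_clusters):
        w = [row[k] for row in memberships]
        total = sum(w)
        weight = total / n  # share of observations attributed to cluster k
        if total == 0:
            # Empty cluster: mean and variance are undefined.
            stats.append((weight, None, None))
            continue
        mean = sum(wi * x for wi, x in zip(w, values)) / total
        variance = sum(wi * (x - mean) ** 2 for wi, x in zip(w, values)) / total
        stats.append((weight, mean, variance))
    return stats


values = [1.0, 3.0, 10.0, 14.0]

# K-Means: each observation is assigned to exactly one cluster.
hard = [[1, 0], [1, 0], [0, 1], [0, 1]]
kmeans_stats = cluster_stats(values, hard)

# Gaussian Mixture Model: each observation has a probability per cluster.
soft = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.0, 1.0]]
gmm_stats = cluster_stats(values, soft)

for cluster_id, (weight, mean, variance) in enumerate(gmm_stats, start=1):
    print(cluster_id, weight, mean, variance)
```

The list index plus one plays the role of Cluster_Id, which the report numbers from 1. With the hard assignments, each cluster's Mean is the plain average of its own observations; with the soft assignments, every observation contributes to every cluster in proportion to its membership probability.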