Clustering Solution - Teradata Warehouse Miner

Teradata Warehouse Miner User Guide - Volume 3 Analytic Functions

Product

Teradata Warehouse Miner

Release Number

5.4.5

Published

February 2018

Language

English (United States)

Last Update

2018-05-04

dita:id

B035-2302

Product Category

Software

Col — The position of this column in the order in which the input columns were requested.
Table_Name — The name of the table associated with this input column.
Column_Name — The name of the input column used in performing the cluster analysis.
Cluster_Id — The cluster number that this data applies to, from 1 to the number of clusters requested.
Weight — The “prior probability” that an observation belongs to this cluster, based on the percentage of observations belonging to this cluster at this stage.
Mean — When the Gaussian Mixture Model algorithm is selected, Mean is the weighted average of this column or variable amongst all the observations, where the weight used is the probability of inclusion in this cluster. When the K-Means algorithm is selected, Mean is the average value of this column or variable amongst the observations assigned to this cluster at this iteration of the algorithm.
Variance — When the Gaussian Mixture Model algorithm is selected, Variance is the weighted variance of this variable amongst all the observations, where the weight used is the probability of inclusion in this cluster. When the K-Means algorithm is selected, Variance is the variance of this variable amongst the observations assigned to this cluster at this iteration. (Variance is the square of a variable’s standard deviation and measures how widely its values spread around the mean.)
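
The difference between the two algorithms in the Weight, Mean and Variance columns can be illustrated with a minimal Python sketch. It is not the product's implementation: the function names `kmeans_stats` and `gmm_stats` are hypothetical, the variance uses the population form (dividing by the total weight rather than by one less), and the Gaussian Mixture weight is taken as the average membership probability. All three are assumptions made for illustration.

```python
def kmeans_stats(values, assignments, cluster_id):
    """Weight, mean and variance of one column for one cluster (K-Means).

    Only observations assigned to cluster_id contribute to the statistics.
    """
    members = [v for v, a in zip(values, assignments) if a == cluster_id]
    if not members:
        raise ValueError("no observations assigned to this cluster")
    n = len(members)
    weight = n / len(values)  # share of observations in the cluster
    mean = sum(members) / n
    # Population variance (assumption: the guide does not state the divisor).
    variance = sum((v - mean) ** 2 for v in members) / n
    return weight, mean, variance


def gmm_stats(values, probs):
    """Weight, mean and variance of one column for one cluster (Gaussian Mixture).

    probs[i] is the probability that observation i belongs to the cluster.
    Every observation contributes, weighted by that probability.
    """
    total = sum(probs)
    if total == 0:
        raise ValueError("total membership probability is zero")
    weight = total / len(values)  # average membership probability (assumption)
    mean = sum(p * v for p, v in zip(probs, values)) / total
    variance = sum(p * (v - mean) ** 2 for p, v in zip(probs, values)) / total
    return weight, mean, variance


values = [1.0, 2.0, 3.0, 10.0]
# Hard assignment: the first three observations belong to cluster 1.
print(kmeans_stats(values, [1, 1, 1, 2], cluster_id=1))
# Soft assignment with probabilities of exactly 0 or 1 gives the same result.
print(gmm_stats(values, [1.0, 1.0, 1.0, 0.0]))
```

When every membership probability is exactly 0 or 1, the weighted statistics reduce to the K-Means ones, so both calls give a weight of 0.75, a mean of 2.0 and a variance of two thirds. With fractional probabilities, the fourth observation would pull the Gaussian Mixture mean and variance toward 10.0 in proportion to its probability.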