7.00.02 - Output - Aster Analytics

Teradata Aster® Analytics Foundation User Guide Update 2

Product: Aster Analytics
Release Number: 7.00.02
Published: September 2017
Content Type: Programming Reference, User Guide
Publication ID: B700-1022-700K
Language: English (United States)
Last Update: 2018-04-17

The KMeans function has two required outputs and one optional output. The required outputs are the results message (output to the screen) and the table of cluster centroids (specified by the OutputTable argument). The optional output is a table of the clusters themselves (specified by the ClusteredOutput argument).

The results message begins with one row of information about each cluster, followed by summary messages. The following two tables describe both parts.

KMeans Results Message Table Schema
Column Name | Data Type | Description
clusterid | INTEGER | Contains the cluster identifiers of the centroids.
feature_set | VARCHAR | The column name is the concatenation of the feature names. For example, if the feature names are 'p1', 'p2', and 'p3', then the column name is 'p1 p2 p3'. The column contains the concatenation of the means in the centroid. For example, means 3, 5, and 6 are represented as '3 5 6'. The UnpackColumns argument does not affect this column.
size | INTEGER | Contains the number of points in the cluster.
withinss | INTEGER | Contains the within-cluster sum of squares: the sum of squared differences of each point from its cluster centroid.

KMeans Results Messages
Label | Value
Converged | 'True' if the algorithm converged, 'False' otherwise.
Number of iterations | Number of iterations that the algorithm performed.
Number of clusters | Number of clusters.
Output table | Name of the output table specified by the OutputTable argument.
Total_WithinSS | Sum of the withinss values in the preceding table.
Between_SS | Between sum of squares: the sum, over all centroids, of the squared distance from the centroid to the global mean, multiplied by the number of data points that the centroid represents.
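
As an illustration of how the withinss, Total_WithinSS, and Between_SS values relate to one another, the following Python sketch recomputes them from a small set of made-up points. It is not the KMeans function's implementation. The helper names (`mean`, `squared_distance`, `sums_of_squares`) and the sample data are assumptions for illustration only, and the sketch takes each centroid to be the mean of the points assigned to it.

```python
from typing import Dict, List, Sequence, Tuple

Point = Sequence[float]


def mean(points: List[Point]) -> Tuple[float, ...]:
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sums_of_squares(clusters: Dict[int, List[Point]]):
    """Return (withinss per cluster, Total_WithinSS, Between_SS)."""
    all_points = [p for pts in clusters.values() for p in pts]
    global_mean = mean(all_points)
    withinss = {}
    between_ss = 0.0
    for cid, pts in clusters.items():
        centroid = mean(pts)  # assumption: centroid is the cluster mean
        withinss[cid] = sum(squared_distance(p, centroid) for p in pts)
        # Weight each centroid's distance by the number of points it represents.
        between_ss += len(pts) * squared_distance(centroid, global_mean)
    return withinss, sum(withinss.values()), between_ss


# Hypothetical sample data: two clusters of two 2-dimensional points each.
clusters = {0: [(1.0, 1.0), (3.0, 1.0)], 1: [(9.0, 5.0), (11.0, 5.0)]}
withinss, total_withinss, between_ss = sums_of_squares(clusters)
```

For this sample, each cluster has a withinss of 2.0, Total_WithinSS is 4.0, and Between_SS is 80.0. Their sum, 84.0, equals the total sum of squared distances of the points from the global mean.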

The schema of the table of cluster centroids is affected by the UnpackColumns argument.

KMeans Output Table Schema, UnpackColumns('false') (Default)
Column Name | Data Type | Description
clusterid | INTEGER | Contains the cluster identifiers of the centroids.
feature_set | VARCHAR | The column name is the concatenation of the feature names. For example, if the feature names are 'p1', 'p2', and 'p3', then the column name is 'p1 p2 p3'. The column contains the concatenation of the means of the features in the centroid. For example, means 3, 5, and 6 are represented as '3 5 6'.
size | INTEGER | Contains the number of points in the cluster.
withinss | INTEGER | Contains the within-cluster sum of squares: the sum of squared differences of each point from its cluster centroid.

KMeans Output Table Schema, UnpackColumns('true')
Column Name | Data Type | Description
clusterid | INTEGER | Contains the cluster identifiers of the centroids.
feature_i | INTEGER or VARCHAR | Contains the mean of feature i in the centroid. The table has one such column for each feature.
size | INTEGER | Contains the number of points in the cluster.
withinss | INTEGER | Contains the within-cluster sum of squares: the sum of squared differences of each point from its cluster centroid.
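
To show how the packed and unpacked layouts correspond, the following Python sketch splits a packed feature_set column name and value into one entry per feature. It is only a client-side illustration, not part of the KMeans function. The function name `unpack_centroid` is hypothetical, and the sketch assumes that names and means are separated by single spaces, as in the examples above, and that feature names contain no spaces.

```python
from typing import Dict


def unpack_centroid(column_name: str, packed_means: str) -> Dict[str, float]:
    """Map each feature name to its mean, given the packed column name and value."""
    names = column_name.split()  # assumption: space-separated feature names
    values = [float(v) for v in packed_means.split()]
    if len(names) != len(values):
        raise ValueError("number of feature names and means differ")
    return dict(zip(names, values))


# Using the example from the schema description above.
row = unpack_centroid("p1 p2 p3", "3 5 6")
```

Here `row` maps 'p1' to 3.0, 'p2' to 5.0, and 'p3' to 6.0, which mirrors the one-column-per-feature layout of UnpackColumns('true').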

The following table describes the optional table of the clusters themselves.

KMeans Clustered Output Table Schema
Column Name | Data Type | Description
pointid | INTEGER | Contains the identifier of the user or item (from input_table).
centroidid | INTEGER | Contains the identifier of the centroid for pointid.
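
As a rough consistency check, the rows of the clustered output table can be counted per centroid and compared with the size column of the centroid table. The Python sketch below does this on made-up (pointid, centroidid) pairs. The function name `cluster_sizes` and the sample rows are assumptions for illustration, and the sketch assumes that every input point appears exactly once in the clustered output.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def cluster_sizes(rows: Iterable[Tuple[int, int]]) -> Dict[int, int]:
    """Count points per centroid from (pointid, centroidid) rows."""
    return dict(Counter(centroidid for _pointid, centroidid in rows))


# Hypothetical clustered output rows: (pointid, centroidid).
clustered_rows = [(101, 0), (102, 1), (103, 0), (104, 1), (105, 0)]
sizes = cluster_sizes(clustered_rows)
```

For these sample rows, centroid 0 has three points and centroid 1 has two, which is what the size column would be expected to report for the same clustering.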