KMeans Output - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 8.10, 1.1
Published: October 2019
Language: English (United States)
Last Update: 2019-12-31
| Output | Description |
|---|---|
| Results message | Contains information about each cluster. |
| OutputTable | Contains cluster centroids. The schema depends on the UnpackColumns syntax element. |
| ClusterAssignmentTable | [Optional] Contains the centroid assigned to each point. |

Results Message Schema

| Column | Data Type | Description |
|---|---|---|
| clusterid | INTEGER | Cluster identifier of centroid. |
| mean | VARCHAR | Concatenation of means in centroid. For example, means 3, 5, and 6 are represented as '3 5 6'. The UnpackColumns syntax element does not affect this column. |
| size | INTEGER | Number of points in cluster. |
| withinss | INTEGER | Within-cluster sum of squares: sum of squared differences of each point from its cluster centroid. |

After the rows described by the preceding schema, the results message contains the following information:

| Label | Value |
|---|---|
| Converged : | 'True' if the algorithm converged, 'False' otherwise. |
| Number of iterations : | Number of iterations the algorithm performed. |
| Number of clusters : | Number of clusters. |
| Successfully created Output table | |
| Successfully created Clustered Output table | [Appears only with ClusterAssignmentTable.] |
| Total_WithinSS : | Sum of the withinss values in the preceding table. |
| Between_SS : | Between sum of squares: the sum, over all centroids, of the squared distance from the centroid to the global mean, multiplied by the number of data points that the centroid represents. |
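The following Python sketch illustrates how the withinss, Total_WithinSS, and Between_SS quantities described above relate to each other. It is not the function's implementation. It assumes that "squared difference" means squared Euclidean distance and that each centroid is the mean of the points assigned to it. The function name `cluster_sums_of_squares` and the sample data are hypothetical.

```python
from collections import defaultdict

def cluster_sums_of_squares(points, assignments):
    """Return (withinss per cluster, Total_WithinSS, Between_SS).

    points: dict mapping pointid -> list of numeric feature values.
    assignments: dict mapping pointid -> centroidid.
    Assumption: each centroid is the mean of its assigned points.
    """
    members = defaultdict(list)
    for pointid, centroidid in assignments.items():
        members[centroidid].append(points[pointid])

    dims = len(next(iter(points.values())))
    n_total = len(points)
    # Global mean over all points, one value per feature.
    global_mean = [sum(p[d] for p in points.values()) / n_total
                   for d in range(dims)]

    withinss = {}
    between_ss = 0.0
    for centroidid, pts in members.items():
        centroid = [sum(p[d] for p in pts) / len(pts) for d in range(dims)]
        # Squared distance of every member point from its centroid.
        withinss[centroidid] = sum((p[d] - centroid[d]) ** 2
                                   for p in pts for d in range(dims))
        # Squared distance of the centroid from the global mean,
        # weighted by the number of points the centroid represents.
        between_ss += len(pts) * sum((centroid[d] - global_mean[d]) ** 2
                                     for d in range(dims))

    total_withinss = sum(withinss.values())
    return withinss, total_withinss, between_ss

# Hypothetical data: four 2-D points in two clusters.
points = {1: [1, 1], 2: [1, 3], 3: [5, 1], 4: [5, 3]}
assignments = {1: 0, 2: 0, 3: 1, 4: 1}
withinss, total_withinss, between_ss = cluster_sums_of_squares(points, assignments)
```

For this sample data, each cluster has a withinss of 2, so Total_WithinSS is 4, and Between_SS is 16. Their sum, 20, equals the total sum of squared distances of all points from the global mean.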

OutputTable Schema, UnpackColumns ('false') (Default)

| Column | Data Type | Description |
|---|---|---|
| clusterid | INTEGER | Cluster identifier of centroid. |
| mean | VARCHAR | Concatenation of means in centroid. For example, means 3, 5, and 6 are represented as '3 5 6'. |
| size | INTEGER | Number of points in cluster. |
| withinss | INTEGER | Within-cluster sum of squares: sum of squared differences of each point from its cluster centroid. |

OutputTable Schema, UnpackColumns ('true')

| Column | Data Type | Description |
|---|---|---|
| clusterid | INTEGER | Cluster identifier of centroid. |
| feature_i | INTEGER or VARCHAR | [Column appears once for each feature.] Mean for feature i (name of InputTable column). |
| size | INTEGER | Number of points in cluster. |
| withinss | INTEGER | Within-cluster sum of squares: sum of squared differences of each point from its cluster centroid. |
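The two OutputTable layouts carry the same centroid information in different shapes. The following Python sketch shows one way a client could turn a row in the packed layout into the unpacked layout. It assumes the space-separated format shown in the example '3 5 6' and assumes that the means appear in the same order as the supplied feature names. The function name `unpack_centroid_row`, the feature names, and the sample row are hypothetical.

```python
def unpack_centroid_row(row, feature_names):
    """Convert a packed OutputTable row into the unpacked layout.

    row: dict with keys clusterid, mean, size, withinss (packed layout).
    feature_names: hypothetical list of feature column names, assumed to
    be in the same order as the values inside the 'mean' string.
    """
    # Split the space-separated 'mean' string into numeric values.
    means = [float(value) for value in row["mean"].split()]
    if len(means) != len(feature_names):
        raise ValueError("number of means does not match number of features")

    # Rebuild the row with one column per feature, keeping column order.
    unpacked = {"clusterid": row["clusterid"]}
    unpacked.update(zip(feature_names, means))
    unpacked["size"] = row["size"]
    unpacked["withinss"] = row["withinss"]
    return unpacked

# Hypothetical packed row, using the '3 5 6' example from the schema.
packed = {"clusterid": 0, "mean": "3 5 6", "size": 4, "withinss": 10}
unpacked = unpack_centroid_row(packed, ["f1", "f2", "f3"])
```

The length check guards against a feature list that does not match the centroid, which would otherwise silently drop values when zipping.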

ClusterAssignmentTable Schema

| Column | Data Type | Description |
|---|---|---|
| pointid | INTEGER | Identifier of user or item (id from InputTable). |
| centroidid | INTEGER | Identifier of centroid for pointid. |
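Because each ClusterAssignmentTable row maps one point to one centroid, counting rows per centroidid should give the number of points in each cluster. The following Python sketch shows that tally on rows already fetched into memory. The function name `cluster_sizes` and the sample rows are hypothetical, and the expectation that the counts match the size column of OutputTable is an assumption based on the column descriptions.

```python
from collections import Counter

def cluster_sizes(assignment_rows):
    """Count points per centroid.

    assignment_rows: iterable of (pointid, centroidid) tuples.
    """
    return Counter(centroidid for _pointid, centroidid in assignment_rows)

# Hypothetical assignment rows: two points in cluster 0, three in cluster 1.
rows = [(1, 0), (2, 0), (3, 1), (4, 1), (5, 1)]
sizes = cluster_sizes(rows)
```

The resulting counts can be compared against the size values reported for each clusterid as a basic consistency check.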