1.0 - 8.00 - KMeans Arguments - Teradata Vantage

Teradata® Vantage Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Date: May 2019
Content Type: Programming Reference
Language: English (United States)
Specify the name of the table of cluster centroids.
Specify the name of the table of the clusters.
[Optional] Specify whether the means for each centroid appear unpacked (that is, in separate columns) in output_table.
Default: 'false' (The function concatenates the means for the centroids and outputs the result in a single VARCHAR column.)
[Optional] Specify the initial seed means as strings of underscore-delimited DOUBLE PRECISION values. For example, this clause initializes eight clusters in eight-dimensional space:
InitialSeeds ('50_50_50_50_50_50_50_50',
The dimensionality of the means must match the dimensionality of the data (that is, each mean must have n numbers in it, where n is the number of input columns minus one).
Default behavior: The algorithm chooses the initial seed means randomly.
When you specify InitialSeeds, the function uses a deterministic algorithm and supports up to 1596 dimensions.
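The seed-string format and the dimensionality rule described above can be checked before a query is written. The following Python sketch is illustrative only: the helper names `format_seed_mean` and `check_seed_means` are hypothetical and are not part of the KMeans function, and the sample values are made up. It assumes nothing beyond the underscore-delimited format and the "input columns minus one" rule stated above.

```python
def format_seed_mean(values):
    """Join the coordinates of one mean into an underscore-delimited string."""
    return "_".join(str(v) for v in values)


def check_seed_means(seed_means, num_input_columns):
    """Verify that every seed mean has (num_input_columns - 1) numeric values.

    Returns the expected dimensionality, or raises ValueError on a mismatch.
    """
    expected = num_input_columns - 1
    for mean in seed_means:
        parts = mean.split("_")
        if len(parts) != expected:
            raise ValueError(
                f"seed {mean!r} has {len(parts)} values, expected {expected}"
            )
        for part in parts:
            float(part)  # raises ValueError if a value is not numeric
    return expected


# Two made-up seed means in eight-dimensional space (hypothetical values).
seeds = [format_seed_mean([50] * 8), format_seed_mean([150] * 8)]
dimensions = check_seed_means(seeds, num_input_columns=9)
```

With nine input columns, each seed mean must carry eight values, so both strings pass the check.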
[Optional] Specify the number of clusters to create from the data.
When you specify NumClusters, the function uses a nondeterministic algorithm and supports up to 1543 dimensions (for more information, see Nondeterministic Results).
[Optional] Specify the random seed the algorithm uses for repeatable results. The algorithm uses the seed to randomly sample the input table rows as initial clusters. The seed must be a nonnegative LONG value.
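The effect of a fixed seed on random row sampling can be sketched in Python. This is a conceptual illustration, not the function's internal sampling method: the helper `sample_initial_centroids` is hypothetical, and Python's `random.Random` stands in for whatever generator the function actually uses.

```python
import random


def sample_initial_centroids(rows, num_clusters, seed=None):
    """Pick num_clusters distinct rows as starting centroids.

    A fixed nonnegative seed makes the selection repeatable; with
    seed=None the generator is seeded from the system and results vary.
    """
    if seed is not None and seed < 0:
        raise ValueError("seed must be nonnegative")
    rng = random.Random(seed)
    return rng.sample(rows, num_clusters)


# Made-up two-dimensional input rows.
rows = [(float(i), float(i * 2)) for i in range(20)]
first = sample_initial_centroids(rows, 3, seed=42)
second = sample_initial_centroids(rows, 3, seed=42)
```

Both calls use the same seed, so `first` and `second` contain the same rows in the same order.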
[Optional] Specify the convergence threshold. When the centroids move by less than this amount, the algorithm has converged.
Default: 0.0395
[Optional] Specify the maximum number of iterations that the algorithm runs. The algorithm stops after this many iterations even if the convergence threshold has not been met.
Default: 10
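The interaction between the convergence threshold and the iteration limit can be sketched as a plain k-means loop in Python. This is a minimal sketch under stated assumptions, not the function's implementation: the name `kmeans_sketch` is hypothetical, centroid movement is assumed to be the largest Euclidean distance any centroid moves in one iteration, and the parameter defaults simply mirror the defaults listed above.

```python
import math


def kmeans_sketch(points, centroids, stop_threshold=0.0395, max_iter_num=10):
    """Return (centroids, iterations_run, converged)."""
    centroids = [tuple(c) for c in centroids]
    for iteration in range(1, max_iter_num + 1):
        # Assign each point to its nearest centroid.
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(p, centroids[i]),
            )
            groups[nearest].append(p)
        # Recompute each centroid as the mean of its group
        # (an empty group keeps its previous centroid).
        updated = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
        # Assumption: movement is the largest distance any centroid moved.
        movement = max(math.dist(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        if movement < stop_threshold:
            return centroids, iteration, True
    # Iteration limit reached before the threshold was met.
    return centroids, max_iter_num, False


# Made-up data: two well-separated groups and two starting centroids.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
result, iterations, converged = kmeans_sketch(points, [(0, 0), (10, 10)])
capped = kmeans_sketch(points, [(0, 0), (10, 10)], max_iter_num=1)
```

In this example the centroids move by 0.5 in the first iteration and by 0 in the second, so the loop converges after two iterations. With `max_iter_num=1` the loop stops after one iteration without converging.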