KMeans Syntax Elements - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
8.10
1.1
Published
October 2019
Language
English (United States)
Last Update
2019-12-31
dita:id
B700-4003
Product Category
Teradata Vantage™
OutputTable
Specify the name of the table of cluster centroids.
ClusterAssignmentTable
Specify the name of the table of cluster assignments (the cluster to which each input row belongs).
UnpackColumns
[Optional] Specify whether the means for each centroid appear unpacked (that is, in separate columns) in the OutputTable.
Default: 'false' (The function concatenates the means for the centroids and outputs the result in a single VARCHAR column.)
InitialSeeds
[Optional] Specify the initial seed means as strings of underscore-delimited DOUBLE PRECISION values. For example, this clause initializes eight clusters in eight-dimensional space:
InitialSeeds ('50_50_50_50_50_50_50_50',
'150_150_150_150_150_150_150_150',
'250_250_250_250_250_250_250_250',
'350_350_350_350_350_350_350_350',
'450_450_450_450_450_450_450_450',
'550_550_550_550_550_550_550_550',
'650_650_650_650_650_650_650_650',
'750_750_750_750_750_750_750_750')
The dimensionality of the means must match the dimensionality of the data (that is, each mean must have n numbers in it, where n is the number of input columns minus one).
Default behavior: The algorithm chooses the initial seed means randomly.
With InitialSeeds, the function uses a deterministic algorithm and supports up to 1596 dimensions.
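The seed strings are easy to get wrong by hand when the data has many dimensions. The following is a minimal Python sketch, not part of the product, that renders a list of seed means as an InitialSeeds clause and checks the dimensionality rule stated above. The helper name `build_initial_seeds` and its parameters are hypothetical.

```python
def build_initial_seeds(means, num_input_columns):
    """Render seed means as an InitialSeeds clause.

    Assumes, per the rule above, that each mean needs n values, where
    n is the number of input columns minus one.
    """
    expected = num_input_columns - 1
    quoted = []
    for mean in means:
        if len(mean) != expected:
            raise ValueError(
                f"seed {mean!r} has {len(mean)} values, expected {expected}"
            )
        # Join the values with underscores and wrap the result in single quotes.
        quoted.append("'" + "_".join(str(v) for v in mean) + "'")
    return "InitialSeeds (" + ",\n".join(quoted) + ")"


# Two seeds in eight-dimensional space (nine input columns).
clause = build_initial_seeds([[50] * 8, [150] * 8], num_input_columns=9)
print(clause)
```

A seed with the wrong number of values raises `ValueError` before the clause ever reaches the function call.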
NumClusters
[Optional] Specify the number of clusters to create from the data.
With NumClusters, the function uses a nondeterministic algorithm and supports up to 1543 dimensions. For more information, see Nondeterministic Results and UniqueID Syntax Element.
Seed
[Optional] Specify the random seed the algorithm uses for repeatable results. The algorithm uses the seed to randomly sample the input table rows as initial clusters. The seed must be a nonnegative LONG value.
For repeatable results, use both the Seed and UniqueID syntax elements. For more information, see Nondeterministic Results and UniqueID Syntax Element.
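The sketch below illustrates, in plain Python, why a fixed seed alone may not be enough for repeatable results. It is an illustration of seeded sampling in general, not a description of how the function samples internally; `sample_initial_rows` is a hypothetical helper. The same seed picks the same row positions, so the chosen rows change if the rows arrive in a different order.

```python
import random


def sample_initial_rows(rows, num_clusters, seed):
    """Pick num_clusters rows as initial clusters using a seeded generator."""
    if seed < 0:
        raise ValueError("seed must be nonnegative")
    rng = random.Random(seed)
    return rng.sample(rows, num_clusters)


# Hypothetical input: (id, x, y) rows.
rows = [(i, float(i), float(i * 2)) for i in range(100)]

first = sample_initial_rows(rows, 3, seed=42)
second = sample_initial_rows(rows, 3, seed=42)

# Same seed, same row order: identical sample.
print(first == second)

# Same seed, different row order: the sampled positions match,
# but the rows at those positions do not.
reordered = list(reversed(rows))
third = sample_initial_rows(reordered, 3, seed=42)
print(first == third)
```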
StopThreshold
[Optional] Specify the convergence threshold. When the centroids move by less than this amount between iterations, the algorithm has converged.
Default: 0.0395
MaxIterNum
[Optional] Specify the maximum number of iterations that the algorithm runs before quitting if the convergence threshold has not been met.
Default: 10
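To show how StopThreshold and MaxIterNum interact, here is a minimal Python sketch of a generic k-means loop. It is not the function's implementation. In particular, measuring movement as the largest Euclidean shift of any centroid is an assumption, and the name `kmeans` and its parameters are hypothetical. The defaults mirror the values listed above.

```python
import math


def kmeans(points, centroids, stop_threshold=0.0395, max_iter_num=10):
    """Iterate until the centroids settle or the iteration cap is reached.

    Returns (centroids, iterations_run, converged).
    """
    centroids = [tuple(c) for c in centroids]
    for iteration in range(1, max_iter_num + 1):
        # Assign each point to its nearest centroid.
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(p, centroids[i]),
            )
            groups[nearest].append(p)
        # Recompute each centroid as the mean of its group;
        # a centroid with no points stays where it is.
        updated = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
        # Assumption: movement is the largest Euclidean shift of any centroid.
        movement = max(math.dist(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        if movement < stop_threshold:
            return centroids, iteration, True
    return centroids, max_iter_num, False


points = [(0, 0), (0, 2), (10, 0), (10, 2)]
centroids, iterations, converged = kmeans(points, [(1, 1), (9, 1)])
print(centroids, iterations, converged)
```

With a cap of one iteration (`max_iter_num=1`), the same call stops before the threshold test can succeed and reports that it did not converge.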