KMeans Arguments - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
8.00
1.0
Published
May 2019
Language
English (United States)
Last Update
2019-11-22
Product Category
Teradata Vantage™
OutputTable
Specify the name of the table that stores the cluster centroids.
ClusteredOutput
Specify the name of the table that stores the clusters.
UnpackColumns
[Optional] Specify whether the means for each centroid appear unpacked (that is, in separate columns) in output_table.
Default: 'false' (The function concatenates the means for the centroids and outputs the result in a single VARCHAR column.)
InitialSeeds
[Optional] Specify the initial seed means as strings of underscore-delimited DOUBLE PRECISION values. For example, this clause initializes eight clusters in eight-dimensional space:
InitialSeeds ('50_50_50_50_50_50_50_50',
'150_150_150_150_150_150_150_150',
'250_250_250_250_250_250_250_250',
'350_350_350_350_350_350_350_350',
'450_450_450_450_450_450_450_450',
'550_550_550_550_550_550_550_550',
'650_650_650_650_650_650_650_650',
'750_750_750_750_750_750_750_750')
The dimensionality of the means must match the dimensionality of the data; that is, each mean must contain n values, where n is the number of input columns minus one.
Default behavior: The algorithm chooses the initial seed means randomly.
With InitialSeeds, the function uses a deterministic algorithm and supports up to 1596 dimensions.
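The seed format described above can be sketched in Python as a small client-side helper that builds the underscore-delimited strings and checks their dimensionality before the clause is pasted into a query. The helper names (format_seed, build_initial_seeds_clause) and the idea of validating on the client are assumptions for illustration only; they are not part of the KMeans function.

```python
MAX_DIMS_WITH_INITIAL_SEEDS = 1596  # dimension limit stated for InitialSeeds


def format_seed(mean):
    """Join one centroid's coordinates into an underscore-delimited string."""
    return "_".join(str(x) for x in mean)


def build_initial_seeds_clause(means, num_input_columns):
    """Return an InitialSeeds clause for the given centroid means.

    Hypothetical helper: each mean must have num_input_columns - 1 values,
    following the rule that n is the number of input columns minus one.
    """
    expected_dims = num_input_columns - 1
    if expected_dims > MAX_DIMS_WITH_INITIAL_SEEDS:
        raise ValueError(f"{expected_dims} dimensions exceeds the limit")
    for mean in means:
        if len(mean) != expected_dims:
            raise ValueError(
                f"seed has {len(mean)} values, expected {expected_dims}"
            )
    seeds = ",\n".join(f"'{format_seed(m)}'" for m in means)
    return f"InitialSeeds ({seeds})"


# Two of the eight-dimensional seeds from the example above,
# for an input table with nine columns.
print(build_initial_seeds_clause([[50] * 8, [150] * 8], num_input_columns=9))
```

Checking the length of every seed up front catches a dimensionality mismatch before the query is submitted.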
NumClusters
[Optional] Specify the number of clusters to create from the data.
With NumClusters, the function uses a nondeterministic algorithm and supports up to 1543 dimensions (for more information, see Nondeterministic Results).
Seed
[Optional] Specify the random seed the algorithm uses for repeatable results. The algorithm uses the seed to randomly sample the input table rows as initial clusters. The seed must be a nonnegative LONG value.
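The role of the seed can be illustrated with a minimal Python sketch in which a seeded random generator samples input rows as the initial clusters. The function name sample_initial_centroids and the use of Python's random module are assumptions for illustration; how the KMeans function actually samples rows is not specified here and may differ.

```python
import random


def sample_initial_centroids(rows, num_clusters, seed):
    """Pick num_clusters distinct rows as starting centroids, repeatably."""
    if seed < 0:
        raise ValueError("seed must be nonnegative")
    rng = random.Random(seed)  # private generator: same seed, same sample
    return rng.sample(rows, num_clusters)


rows = [(1.0, 2.0), (8.0, 9.0), (1.5, 1.8), (9.0, 8.5), (0.5, 2.2)]
first = sample_initial_centroids(rows, 2, seed=42)
second = sample_initial_centroids(rows, 2, seed=42)
assert first == second  # an identical seed reproduces the starting points
```

Using a dedicated generator object rather than the module-level functions keeps the sample independent of any other random calls in the same program.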
StopThreshold
[Optional] Specify the convergence threshold. When the centroids move by less than this amount, the algorithm has converged.
Default: 0.0395
MaxIterNum
[Optional] Specify the maximum number of iterations that the algorithm runs if it has not yet met the convergence threshold.
Default: 10
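How StopThreshold and MaxIterNum interact can be sketched with a small, generic k-means loop in Python. This is an illustration under stated assumptions, not the function's implementation: it assumes Euclidean distance and treats the largest movement of any centroid in one iteration as the quantity compared with the threshold. The function name kmeans and its return shape are hypothetical.

```python
import math


def kmeans(points, centroids, stop_threshold=0.0395, max_iter_num=10):
    """Run Lloyd-style iterations; return (centroids, iterations, converged)."""
    centroids = [tuple(c) for c in centroids]
    for iteration in range(1, max_iter_num + 1):
        # Assignment step: group each point with its nearest centroid.
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        # Update step: move each centroid to the mean of its group
        # (an empty group leaves its centroid where it was).
        updated = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
        # Convergence test (assumed): largest centroid movement < threshold.
        shift = max(math.dist(old, new) for old, new in zip(centroids, updated))
        centroids = updated
        if shift < stop_threshold:
            return centroids, iteration, True
    # The iteration limit was reached before the threshold was met.
    return centroids, max_iter_num, False


points = [(1.0, 1.0), (1.0, 2.0), (9.0, 9.0), (9.0, 10.0)]
seeds = [(0.0, 0.0), (10.0, 10.0)]
print(kmeans(points, seeds))                  # stops once centroids settle
print(kmeans(points, seeds, max_iter_num=1))  # stops at the iteration limit
```

In this sketch a lower threshold or a higher iteration limit trades run time for centroids that have moved less by the time the loop ends.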