KMeans Arguments - Aster Analytics

Teradata Aster® Analytics Foundation User Guide Update 2

Product
Aster Analytics
Release Number
7.00.02
Published
September 2017
Language
English (United States)
Last Update
2018-04-17
dita:id
B700-1022
InputTable
Specifies the name of the table that contains the features by which to cluster the data.
OutputTable
Specifies the name of the table in which to output the centroids of the clusters.
ClusteredOutput
[Optional] Specifies the name of the table in which to store the clustered output. If you omit this argument, the function does not generate a table of clustered output.
UnpackColumns
[Optional] Specifies whether the means for each centroid appear unpacked (that is, in separate columns) in the OutputTable table. Default: 'false' (the function concatenates the means for each centroid and outputs the result in a single VARCHAR column).
InitialSeeds
[Optional] Specifies the initial seed means as strings of underscore-delimited DOUBLE PRECISION values. For example, this clause initializes eight clusters in eight-dimensional space:
InitialSeeds('50_50_50_50_50_50_50_50',
'150_150_150_150_150_150_150_150',
'250_250_250_250_250_250_250_250',
'350_350_350_350_350_350_350_350',
'450_450_450_450_450_450_450_450',
'550_550_550_550_550_550_550_550',
'650_650_650_650_650_650_650_650',
'750_750_750_750_750_750_750_750')

The dimensionality of the means must match the dimensionality of the data; that is, each mean must contain n values, where n is the number of input columns minus one.

Default behavior: The algorithm chooses the initial seed means randomly.

With InitialSeeds, the function uses a deterministic algorithm and supports up to 1596 dimensions.
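The seed-string format described above can be generated and checked programmatically before the values are pasted into the function call. The following is a minimal Python sketch, not part of Aster Analytics; the helper names format_seed, build_initial_seeds, and parse_seed are hypothetical, and the check only enforces the rule that every mean has the same number of values as the data has dimensions.

```python
def format_seed(mean):
    """Join one centroid's coordinates with underscores, e.g. [50, 50] -> '50_50'."""
    return "_".join(str(value) for value in mean)


def build_initial_seeds(means, n_dims):
    """Return one underscore-delimited string per mean.

    n_dims is the dimensionality of the data (number of input columns minus one).
    Raises ValueError if any mean has a different number of values.
    """
    for index, mean in enumerate(means):
        if len(mean) != n_dims:
            raise ValueError(
                f"mean {index} has {len(mean)} values, expected {n_dims}"
            )
    return [format_seed(mean) for mean in means]


def parse_seed(seed):
    """Inverse of format_seed: '50_50' -> [50.0, 50.0]."""
    return [float(value) for value in seed.split("_")]


# Eight clusters in eight-dimensional space, matching the example above.
means = [[center] * 8 for center in range(50, 800, 100)]
seeds = build_initial_seeds(means, n_dims=8)

# Quote each string the way it would appear inside the argument clause.
quoted = ",\n".join(f"'{seed}'" for seed in seeds)
```

Here quoted holds the eight quoted strings, one per line, ready to place inside the parentheses of the argument clause.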
NumClusters
[Optional] Specifies the number of clusters to generate from the data.
With NumClusters, the function uses a nondeterministic algorithm and supports up to 1543 dimensions.
Seed
[Optional] Specifies the random seed that the function uses to sample input table rows as the initial cluster centroids. The seed must be an INTEGER greater than or equal to 0.
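The effect of a fixed seed on random initialization can be illustrated outside the database. The sketch below is a Python illustration only: sample_initial_centroids is a hypothetical helper, and Python's random module stands in for whatever sampler the function uses internally, so the rows it picks will not match the function's own choices. It shows only the general idea that the same seed yields the same initial centroids on every run.

```python
import random


def sample_initial_centroids(rows, num_clusters, seed):
    """Pick num_clusters distinct rows as initial centroids, reproducibly.

    rows is a list of feature vectors; seed must be an integer >= 0.
    """
    if seed < 0:
        raise ValueError("seed must be greater than or equal to 0")
    if num_clusters > len(rows):
        raise ValueError("cannot sample more clusters than there are rows")
    rng = random.Random(seed)  # private generator, leaves global state alone
    return rng.sample(rows, num_clusters)


# Hypothetical two-dimensional input rows.
rows = [(1.0, 2.0), (1.5, 1.8), (5.0, 8.0), (8.0, 8.0), (1.0, 0.6), (9.0, 11.0)]

first_run = sample_initial_centroids(rows, num_clusters=3, seed=42)
second_run = sample_initial_centroids(rows, num_clusters=3, seed=42)
```

Because both calls use the same seed, first_run and second_run contain the same rows in the same order.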
CentroidsTable
[Optional] Specifies the name of the table that contains the initial seed means for the clusters. The schema of the centroids table depends on the value of the UnpackColumns argument.
With CentroidsTable, the function uses a deterministic algorithm and supports up to 1596 dimensions.
Threshold
[Optional] Specifies the convergence threshold. When the centroids move by less than this amount, the algorithm has converged. Default: 0.0395.
MaxIterNum
[Optional] Specifies the maximum number of iterations that the algorithm runs before quitting if the convergence threshold has not been met. Default: 10.
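How Threshold and MaxIterNum interact can be sketched with a small, self-contained k-means loop in Python. This is an illustration under stated assumptions, not the function's implementation: the name kmeans_sketch is hypothetical, centroid movement is measured here as the largest Euclidean distance any centroid moves in one iteration (the text above does not say how movement is measured), and an empty cluster simply keeps its previous centroid.

```python
import math


def kmeans_sketch(points, centroids, threshold=0.0395, max_iter_num=10):
    """Run k-means until centroids move less than threshold or iterations run out.

    Returns (centroids, iterations_run, converged).
    """
    centroids = [list(c) for c in centroids]
    for iteration in range(1, max_iter_num + 1):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(point, centroids[i]),
            )
            clusters[nearest].append(point)

        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                new_centroids.append(
                    [sum(column) / len(members) for column in zip(*members)]
                )
            else:
                new_centroids.append(old)  # assumption: keep empty clusters in place

        # Convergence test (assumed metric): largest single-centroid movement.
        shift = max(math.dist(o, n) for o, n in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift < threshold:
            return centroids, iteration, True

    # Threshold never met: stop after max_iter_num iterations.
    return centroids, max_iter_num, False


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
initial = [(0, 0), (10, 10)]

converged_run = kmeans_sketch(points, initial)
capped_run = kmeans_sketch(points, initial, max_iter_num=1)
```

With the defaults, this toy input converges on the second iteration, when the centroids stop moving. With max_iter_num=1 the loop stops after one iteration and reports that the threshold was not met.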