- InputTable
- Specifies the name of the table that contains the features by which to cluster the data.
- OutputTable
- Specifies the name of the table in which to output the centroids of the clusters.
- ClusteredOutput
- [Optional] Specifies the name of the table in which to store the clustered output. If you omit this argument, the function does not generate a table of clustered output.
- UnpackColumns
- [Optional] Specifies whether the means for each centroid appear unpacked (that is, in separate columns) in the table named by OutputTable. Default: 'false' (the function concatenates the means for each centroid and outputs the result in a single VARCHAR column).
- InitialSeeds
- [Optional] Specifies the initial seed means as strings of underscore-delimited DOUBLE PRECISION values. For example, this clause initializes eight clusters in eight-dimensional space:
InitialSeeds('50_50_50_50_50_50_50_50', '150_150_150_150_150_150_150_150', '250_250_250_250_250_250_250_250', '350_350_350_350_350_350_350_350', '450_450_450_450_450_450_450_450', '550_550_550_550_550_550_550_550', '650_650_650_650_650_650_650_650', '750_750_750_750_750_750_750_750')
The dimensionality of the means must match the dimensionality of the data (that is, each mean must have n numbers in it, where n is the number of input columns minus one).
Default behavior: The algorithm chooses the initial seed means randomly.
With InitialSeeds, the function uses a deterministic algorithm and supports up to 1596 dimensions.
- NumClusters
- [Optional] Specifies the number of clusters to generate from the data. With NumClusters, the function uses a nondeterministic algorithm and supports up to 1543 dimensions.
- Seed
- [Optional] Specifies the random seed that the function uses when it samples rows of the input table as the initial clusters. The seed is an INTEGER greater than or equal to 0.
- CentroidsTable
- [Optional] Specifies the name of the table that contains the initial seed means for the clusters. The schema of the centroids table depends on the value of the UnpackColumns argument. With CentroidsTable, the function uses a deterministic algorithm and supports up to 1596 dimensions.
- Threshold
- [Optional] Specifies the convergence threshold. When the centroids move by less than this amount, the algorithm has converged. Default: 0.0395.
- MaxIterNum
- [Optional] Specifies the maximum number of iterations that the algorithm runs before quitting if the convergence threshold has not been met. Default: 10.
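The InitialSeeds format and the interaction between Threshold and MaxIterNum can be illustrated with a minimal Python sketch. This is not the function's implementation. The helper names (`parse_initial_seeds`, `has_converged`, `run_until_converged`, `update_step`) are hypothetical, and the sketch assumes that centroid movement is measured as Euclidean distance and that every centroid must move by less than the threshold, neither of which is stated above.

```python
import math
from typing import Callable, List, Sequence, Tuple

Centroids = List[List[float]]


def parse_initial_seeds(seeds: Sequence[str], num_input_columns: int) -> Centroids:
    """Parse underscore-delimited seed strings and check their dimensionality.

    Each mean must have n numbers, where n is the number of input
    columns minus one.
    """
    expected_dims = num_input_columns - 1
    means = [[float(part) for part in seed.split("_")] for seed in seeds]
    for seed, mean in zip(seeds, means):
        if len(mean) != expected_dims:
            raise ValueError(
                f"seed {seed!r} has {len(mean)} values, expected {expected_dims}"
            )
    return means


def has_converged(old: Centroids, new: Centroids, threshold: float = 0.0395) -> bool:
    """Return True when every centroid moved by less than the threshold.

    Assumption: movement is the Euclidean distance between the old and
    new position of each centroid.
    """
    return all(math.dist(o, n) < threshold for o, n in zip(old, new))


def run_until_converged(
    centroids: Centroids,
    update_step: Callable[[Centroids], Centroids],
    threshold: float = 0.0395,
    max_iter_num: int = 10,
) -> Tuple[Centroids, int, bool]:
    """Apply update_step until convergence or until max_iter_num iterations.

    update_step is a hypothetical stand-in for one assignment-and-recompute
    pass of k-means. Returns (centroids, iterations_run, converged).
    """
    for iteration in range(1, max_iter_num + 1):
        new_centroids = update_step(centroids)
        if has_converged(centroids, new_centroids, threshold):
            return new_centroids, iteration, True
        centroids = new_centroids
    return centroids, max_iter_num, False


if __name__ == "__main__":
    # Two seeds in two-dimensional space: the input table would have 3 columns.
    seeds = parse_initial_seeds(["50_50", "150_150"], num_input_columns=3)
    print(seeds)
```

With the defaults shown, the loop stops either when the centroids settle or after 10 iterations, whichever comes first; the third element of the returned tuple tells the caller which of the two happened.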