- OutputTable
- Specify the name of the table in which to output the centroids of the clusters.
- NumClusters
- [Required if you omit InitialSeedTable, disallowed otherwise.] Specify the number of clusters. If you specify a single value, the function trains a single model with the specified number of clusters. If you specify multiple values, the function trains a model for each value.
- ModelIDColumn
- [Optional] Specify the name of the InitialSeedTable column that contains seed values for multiple models.
- TargetColumns
- Specify the input table columns to use for clustering.
- StopThreshold
- [Optional] Specify the convergence threshold. When the centroids move by less than the threshold, the algorithm has converged. The threshold must be a nonnegative DOUBLE value.
- MaxIterNum
- [Optional] Specify the maximum number of iterations that the algorithm runs before quitting if the convergence threshold is not met. The value must be a positive INTEGER.
- NumericDistanceMethod
- [Optional] Specify the distance metric for numeric dimensions.
- CategoricalDistanceMethod
- [Optional] Specify the distance metric for categorical dimensions:
  - overlap (Default): Distance is 0 if two points are in the same category, 1 otherwise.
  - hamming: Used for categories that are strings of equal length. Percentage of different characters.
- CategoryWeights
- [Optional] Specify the weight of each category in the KModes distance. Each weight must be a DOUBLE value.
- NumericAsCategorical
- [Optional] Specify the input table columns that contain numeric variables to interpret as categorical variables. These columns must have numeric SQL data types.
- Seed
- [Optional] Specify the random seed the algorithm uses for repeatable results. The seed must be a LONG value.
- SeedColumn
- [Optional] Specify the names of the InputTable columns by which to partition the input. Function calls that use the same input data, Seed, and SeedColumn produce the same result. If you specify SeedColumn, you must also specify Seed.
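
The two CategoricalDistanceMethod options and the StopThreshold / MaxIterNum stopping rule can be illustrated with a short Python sketch. This is not the function's implementation. The helper names (`overlap_distance`, `hamming_distance`, `categorical_distance`, `should_stop`) are hypothetical. The sketch also assumes that the hamming "percentage" is expressed as a fraction between 0 and 1, and that CategoryWeights are applied as one weight per categorical column in a weighted sum.

```python
from typing import Optional, Sequence


def overlap_distance(a: str, b: str) -> float:
    """Return 0 if both values are the same category, 1 otherwise."""
    return 0.0 if a == b else 1.0


def hamming_distance(a: str, b: str) -> float:
    """Return the fraction of character positions that differ.

    Assumption: the documented "percentage" is treated as a fraction in [0, 1].
    """
    if len(a) != len(b):
        raise ValueError("hamming distance requires strings of equal length")
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


def categorical_distance(
    p: Sequence[str],
    q: Sequence[str],
    method: str = "overlap",
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted sum of per-column categorical distances.

    Assumption: one weight per categorical column, defaulting to 1.0.
    """
    if method == "overlap":
        dist = overlap_distance
    elif method == "hamming":
        dist = hamming_distance
    else:
        raise ValueError(f"unknown method: {method}")
    if weights is None:
        weights = [1.0] * len(p)
    return sum(w * dist(a, b) for w, a, b in zip(weights, p, q))


def should_stop(
    centroid_shift: float,
    stop_threshold: float,
    iteration: int,
    max_iter_num: int,
) -> bool:
    """Stop when the centroids moved less than the threshold or the cap is hit."""
    return centroid_shift < stop_threshold or iteration >= max_iter_num
```

For example, `categorical_distance(["red", "small"], ["blue", "large"], weights=[2.0, 0.5])` adds a weighted mismatch of 2.0 for the first column and 0.5 for the second under these assumptions.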