- InputTable
- Specifies the name of the table that contains the input data to be clustered.
- OutputTable
- Specifies the name of the output table to which the function outputs cluster information. The table must not already exist.
- MaxClusterNum
- [Required if ClusterNum is omitted, otherwise not allowed.] Specifies the maximum number of clusters in a Dirichlet Process model and causes the function to use the DP-GMM algorithm. This value must have the data type INTEGER. Default: 20.
- ClusterNum
- [Required if MaxClusterNum is omitted, otherwise not allowed.] Specifies the number of clusters in a model and causes the function to use the basic GMM algorithm. This value must have the data type INTEGER and be greater than 0. Default: 10.
- CovarianceType
- [Optional] Specifies the covariance matrix type, thereby determining how many parameters the function estimates for each cluster:
-
'diagonal' (Default):
Each covariance matrix has zeros off the diagonal. The function estimates D parameters for each cluster, where D is the number of dimensions of the input data.
-
'spherical':
Each covariance matrix is of the form σI. The function estimates one parameter for each cluster.
-
'tied':
Each cluster has the same covariance matrix. The function estimates (1/2)D(D+1) parameters in total.
-
'full':
Each cluster has an arbitrary covariance matrix. The function estimates (1/2)D(D+1) parameters for each cluster.
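The parameter counts for the four covariance types can be compared with a small sketch. The helper name `covariance_param_count` and its arguments are hypothetical, not part of the function's interface. It assumes the standard count for a symmetric D x D matrix: D entries on the diagonal plus D(D-1)/2 above it, which gives D(D+1)/2 free entries.

```python
def covariance_param_count(covariance_type: str, d: int, k: int) -> int:
    """Count free covariance parameters for k clusters in d dimensions.

    Hypothetical helper for illustration only; it is not part of the
    documented function.
    """
    # A symmetric d x d matrix has d diagonal entries plus d(d-1)/2
    # entries above the diagonal.
    symmetric = d * (d + 1) // 2

    if covariance_type == "tied":
        # One matrix shared by all clusters, so k does not matter.
        return symmetric

    per_cluster = {
        "spherical": 1,    # a single variance per cluster
        "diagonal": d,     # one variance per dimension
        "full": symmetric, # an arbitrary symmetric matrix
    }
    if covariance_type not in per_cluster:
        raise ValueError(f"unknown covariance type: {covariance_type!r}")
    return k * per_cluster[covariance_type]


if __name__ == "__main__":
    for name in ("spherical", "diagonal", "tied", "full"):
        print(name, covariance_param_count(name, d=3, k=4))
```

The less restrictive types fit more flexible cluster shapes at the cost of more parameters to estimate.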
- Tolerance
- [Optional] Specifies the convergence threshold: the function terminates when the change in log-likelihood between iterations falls below this value. This value must have the data type DOUBLE PRECISION and be greater than 0. Default: 0.001.
- MaxIterNum
- [Optional] Specifies the maximum number of iterations for which the function runs. This value must have the data type INTEGER and be greater than 0. Default: 10.
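How Tolerance and MaxIterNum might interact as stopping conditions can be sketched as follows. This is a minimal illustration, not the function's actual implementation: `run_until_converged` and the `step` callback are hypothetical names, and the strict `<` comparison against the tolerance is an assumption.

```python
from typing import Callable, Optional, Tuple


def run_until_converged(
    step: Callable[[], float],
    tolerance: float = 0.001,
    max_iter_num: int = 10,
) -> Tuple[int, Optional[float], bool]:
    """Call step() until the log-likelihood stabilizes or iterations run out.

    step() is assumed to run one iteration and return the new
    log-likelihood. Returns (iterations run, last log-likelihood, converged).
    """
    previous = None
    for iteration in range(1, max_iter_num + 1):
        current = step()
        # Stop as soon as the change between iterations is below tolerance.
        if previous is not None and abs(current - previous) < tolerance:
            return iteration, current, True
        previous = current
    # The iteration budget was exhausted without meeting the tolerance.
    return max_iter_num, previous, False


if __name__ == "__main__":
    # Made-up log-likelihood values, used only to exercise the loop.
    values = iter([-100.0, -50.0, -49.9995, -49.0])
    print(run_until_converged(lambda: next(values)))
```

With the defaults listed above, a run stops after at most 10 iterations even if the log-likelihood is still changing by more than 0.001.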
- ConcentrationParam
-
[Optional] Specify this argument only if you specify MaxClusterNum. Specifies the concentration parameter, α, which determines the number of clusters that the DP-GMM algorithm generates. This value must have the data type DOUBLE PRECISION and be greater than 0.
The expected number of clusters is α log N, where N is the number of points in the data set; therefore, a larger α value tends to cause the algorithm to find more clusters.
Default: 0.001.
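The α log N relationship can be evaluated directly to get a feel for how the concentration parameter scales with the data set size. The helper `expected_cluster_count` is hypothetical, and the use of the natural logarithm is an assumption, since the text does not state the base.

```python
import math


def expected_cluster_count(alpha: float, n_points: int) -> float:
    """Return alpha * ln(N), the expected number of clusters.

    Hypothetical helper for illustration; the natural logarithm is assumed.
    """
    if alpha <= 0:
        raise ValueError("alpha must be greater than 0")
    if n_points < 1:
        raise ValueError("n_points must be at least 1")
    return alpha * math.log(n_points)


if __name__ == "__main__":
    # Doubling alpha doubles the expected number of clusters for the same N.
    for alpha in (0.001, 1.0, 2.0):
        print(alpha, expected_cluster_count(alpha, 1000))
```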
- PackOutput
- [Optional] Specifies whether the function packs the output. Default: 'false'.
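The constraints among these arguments can be summarized in a short validation sketch. The function `choose_algorithm` and its Python parameter names are hypothetical; they mirror the rules stated above (exactly one of ClusterNum and MaxClusterNum, ConcentrationParam only with MaxClusterNum) rather than the function's real argument checking.

```python
from typing import Optional

COVARIANCE_TYPES = {"spherical", "diagonal", "tied", "full"}


def choose_algorithm(
    cluster_num: Optional[int] = None,
    max_cluster_num: Optional[int] = None,
    concentration_param: Optional[float] = None,
    covariance_type: str = "diagonal",
) -> str:
    """Check the documented argument rules and return the implied algorithm."""
    # ClusterNum and MaxClusterNum are mutually exclusive, and one is required.
    if (cluster_num is None) == (max_cluster_num is None):
        raise ValueError("specify exactly one of ClusterNum and MaxClusterNum")
    if cluster_num is not None and cluster_num <= 0:
        raise ValueError("ClusterNum must be greater than 0")
    # ConcentrationParam applies only to the DP-GMM algorithm.
    if concentration_param is not None:
        if max_cluster_num is None:
            raise ValueError("ConcentrationParam requires MaxClusterNum")
        if concentration_param <= 0:
            raise ValueError("ConcentrationParam must be greater than 0")
    if covariance_type not in COVARIANCE_TYPES:
        raise ValueError(f"unknown CovarianceType: {covariance_type!r}")
    return "DP-GMM" if max_cluster_num is not None else "GMM"


if __name__ == "__main__":
    print(choose_algorithm(cluster_num=3))
    print(choose_algorithm(max_cluster_num=20, concentration_param=0.001))
```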