7.00.02 - GMMFit Arguments - Aster Analytics

Teradata Aster® Analytics Foundation User Guide Update 2

Product: Aster Analytics
Release Number: 7.00.02
Release Date: September 2017
Content Type: Programming Reference, User Guide
Publication ID: B700-1022-700K
Language: English (United States)
InputTable
Specifies the name of the table that contains the input data to be clustered.
OutputTable
Specifies the name of the table to which the function writes cluster information. The table must not already exist.
MaxClusterNum
[Required if ClusterNum is omitted, otherwise not allowed.] Specifies the maximum number of clusters in a Dirichlet Process model and causes the function to use the DP-GMM algorithm. This value must have the data type INTEGER. Default: 20.
ClusterNum
[Required if MaxClusterNum is omitted, otherwise not allowed.] Specifies the number of clusters in a model and causes the function to use the basic GMM algorithm. This value must have the data type INTEGER and be greater than 0. Default: 10.
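The either/or rule for ClusterNum and MaxClusterNum can be checked before a call is submitted. The following is a minimal sketch that reads the bracketed rules above literally (exactly one of the two arguments must be given). The helper name `choose_algorithm` and its return labels are hypothetical and are not part of GMMFit.

```python
def choose_algorithm(cluster_num=None, max_cluster_num=None):
    """Return the algorithm selected by the argument combination.

    Hypothetical client-side check; GMMFit performs its own validation.
    """
    # Exactly one of the two arguments may be specified.
    if (cluster_num is None) == (max_cluster_num is None):
        raise ValueError("Specify exactly one of ClusterNum or MaxClusterNum.")

    if cluster_num is not None:
        # ClusterNum: INTEGER greater than 0, selects the basic GMM algorithm.
        if not isinstance(cluster_num, int) or cluster_num <= 0:
            raise ValueError("ClusterNum must be an INTEGER greater than 0.")
        return "GMM"

    # MaxClusterNum: INTEGER, selects the DP-GMM algorithm.
    if not isinstance(max_cluster_num, int):
        raise ValueError("MaxClusterNum must be an INTEGER.")
    return "DP-GMM"
```

For example, `choose_algorithm(cluster_num=3)` selects the basic GMM algorithm, while passing both arguments raises an error.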
CovarianceType
[Optional] Specifies the covariance matrix type, thereby determining how many parameters the function estimates for each cluster:
  • 'diagonal' (Default):

    Each covariance matrix has zeros off the diagonal. The function estimates D parameters for each cluster, where D is the number of dimensions of the input data (each covariance matrix is D×D).

  • 'spherical':

    Each covariance matrix is of the form σI, where I is the identity matrix. The function estimates one parameter for each cluster.

  • 'tied':

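    Each cluster has the same covariance matrix. The function estimates (1/2)D(D+1) parameters in total, because a symmetric D×D matrix has D diagonal entries and (1/2)D(D-1) distinct off-diagonal entries.

  • 'full':

    Each cluster has an arbitrary covariance matrix. The function estimates (1/2)D(D+1) parameters for each cluster.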

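To compare the cost of the four covariance types, the parameter counts can be tabulated for a given dimension D and cluster count K. This is a minimal sketch, not the function's internal accounting. It assumes that a symmetric D×D covariance matrix contributes D(D+1)/2 free entries (D on the diagonal plus D(D-1)/2 off the diagonal), and the name `covariance_parameter_count` is hypothetical.

```python
def covariance_parameter_count(covariance_type, num_dims, num_clusters):
    """Total number of covariance parameters estimated across all clusters."""
    d, k = num_dims, num_clusters
    # Free entries of one symmetric d x d matrix (assumption stated above).
    symmetric_entries = d * (d + 1) // 2
    counts = {
        "diagonal": k * d,              # d variances per cluster
        "spherical": k,                 # one variance per cluster
        "tied": symmetric_entries,      # one matrix shared by all clusters
        "full": k * symmetric_entries,  # one matrix per cluster
    }
    if covariance_type not in counts:
        raise ValueError(f"Unknown CovarianceType: {covariance_type!r}")
    return counts[covariance_type]


if __name__ == "__main__":
    for name in ("diagonal", "spherical", "tied", "full"):
        print(name, covariance_parameter_count(name, num_dims=4, num_clusters=3))
```

With D = 4 and K = 3, the counts are 12 ('diagonal'), 3 ('spherical'), 10 ('tied'), and 30 ('full').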
Tolerance
[Optional] Specifies the convergence threshold: the function terminates when the change in log-likelihood between iterations falls below this value. This value must have the data type DOUBLE PRECISION and be greater than 0. Default: 0.001.
MaxIterNum
[Optional] Specifies the maximum number of iterations for which the function runs. This value must have the data type INTEGER and be greater than 0. Default: 10.
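Tolerance and MaxIterNum together decide when fitting stops. The sketch below illustrates one plausible reading: stop as soon as the absolute change in log-likelihood drops below Tolerance, or after MaxIterNum iterations, whichever comes first. The exact comparison that GMMFit uses is not stated here, so the strict `<` and the absolute value are assumptions, and `iterations_until_stop` is a hypothetical helper that takes a precomputed list of log-likelihoods instead of running a real fit.

```python
def iterations_until_stop(log_likelihoods, tolerance=0.001, max_iter_num=10):
    """Return how many iterations run before the stopping rule fires.

    log_likelihoods: log-likelihood after each iteration (illustrative input).
    """
    previous = None
    iterations = 0
    for current in log_likelihoods:
        # Stop once the iteration budget is used up.
        if iterations >= max_iter_num:
            break
        iterations += 1
        # Stop once the improvement is smaller than the tolerance.
        if previous is not None and abs(current - previous) < tolerance:
            break
        previous = current
    return iterations
```

For the sequence `[-100.0, -50.0, -49.0, -48.9995, -48.9994]` with the default Tolerance, the loop stops at the fourth iteration; with `max_iter_num=2` it stops after two iterations regardless of convergence.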
ConcentrationParam

[Optional] Allowed only if you specify MaxClusterNum. Specifies the concentration parameter, α, which influences the number of clusters that the DP-GMM algorithm generates. This value must have the data type DOUBLE PRECISION and be greater than 0.

The expected number of clusters is α log N, where N is the number of points in the data set; therefore, a larger α value tends to cause the algorithm to find more clusters.

Default: 0.001.
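The α log N relationship can be evaluated directly to see how α scales the expected cluster count. This is a minimal sketch that assumes the natural logarithm (the base is not stated above); the function name `expected_cluster_count` is hypothetical.

```python
import math


def expected_cluster_count(alpha, num_points):
    """Evaluate alpha * log(N), with log taken as the natural logarithm."""
    if alpha <= 0:
        raise ValueError("ConcentrationParam must be greater than 0.")
    if num_points < 1:
        raise ValueError("num_points must be at least 1.")
    return alpha * math.log(num_points)


if __name__ == "__main__":
    # Doubling alpha doubles the expected count for a fixed data set size.
    for alpha in (0.5, 1.0, 2.0):
        print(alpha, expected_cluster_count(alpha, num_points=1000))
```

Because the expression is linear in α and grows only logarithmically in N, changing α has a much stronger effect on the expected count than adding data points.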

PackOutput
[Optional] Specifies whether the function packs the output. Default: 'false'.