GMMFit Arguments - Aster Analytics

Teradata Aster® Analytics Foundation User Guide Update 2

Product

Aster Analytics

Release Number

7.00.02

Published

September 2017

Language

English (United States)

Last Update

2018-04-17

dita:id

B700-1022

Product Category

Software

InputTable

Specifies the name of the table that contains the input data to be clustered.

OutputTable

Specifies the name of the output table to which the function outputs cluster information. The table must not already exist.

MaxClusterNum

[Required if ClusterNum is omitted, otherwise not allowed.] Specifies the maximum number of clusters in a Dirichlet Process model and causes the function to use the DP-GMM algorithm. This value must have the data type INTEGER. Default: 20.

ClusterNum

[Required if MaxClusterNum is omitted, otherwise not allowed.] Specifies the number of clusters in a model and causes the function to use the basic GMM algorithm. This value must have the data type INTEGER and be greater than 0. Default: 10.
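
The rule that exactly one of ClusterNum and MaxClusterNum is specified can be pictured as a small validation step. The Python sketch below is illustrative only: choose_algorithm is a hypothetical helper, not part of Aster Analytics, and it encodes nothing beyond the constraints stated above.

```python
def choose_algorithm(cluster_num=None, max_cluster_num=None):
    """Return the algorithm implied by the arguments (hypothetical helper)."""
    # Exactly one of the two arguments may be supplied.
    if (cluster_num is None) == (max_cluster_num is None):
        raise ValueError("Specify exactly one of ClusterNum or MaxClusterNum")
    if cluster_num is not None:
        # ClusterNum must be an INTEGER greater than 0.
        if not isinstance(cluster_num, int) or cluster_num <= 0:
            raise ValueError("ClusterNum must be an INTEGER greater than 0")
        return "GMM"
    # MaxClusterNum must be an INTEGER.
    if not isinstance(max_cluster_num, int):
        raise ValueError("MaxClusterNum must be an INTEGER")
    return "DP-GMM"


print(choose_algorithm(cluster_num=3))
print(choose_algorithm(max_cluster_num=20))
```

Supplying both arguments, or neither, raises ValueError in this sketch.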

CovarianceType

[Optional] Specifies the covariance matrix type, thereby determining how many parameters the function estimates for each cluster:

'diagonal' (Default):
Each covariance matrix has zeros off the diagonal. The function estimates D parameters for each cluster, where D is the number of dimensions of the input data.
'spherical':
Each covariance matrix is of the form σI, where σ is a scalar and I is the identity matrix. The function estimates one parameter for each cluster.
'tied':
Each cluster has the same covariance matrix. The function estimates (1/2)D(D+1) parameters in total, because a symmetric D×D matrix has D(D+1)/2 distinct entries.
'full':
Each cluster has an arbitrary covariance matrix. The function estimates (1/2)D(D+1) parameters for each cluster.
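
To compare how the covariance types scale, the following Python sketch counts covariance parameters for a model with k clusters in d dimensions. It is an illustration, not Aster Analytics code: covariance_param_count is a hypothetical name, and the count assumes that a symmetric d×d covariance matrix has d(d+1)/2 free entries.

```python
def covariance_param_count(covariance_type, d, k):
    """Total covariance parameters for k clusters in d dimensions."""
    symmetric = d * (d + 1) // 2  # free entries of a symmetric d x d matrix
    counts = {
        "spherical": k,         # one scalar per cluster
        "diagonal": k * d,      # d diagonal entries per cluster
        "tied": symmetric,      # one matrix shared by all clusters
        "full": k * symmetric,  # one matrix per cluster
    }
    if covariance_type not in counts:
        raise ValueError("Unknown covariance type: %r" % covariance_type)
    return counts[covariance_type]


for cov in ("spherical", "diagonal", "tied", "full"):
    print(cov, covariance_param_count(cov, d=4, k=3))
```

With d=4 and k=3 the counts are 3, 12, 10, and 30, which shows why 'full' is the most expensive choice as the number of dimensions grows.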

Tolerance

[Optional] Specifies the convergence threshold: the function terminates when the change in log-likelihood between iterations falls below this value. This value must have the data type DOUBLE PRECISION and be greater than 0. Default: 0.001.

MaxIterNum

[Optional] Specifies the maximum number of iterations for which the function runs. This value must have the data type INTEGER and be greater than 0. Default: 10.
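
Tolerance and MaxIterNum together act as a stopping rule. The Python sketch below shows one common way such a rule is written; it is an assumption about the general pattern, not the function's actual implementation, and iterations_run is a hypothetical name. The exact comparison the function uses (for example, strict or non-strict inequality) is not stated here.

```python
def iterations_run(log_likelihoods, tolerance=0.001, max_iter_num=10):
    """Return how many iterations run before the sketch stops.

    log_likelihoods holds the log-likelihood after iteration 1, 2, ...
    """
    previous = None
    for iteration, current in enumerate(log_likelihoods, start=1):
        # Stop when the change in log-likelihood is below the tolerance.
        if previous is not None and abs(current - previous) < tolerance:
            return iteration
        # Stop when the iteration cap is reached.
        if iteration >= max_iter_num:
            return iteration
        previous = current
    return len(log_likelihoods)


trace = [-100.0, -50.0, -49.9995, -49.9994]
print(iterations_run(trace))                   # stops on the tolerance test
print(iterations_run(trace, max_iter_num=2))   # stops on the iteration cap
```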

ConcentrationParam

[Optional] Allowed only if you specify MaxClusterNum. Specifies the concentration parameter, α, which influences the number of clusters that the DP-GMM algorithm generates. This value must have the data type DOUBLE PRECISION and be greater than 0.

The expected number of clusters is approximately α log N, where N is the number of points in the data set; therefore, a larger α value tends to cause the algorithm to find more clusters.

Default: 0.001.
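
To get a feel for how α and N interact, the α log N estimate can be evaluated directly. The Python sketch below is illustrative only: expected_cluster_count is a hypothetical helper, and it assumes the logarithm is the natural logarithm, which the text above does not state.

```python
import math


def expected_cluster_count(alpha, n):
    """Evaluate alpha * log(n), assuming a natural logarithm."""
    if alpha <= 0:
        raise ValueError("alpha must be greater than 0")
    if n < 1:
        raise ValueError("n must be at least 1")
    return alpha * math.log(n)


# Doubling alpha doubles the estimate for a fixed data set size.
for alpha in (0.5, 1.0, 2.0):
    print(alpha, expected_cluster_count(alpha, 100000))
```

Because the estimate grows only logarithmically in N, changing α has a much larger effect on it than adding more rows.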

PackOutput

[Optional] Specifies whether the function packs the output. Default: 'false'.