Maximum Likelihood Factors - Teradata Warehouse Miner

Teradata Warehouse Miner User Guide - Volume 3 Analytic Functions

Product
Release Number
5.4.5
Published
February 2018
Language
English (United States)
Last Update
2018-05-04
dita:id
B035-2302
Product Category
Software

As mentioned earlier, the common factor model attempts to find both common and unique factors that explain the covariances or correlations among a set of variables. That is, it seeks a factor pattern matrix C and a uniqueness matrix R such that a covariance or correlation matrix S can be modeled as S = CC^T + R, where C^T denotes the transpose of C. This technique applies the principle of maximum likelihood, under the assumption that the data come from a multivariate normal distribution. Because the likelihood concerns the distribution of the elements of a covariance matrix, the Wishart distribution is used to derive the likelihood equation. The optimization technique used to maximize the likelihood of a solution for C and R is the Expectation Maximization (EM) technique. This technique, often used to replace missing data, is the same basic technique used in Teradata Warehouse Miner's cluster analysis algorithm. Some key points regarding this technique are described below.
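For reference, the quantity being maximized can be written in a standard textbook form. This is a sketch, not taken from the product documentation: the symbols \Sigma (the model-implied matrix) and n (the degrees of freedom of the sample covariance matrix) are notation assumed here, and the additive constant collects terms that do not depend on C or R.

```latex
\Sigma = C C^{T} + R,
\qquad
\log L(C, R) = -\frac{n}{2}\left[\,\ln\lvert\Sigma\rvert + \operatorname{tr}\!\left(\Sigma^{-1} S\right)\right] + \text{const}
```

Maximizing this log-likelihood is equivalent to minimizing \ln|\Sigma| + tr(\Sigma^{-1} S), which reaches its smallest possible value when the reproduced matrix \Sigma equals S.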

As with the other factor techniques, the procedure begins with a correlation or covariance matrix S. A principal components solution is first derived as an initial estimate of the factor pattern matrix C, and the initial estimate of the uniqueness matrix R is taken simply as S - CC^T. The maximum likelihood solution is then found iteratively, yielding a best estimate of C and R. To assess the effectiveness of the model, the correlation or covariance matrix S is compared to the reproduced matrix CC^T + R.
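The steps above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the product's implementation: the function names are hypothetical, R is assumed to be diagonal (only the diagonal of S - CC^T is kept), the EM updates are the standard ones for the factor model applied to a covariance matrix, and a fixed iteration count stands in for whatever convergence rule the software uses.

```python
import numpy as np

def pc_initial_estimate(S, f):
    """Principal components start: top-f eigenvectors scaled by sqrt(eigenvalue)."""
    eigvals, eigvecs = np.linalg.eigh(S)            # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:f]             # indices of the f largest
    C = eigvecs[:, top] * np.sqrt(eigvals[top])
    R = np.diag(S - C @ C.T).copy()                 # assumption: R is diagonal
    return C, R

def discrepancy(S, C, R):
    """ln|Sigma| + tr(Sigma^-1 S); smaller means higher likelihood."""
    Sigma = C @ C.T + np.diag(R)
    _, logdet = np.linalg.slogdet(Sigma)
    return logdet + np.trace(np.linalg.solve(Sigma, S))

def em_factors(S, f, n_iter=500, floor=1e-6):
    """Iterate EM updates for C (p x f) and the diagonal of R (length p)."""
    C, R = pc_initial_estimate(S, f)
    R = np.maximum(R, floor)                        # numerical safeguard (assumption)
    for _ in range(n_iter):
        Sigma = C @ C.T + np.diag(R)
        beta = np.linalg.solve(Sigma, C).T          # C^T Sigma^-1, shape f x p
        Ezz = np.eye(f) - beta @ C + beta @ S @ beta.T
        C = S @ beta.T @ np.linalg.inv(Ezz)         # updated factor pattern
        R = np.maximum(np.diag(S - C @ beta @ S), floor)  # updated uniquenesses
    return C, R

# Illustrative correlation matrix built from made-up loadings (not real data).
C_true = np.array([[0.8, 0.0], [0.7, 0.1], [0.6, 0.2],
                   [0.1, 0.7], [0.0, 0.8], [0.2, 0.6]])
S = C_true @ C_true.T + np.diag(1.0 - np.sum(C_true**2, axis=1))

C0, R0 = pc_initial_estimate(S, 2)
C, R = em_factors(S, 2)

# Compare S with the reproduced matrix CC^T + R to assess the fit.
residual = S - (C @ C.T + np.diag(R))
print("largest absolute residual:", np.abs(residual).max())
```

The residual matrix at the end corresponds to the comparison of S with the reproduced matrix; entries near zero indicate that the chosen number of factors reproduces S well.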

Note that when using the maximum likelihood solution, the user must first specify the number of common factors f to produce in the model. The software does not determine this value automatically, nor does it derive it from a threshold value. In addition, an internal adjustment is made to the final factor pattern matrix C to make the factors orthogonal, a property that holds automatically for the other factor solutions. Finally, the user may optionally request that the signs of a factor in the matrix C be inverted if it has more negative signs than positive ones. This is purely cosmetic and does not affect the solution in a substantive way. However, if signs are reversed, this must be kept in mind when interpreting or assigning meaning to the factors.
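The optional sign inversion can be illustrated with a short Python sketch. The function name and the example loadings are hypothetical, and the rule shown (flip a column when it has strictly more negative than positive entries) is one reading of the description above rather than the software's documented logic. Because flipping a whole column leaves CC^T unchanged, the reproduced matrix is unaffected.

```python
import numpy as np

def invert_mostly_negative_factors(C):
    """Flip the sign of each factor (column) that has more negative than positive loadings."""
    C = np.array(C, dtype=float)                    # work on a copy
    negatives = (C < 0).sum(axis=0)
    positives = (C > 0).sum(axis=0)
    flip = negatives > positives                    # boolean mask, one entry per factor
    C[:, flip] *= -1
    return C, flip

# Made-up loadings: the first factor is mostly negative, the second is not.
C_example = np.array([[-0.8,  0.1],
                      [-0.7,  0.6],
                      [ 0.2, -0.5]])
C_flipped, flipped = invert_mostly_negative_factors(C_example)
print(flipped)
print(C_flipped)
```

The returned mask records which factors were inverted, which is the information needed later when interpreting the factors.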