5.4.5 - Factor Rotations - Teradata Warehouse Miner

Teradata Warehouse Miner User Guide - Volume 3 Analytic Functions

Product: Teradata Warehouse Miner
Release Number: 5.4.5
Published: February 2018
Language: English (United States)
Last Update: 2018-05-04

Teradata Warehouse Miner offers a number of techniques for rotating factors in order to find the elusive quality of simple structure described earlier. These may optionally be used in combination with any of the factor techniques offered in the product. When a rotation is performed, both the rotated matrix and the rotation matrix are reported, as well as the reproduced correlation or covariance matrix after rotation. As with the factor solutions themselves, the user may optionally request that the signs of a factor in the rotated factor or components matrix be inverted if it has more negative signs than positive ones. This is purely cosmetic and does not affect the solution in a substantive way.

Orthogonal rotations

First consider orthogonal rotations, that is, rotations of a factor matrix A that result in a rotated factor matrix B by way of an orthogonal transformation matrix T (i.e., B = AT). Remember that the nice thing about orthogonal rotations on a factor matrix is that the resulting factor scores are uncorrelated, a desirable property when the factors are going to be used in subsequent regression, cluster, or other types of analysis. But how is simple structure obtained?
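The relation B = AT can be illustrated with a minimal sketch. The loading values and the 40-degree angle below are hypothetical, chosen only for illustration; the point is that an orthogonal T leaves each variable's communality and the reproduced correlation matrix unchanged.

```python
import math

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]

# Hypothetical unrotated loadings: 4 variables (rows) by 2 factors (columns).
A = [[0.7, 0.5],
     [0.6, 0.6],
     [0.6, -0.5],
     [0.7, -0.4]]

# An orthogonal transformation matrix T: a plane rotation by 40 degrees.
theta = math.radians(40)
T = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]

B = matmul(A, T)  # rotated factor matrix, B = AT

# Because T'T = I, row sums of squares (communalities) do not change ...
communalities_A = [sum(a * a for a in row) for row in A]
communalities_B = [sum(b * b for b in row) for row in B]

# ... and neither does the reproduced correlation matrix, AA' = BB'.
R_A = matmul(A, transpose(A))
R_B = matmul(B, transpose(B))
```

Any orthogonal T would do here; the rotation criteria discussed in this section differ only in how they choose T.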

As described earlier, the idea behind simple structure is to express each component or factor in terms of fewer variables that are highly correlated with the factor, with the remaining variables not so correlated with the factor. The two most famous mathematical criteria for simple factor structure are the quartimax and varimax criteria. Simply put, the varimax criterion seeks to simplify the structure of columns or factors in the factor loading matrix, whereas the quartimax criterion seeks to simplify the structure of the rows or variables in the factor loading matrix. Less simply put, the varimax criterion seeks to maximize the variance of the squared loadings across the variables, summed over all factors. The quartimax criterion seeks to maximize the variance of the squared loadings across the factors, summed over all variables. The solution to either optimization problem is mathematically quite involved, though in principle it is based on fundamental techniques of linear algebra and differential calculus, together with the popular Newton-Raphson iterative technique for finding the roots of equations.
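As a rough illustration of the two criteria, the sketch below computes each one directly from its verbal definition. The function names and the two loading matrices are hypothetical; `simple` is `mixed` rotated by 45 degrees, so both describe the same factor space.

```python
import math

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def varimax_criterion(L):
    """Variance of squared loadings down each column (factor), summed over factors."""
    return sum(variance([x * x for x in col]) for col in zip(*L))

def quartimax_criterion(L):
    """Variance of squared loadings along each row (variable), summed over variables."""
    return sum(variance([x * x for x in row]) for row in L)

# Every variable loads equally on both factors: no simple structure.
mixed = [[0.6, 0.6],
         [0.6, 0.6],
         [0.6, -0.6],
         [0.6, -0.6]]

# The same loadings after a 45-degree rotation: each variable loads on one factor.
s = math.sqrt(0.72)
simple = [[s, 0.0],
          [s, 0.0],
          [0.0, -s],
          [0.0, -s]]

for name, L in (("mixed", mixed), ("simple", simple)):
    print(name, varimax_criterion(L), quartimax_criterion(L))
```

Both criteria are zero for `mixed` and strictly positive for `simple`, which is the sense in which a rotation toward simple structure increases them.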

Regardless of the criterion used, rotations are performed on normalized loadings; that is, prior to rotating, the rows of the factor loading matrix are set to unit length by dividing each element by the square root of the communality for that variable. The rows are restored to their original length after the rotation is performed. This has been found to improve results, particularly for the varimax method.
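The normalization step can be sketched as follows. The helper names and loading values are hypothetical, and the sketch assumes every variable has a nonzero communality.

```python
import math

def normalize_rows(L):
    """Scale each row to unit length; also return the scale factors used.

    Each scale factor is the square root of that variable's communality
    (the row sum of squared loadings). Assumes no row is all zeros.
    """
    h = [math.sqrt(sum(x * x for x in row)) for row in L]
    return [[x / hi for x in row] for row, hi in zip(L, h)], h

def denormalize_rows(L, h):
    """Restore rows to their original length using the saved scale factors."""
    return [[x * hi for x in row] for row, hi in zip(L, h)]

# Hypothetical loadings: 3 variables by 2 factors.
A = [[0.7, 0.5],
     [0.6, -0.5],
     [0.3, 0.2]]

normalized, h = normalize_rows(A)
row_lengths = [math.sqrt(sum(x * x for x in row)) for row in normalized]

# A rotation would be applied to `normalized` here; an orthogonal rotation
# keeps row lengths at 1, so the same factors `h` undo the scaling afterward.
restored = denormalize_rows(normalized, h)
```

Without this step, variables with large communalities dominate the criterion; after it, every variable carries equal weight during the rotation.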

Fortunately, both the quartimax and varimax criteria can be expressed in terms of the same equation containing a constant value that is 0 for quartimax and 1 for varimax. The orthomax criterion is then obtained simply by setting this constant, call it gamma, to any desired value. Equamax corresponds to setting gamma to half the number of factors, and parsimax to setting gamma to v(f-1) / (v+f-2), where v is the number of variables and f is the number of factors.
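A minimal sketch of a gamma-parameterized orthogonal rotation follows. It assumes one common form of the shared equation, namely maximizing the sum over factors of (sum of fourth powers of loadings) minus (gamma / v) times (sum of squared loadings) squared. It also swaps in Kaiser-style pairwise plane rotations rather than the Newton-Raphson approach mentioned above, so it should not be read as the product's algorithm. The function names and data are hypothetical, and the row normalization described above is omitted to keep the sketch short.

```python
import math

def orthomax_criterion(B, gamma):
    """Sum over factors of sum(b^4) - (gamma / v) * (sum(b^2))^2."""
    v = len(B)
    total = 0.0
    for col in zip(*B):
        sq = [x * x for x in col]
        total += sum(s * s for s in sq) - gamma * sum(sq) ** 2 / v
    return total

def orthomax(A, gamma, max_sweeps=50, tol=1e-10):
    """Rotate loadings A (list of rows) by repeated pairwise plane rotations."""
    B = [row[:] for row in A]
    v, f = len(B), len(B[0])
    for _ in range(max_sweeps):
        largest = 0.0
        for j in range(f - 1):
            for k in range(j + 1, f):
                u = [row[j] ** 2 - row[k] ** 2 for row in B]
                w = [2.0 * row[j] * row[k] for row in B]
                su, sw = sum(u), sum(w)
                num = 2.0 * sum(a * b for a, b in zip(u, w)) - gamma * 2.0 * su * sw / v
                den = sum(a * a - b * b for a, b in zip(u, w)) - gamma * (su * su - sw * sw) / v
                phi = 0.25 * math.atan2(num, den)  # angle maximizing the criterion for this pair
                largest = max(largest, abs(phi))
                c, s = math.cos(phi), math.sin(phi)
                for row in B:
                    row[j], row[k] = c * row[j] + s * row[k], -s * row[j] + c * row[k]
        if largest < tol:  # no pair needed a meaningful rotation in this sweep
            break
    return B

# Hypothetical unrotated loadings: 4 variables by 2 factors.
A = [[0.7, 0.5],
     [0.6, 0.6],
     [0.6, -0.5],
     [0.7, -0.4]]

f = len(A[0])
results = {}
for name, gamma in (("quartimax", 0.0), ("varimax", 1.0), ("equamax", f / 2)):
    results[name] = orthomax(A, gamma)
```

Only `gamma` changes between the named criteria; the rotation loop is identical. With two factors, equamax (gamma = f / 2 = 1) coincides with varimax.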

Oblique rotations

As mentioned earlier, oblique rotations relax the requirement for factor independence that exists with orthogonal rotations, while more aggressively seeking better data alignment. Teradata Warehouse Miner uses a technique known as the indirect oblimin method. As with orthogonal rotations, there is a common equation for the oblique simple structure criterion that contains a constant that can be set for various effects. A value of 0 for this constant, again called gamma, yields the quartimin solution, which is the most oblique of the solutions offered. A value of 1 yields the covarimin solution, the least oblique. A value of 0.5 yields the biquartimin solution, a compromise between the two. A solution known as orthomin can be obtained by setting gamma to any desired positive value.
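The role of gamma in the oblique case can be sketched by evaluating a commonly cited form of the oblimin criterion on a reference structure matrix V: for each pair of factors, the sum of products of squared loadings minus (gamma / n) times the product of the column sums of squared loadings, which the rotation seeks to minimize. Whether this matches the product's equation term for term is an assumption; the function name and data are hypothetical.

```python
def oblimin_criterion(V, gamma):
    """Oblimin criterion (to be minimized) for a loading matrix V and constant gamma.

    Sums, over all pairs of factors j < k:
        sum_i v_ij^2 * v_ik^2  -  (gamma / n) * (sum_i v_ij^2) * (sum_i v_ik^2)
    where n is the number of variables (rows).
    """
    n, f = len(V), len(V[0])
    sq = [[x * x for x in row] for row in V]
    total = 0.0
    for j in range(f - 1):
        for k in range(j + 1, f):
            cross = sum(row[j] * row[k] for row in sq)
            sum_j = sum(row[j] for row in sq)
            sum_k = sum(row[k] for row in sq)
            total += cross - gamma * sum_j * sum_k / n
    return total

# Named special cases of gamma, as listed in the text.
GAMMAS = {"quartimin": 0.0, "biquartimin": 0.5, "covarimin": 1.0}

# Hypothetical matrices: one with perfect simple structure, one without.
simple = [[0.8, 0.0], [0.7, 0.0], [0.0, 0.9], [0.0, 0.6]]
mixed = [[0.6, 0.5], [0.5, 0.5], [0.5, 0.6], [0.4, 0.6]]

values = {name: (oblimin_criterion(simple, g), oblimin_criterion(mixed, g))
          for name, g in GAMMAS.items()}
```

With gamma = 0 (quartimin) the criterion is zero exactly when no variable loads on more than one factor, so `simple` scores 0 and `mixed` scores higher.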

One of the distinctions of a factor solution that incorporates an oblique rotation is that the factor loadings must be thought of in terms of two different matrices, the factor pattern matrix P and the factor structure matrix S. These are related by the equation S = PQ, where Q is the matrix of correlations between factors. If the factors are not correlated, as in an unrotated solution or after an orthogonal rotation, then Q is the identity matrix and the structure and pattern matrices are the same. The result of an oblique rotation must include both the pattern matrix that describes the common factors and the structure matrix of correlations between the factors and original variables.
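A small numeric sketch of S = PQ, using hypothetical pattern loadings and a hypothetical factor correlation of 0.3:

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

# Hypothetical factor pattern matrix P: 4 variables by 2 factors.
P = [[0.8, 0.1],
     [0.7, 0.0],
     [0.1, 0.9],
     [0.0, 0.6]]

# Hypothetical factor correlation matrix Q: the two factors correlate at 0.3.
Q = [[1.0, 0.3],
     [0.3, 1.0]]

S = matmul(P, Q)  # factor structure matrix, S = PQ

# With uncorrelated factors, Q is the identity and S equals P.
I = [[1.0, 0.0],
     [0.0, 1.0]]
S_uncorrelated = matmul(P, I)
```

The first variable has a pattern loading of only 0.1 on the second factor, but its structure entry is 0.1 + 0.8 * 0.3 = 0.34, because it correlates with that factor indirectly through the first one.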

As with orthogonal rotations, oblique rotations are performed on normalized loadings that are restored to their original size after rotation. A unique characteristic of the indirect oblimin method of rotation is that it is performed on a reference structure based on the normals of the original factor space. This has no inherent value; it is just a side effect of the technique. It means, however, that an oblique rotation results in reference factor pattern, structure, and rotation matrices that are then converted back into the original factor space as the final primary factor pattern, structure, and rotation matrices.
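The conversion from reference to primary matrices can be sketched with textbook relations for the two-factor case. This is an illustration under stated assumptions, not necessarily the product's procedure: the reference transformation `Lam` has unit-length columns, the primary transformation `T` is the inverse of the transpose of `Lam` with its columns rescaled to unit length, and all names and values are hypothetical.

```python
import math

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]

def inverse_2x2(M):
    """Inverse of a 2 x 2 matrix; assumes a nonzero determinant."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical unrotated loadings (4 variables, 2 uncorrelated factors).
A = [[0.7, 0.5], [0.6, 0.6], [0.6, -0.5], [0.7, -0.4]]

# Hypothetical reference transformation: unit-length, non-orthogonal columns.
a, b = math.radians(-55), math.radians(50)
Lam = [[math.cos(a), math.cos(b)],
       [math.sin(a), math.sin(b)]]

V = matmul(A, Lam)  # reference structure, the matrix the rotation works on

# Primary transformation: columns of inverse(Lam') rescaled to unit length.
M = inverse_2x2(transpose(Lam))
norms = [math.sqrt(sum(row[j] ** 2 for row in M)) for j in range(2)]
T = [[row[j] / norms[j] for j in range(2)] for row in M]

P = [[row[j] * norms[j] for j in range(2)] for row in V]  # primary pattern
Q = matmul(transpose(T), T)                                # primary factor correlations
S = matmul(A, T)                                           # primary structure
```

In this sketch the primary pattern is just the reference structure with each column rescaled, and `S` agrees with `matmul(P, Q)` up to rounding, consistent with the relation S = PQ.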