Factor analysis is designed primarily to discover the underlying structure or meaning in a set of variables and to reduce them to a smaller number of variables called factors or components. The first goal is served by finding the factor loadings, which describe each variable in a data set as a linear combination of factors. The second goal is served by describing the factors as linear combinations of the original variables; the resulting values are sometimes called factor measurements or scores. After the factor loadings are computed, computing factor scores might seem like an afterthought, but it is somewhat more involved than that. Teradata Warehouse Miner automates the process, however, using the model information stored in metadata results tables and computing factor scores directly in the database by dynamically generating and executing SQL.
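The two descriptions above can be written side by side. The notation here is an assumption introduced for illustration and is not taken from the product documentation: x_j is the j-th standardized variable, f_k the k-th factor, \lambda_{jk} a factor loading, e_j the part of x_j not explained by the retained factors, and b_{jk} a factor score coefficient.

```latex
% Loadings: each variable as a linear combination of the m factors
x_j = \sum_{k=1}^{m} \lambda_{jk} f_k + e_j , \qquad j = 1, \dots, p

% Scores: each factor score as a linear combination of the p variables
\hat{f}_k = \sum_{j=1}^{p} b_{jk} x_j , \qquad k = 1, \dots, m
```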
When scoring a table using a PCA factor analysis model, the scores can be calculated directly without estimation, even if an orthogonal rotation was performed. When scoring using a PAF or MLF model, or a PCA model with an oblique rotation, a unique solution does not exist, so the scores cannot be solved for directly (a condition known as the indeterminacy of factor measurements). There are many techniques for estimating factor measurements, however, and the technique used by Teradata Warehouse Miner is known as estimation by regression. This technique regresses each factor on the original variables in the factor analysis model using linear regression. It gives a solution that is optimal in the least-squares sense, but it typically introduces some degree of dependence or correlation in the computed factor scores.
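As a rough illustration of estimation by regression, the sketch below uses the common textbook form for orthogonal factors: the score coefficients B solve R B = L, where R is the correlation matrix of the variables and L is the loading matrix, and the scores are Z B for standardized data Z. This is not the SQL that Teradata Warehouse Miner generates. The function name, the loading values, and the simulated data are all hypothetical, and the oblique case (which would use the factor structure matrix in place of L) is not covered.

```python
import numpy as np

def regression_factor_scores(Z, loadings):
    """Estimate factor scores by regression (orthogonal factors assumed).

    Z        : (n, p) array of standardized observations
    loadings : (p, m) array of factor loadings
    Returns the (n, m) estimated scores and the (p, m) score coefficients.
    """
    R = np.corrcoef(Z, rowvar=False)   # (p, p) correlation matrix of the variables
    B = np.linalg.solve(R, loadings)   # score coefficients: solves R @ B = loadings
    return Z @ B, B

# Hypothetical two-factor model, used only to generate illustrative data.
rng = np.random.default_rng(0)
true_loadings = np.array([
    [0.8, 0.0],
    [0.7, 0.0],
    [0.6, 0.3],
    [0.0, 0.8],
    [0.0, 0.7],
    [0.3, 0.6],
])
n = 2000
factors = rng.standard_normal((n, 2))
uniqueness = 1.0 - (true_loadings ** 2).sum(axis=1)   # unexplained variance per variable
noise = rng.standard_normal((n, 6)) * np.sqrt(uniqueness)
X = factors @ true_loadings.T + noise

# Standardize each column before scoring.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# For simplicity the known loadings stand in for loadings estimated by PAF or MLF.
scores, B = regression_factor_scores(Z, true_loadings)
print(np.round(B, 3))   # one column of regression weights per factor
```

Each column of `B` holds the weights that turn the standardized variables into one estimated factor score, which is the "linear combination of the original variables" described above.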
A final word about the independence or orthogonality of factor scores is appropriate here. It was pointed out earlier that factor loadings are orthogonal when using the techniques offered by Teradata Warehouse Miner unless an oblique rotation is performed. Factor scores, however, will not necessarily be orthogonal for principal axis factors, maximum likelihood factors, or oblique rotations, since in those cases the scores are estimated by regression. This subtle distinction is an easy source of confusion. That is, the new variables or factor scores created by a factor analysis, expressed as linear combinations of the original variables, are not necessarily independent of each other, even if the factors themselves are. The user can measure their independence, however, by using the Matrix and Export Matrix functions to build a correlation matrix from the factor score table once it is built.
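The same kind of check can be sketched outside the product, assuming the factor score table has been exported to an array with one column per factor. The helper name and the simulated scores below are hypothetical. The code stands in for the Matrix and Export Matrix functions and does not reproduce them.

```python
import numpy as np

def score_correlation(scores):
    """Return the correlation matrix of the score columns and the
    largest absolute off-diagonal correlation."""
    corr = np.corrcoef(scores, rowvar=False)
    off_diag = corr - np.diag(np.diag(corr))   # zero out the diagonal
    return corr, float(np.abs(off_diag).max())

# Hypothetical score table: the second column partly depends on the first,
# imitating the correlation that regression-estimated scores can show.
rng = np.random.default_rng(1)
f1 = rng.standard_normal(1000)
f2 = 0.3 * f1 + rng.standard_normal(1000)
scores = np.column_stack([f1, f2])

corr, worst = score_correlation(scores)
print(np.round(corr, 3))
print("largest off-diagonal correlation:", round(worst, 3))
```

Off-diagonal values near zero suggest the scores are close to uncorrelated. Larger values indicate the dependence that estimation by regression can introduce.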