Factor analysis is designed to discover the underlying structure or meaning in a set of variables and to reduce them to a smaller number of variables called factors or components. The first goal is served by finding the factor loadings, which describe each variable in a dataset as a linear combination of factors. The second goal is served by finding a description of the factors as linear combinations of the original variables they describe. These are sometimes called factor measurements or scores. Once the factor loadings have been computed, computing factor scores might seem like an afterthought, but it is somewhat more involved than that. Analytics Library automates the process based on the model information stored in results tables, computing factor scores directly in the database by dynamically generating and executing SQL.
When scoring a table using a PCA factor analysis model, the scores can be calculated directly without estimation, even if an orthogonal rotation was performed. When scoring using a PCA model with an oblique rotation, a unique solution does not exist, so the scores cannot be solved for directly (a condition known as the indeterminacy of factor measurements). There are many techniques for estimating factor measurements; the one used by Analytics Library is known as estimation by regression. This technique regresses each factor on the original variables in the factor analysis model using linear regression. It gives an accurate solution in the least-squares sense, but it typically introduces some degree of dependence or correlation in the computed factor scores.
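The regression estimate described above can be sketched outside the database as follows. This is only an illustration in Python with NumPy, not the SQL that Analytics Library generates. The function name regression_factor_scores and its parameters are hypothetical, and the sketch assumes the common textbook form of the estimator, in which the score weights are the inverse of the variable correlation matrix multiplied by the structure matrix (the loadings multiplied by the factor correlations).

```python
import numpy as np

def regression_factor_scores(X, loadings, factor_corr=None):
    """Estimate factor scores by regressing factors on the variables.

    X           : (n_rows, n_vars) raw data
    loadings    : (n_vars, n_factors) factor pattern loadings
    factor_corr : (n_factors, n_factors) factor correlation matrix;
                  None is treated as the identity (orthogonal factors).
    Returns (scores, weights). All names here are illustrative.
    """
    X = np.asarray(X, dtype=float)
    L = np.asarray(loadings, dtype=float)

    # Standardize each variable (sample standard deviation, ddof=1).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

    # Correlation matrix of the original variables.
    R = np.corrcoef(X, rowvar=False)

    # Structure matrix: correlations between variables and factors.
    if factor_corr is None:
        phi = np.eye(L.shape[1])
    else:
        phi = np.asarray(factor_corr, dtype=float)
    structure = L @ phi

    # Regression weights W solve R @ W = structure.
    weights = np.linalg.solve(R, structure)

    # Each score is a linear combination of the standardized variables.
    return Z @ weights, weights
```

With unrotated principal component loadings and no factor correlation matrix, the scores returned by this sketch should come out uncorrelated with unit variance. Supplying a non-identity factor_corr, as an oblique rotation would, generally produces correlated scores.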
Independence or Orthogonality of Factor Scores
As noted earlier, factor loadings for a PCA model produced with the techniques offered by Analytics Library are orthogonal unless an oblique rotation is performed. Factor scores, however, will not necessarily be orthogonal for principal components with oblique rotations, since those scores are estimated by regression. This subtle distinction can easily cause confusion. That is, the new variables or factor scores created by a factor analysis, expressed as linear combinations of the original variables, are not necessarily independent of each other, even if the factors themselves are. You can check how strongly they are correlated, however, by using the Matrix Building function to build a correlation matrix from the factor score table after it has been created.
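The check described above amounts to inspecting the off-diagonal entries of the correlation matrix of the score columns. The following is a minimal sketch of that idea in Python with NumPy, standing in for the Matrix Building function rather than reproducing its interface. It assumes the factor scores have already been read into an array with one column per factor, and both function names are hypothetical.

```python
import numpy as np

def score_correlation_matrix(scores):
    """Correlation matrix of factor scores (one column per factor)."""
    return np.corrcoef(np.asarray(scores, dtype=float), rowvar=False)

def max_abs_offdiagonal(corr):
    """Largest absolute correlation between two different factors."""
    corr = np.asarray(corr, dtype=float)
    off_diagonal = corr - np.diag(np.diag(corr))
    return float(np.abs(off_diagonal).max())
```

A value near zero from max_abs_offdiagonal suggests the scores are close to uncorrelated. Larger values indicate the kind of dependence that regression estimation can introduce.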