fa.parallel(x, n.obs = NULL, fm = "minres", fa = "both",
    main = "Parallel Analysis Scree Plots", n.iter = 20,
    error.bars = FALSE, SMC = FALSE, ylabel = NULL, show.legend = TRUE)
fa.parallel.poly(x, n.iter = 10, SMC = TRUE, fm = "minres")
plot.poly.parallel(x, show.legend = TRUE, ...)
fa for details, VSS, and Velicer's MAP procedure (included in VSS). fa.parallel plots the eigen values for a principal components solution and for a factor solution (minres by default), and does the same for random matrices of the same size as the original data matrix. For raw data, the random matrices are (1) a matrix of univariate normal data and (2) random samples (randomized across rows) of the original data.

fa.parallel.poly will do parallel analysis for polychoric and tetrachoric factors. If the data are dichotomous, fa.parallel.poly will find tetrachoric correlations for the real and simulated data; otherwise, if the number of categories is less than 10, it will find polychoric correlations. Note that fa.parallel.poly is much slower than fa.parallel because of the complexity of calculating the tetrachoric/polychoric correlations.
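As a hedged sketch of the polychoric case, the following assumes the psych package and its bfi data set (polytomous personality items with six response categories, so polychoric correlations will be used); only a few items and iterations are shown because the correlations are slow to compute:

```r
# Sketch (assumes the psych package and its bfi data set):
# items have 6 response categories (< 10), so fa.parallel.poly
# will compute polychoric rather than tetrachoric correlations
library(psych)
fa.parallel.poly(bfi[1:5], n.iter = 2)  # few items/iterations: it is slow
```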
The means of the n.iter random solutions are shown. Error bars are usually very small and are suppressed by default, but can be shown if requested (error.bars=TRUE).
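A minimal sketch of requesting the error bars (assuming the psych package and its bfi data set as input):

```r
# Sketch (assumes the psych package and its bfi data set):
# show error bars around the mean simulated eigen values
library(psych)
fa.parallel(bfi[1:25], n.iter = 20, error.bars = TRUE)
```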
Alternative ways to estimate the number of factors are discussed in the Very Simple Structure (Revelle and Rocklin, 1979) documentation (VSS) and include Wayne Velicer's MAP algorithm (Velicer, 1976).
Parallel analysis for factors is actually harder than it seems, for the question is what are the appropriate communalities to use. If communalities are estimated by the Squared Multiple Correlation (SMC; see smc), then the eigen values of the original data will reflect major as well as minor factors (see sim.minor to simulate such data). Random data will not, of course, have any such structure, and thus the number of factors will tend to be biased upwards by the presence of the minor factors.
By default, fa.parallel estimates the communalities based upon a one factor minres solution. Although this will underestimate the communalities, it does seem to lead to better solutions on simulated or real (e.g., the bfi or Harman74) data sets.
For comparability with other algorithms (e.g., the paran function in the paran package), setting SMC=TRUE will use SMCs as estimates of communalities. This will tend to identify more factors than the default option.
Printing the results will show the eigen values of the original data that are greater than simulated values.
Horn, John (1965) A rationale and test for the number of factors in factor analysis. Psychometrika, 30, 179-185.
Humphreys, Lloyd G. and Montanelli, Richard G. (1975), An investigation of the parallel analysis criterion for determining the number of common factors. Multivariate Behavioral Research, 10, 193-205.
Revelle, William and Rocklin, Tom (1979) Very simple structure - alternative procedure for estimating the optimal number of interpretable factors. Multivariate Behavioral Research, 14(4):403-414.
Velicer, Wayne (1976) Determining the number of components from the matrix of partial correlations. Psychometrika, 41(3), 321-327.
fa, VSS, VSS.plot, VSS.parallel, sim.minor
test.data <- Harman74.cor$cov   # a 24-variable correlation matrix
fa.parallel(test.data, n.obs = 145)
set.seed(123)
minor <- sim.minor(24, 4, 400)             # 4 major and 12 minor factors
fa.parallel(minor$observed)                # shows 4 factors -- compare with
fa.parallel(minor$observed, SMC = TRUE)    # which shows 8 factors