A function to simulate factor loadings matrices and Monte Carlo data sets for common factor models, bifactor models, and IRT models.
simFA(
  Model = list(),
  Loadings = list(),
  CrossLoadings = list(),
  Phi = list(),
  ModelError = list(),
  Bifactor = list(),
  MonteCarlo = list(),
  FactorScores = list(),
  Missing = list(),
  Control = list(),
  Seed = NULL
)
loadings
A common factor or bifactor
loadings matrix.
Phi
A factor correlation matrix.
urloadings
The unrotated loadings matrix.
h2
A vector of item communalities.
h2PopME
A vector of item communalities that
may include model approximation error.
Rpop
The model-implied population correlation
matrix.
RpopME
The model-implied population
correlation matrix with model error.
W
The factor loadings for the minor factors
(when ModelError = TRUE). Default = NULL.
Xm
That part of the observed scores that
is due to the minor common factors.
SFSvars
Variances of the Specific Factors
in the metric of the observed scores.
ModelErrorFitStats
A list of model fit
indices (for the underlying equations, see: Bentler,
1990; Hu & Bentler, 1999; Marsh, Hau, & Grayson,
2005; Steiger, 2016):
SRMR_theta
Standardized Root Mean
Square Residual based on the model that is
implied by the error free major factors
only (underlying Rpop),
SRMR_thetahat
Standardized Root
Mean Square Residual based on an exploratory
factor analysis of the population
correlation matrix, RpopME,
CRMR_theta
Correlation Root Mean
Square Residual based on the model that is
implied by the error free major factors
only (underlying Rpop),
CRMR_thetahat
Correlation Root Mean
Square Residual based on an exploratory factor
analysis of the population correlation matrix,
RpopME,
RMSEA_theta
Root Mean Square Error
of Approximation (Steiger, 2016) based on the
model that is implied by the error free major
factors only (underlying Rpop),
RMSEA_thetahat
Root Mean Square
Error of Approximation (Steiger, 2016) based
on an exploratory factor analysis of the
population correlation matrix, RpopME,
CFI_theta
Comparative Fit Index
(Bentler, 1990) based on the model that is
implied by the error free major factors
only (underlying Rpop),
CFI_thetahat
Comparative Fit Index
(Bentler, 1990) based on an exploratory
factor analysis of the population
correlation matrix, RpopME.
Fm
MLE fit function for population
target model.
Fb
MLE fit function for population
baseline model.
DFm
Degrees of freedom for
population target model.
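The fit indices listed above can be illustrated with a small base R sketch. It is not simFA's implementation: the toy loadings, the single minor factor, the inclusion of the diagonal in SRMR but not in CRMR, and the exploratory-model degrees of freedom are all assumptions made for illustration. It covers only the "theta" versions, which compare RpopME with the error-free Rpop.

```r
# Toy population: 9 items, 3 orthogonal major factors, loadings fixed at .6
p <- 9; k <- 3
Lambda <- kronecker(diag(k), matrix(0.6, 3, 1))  # 9 x 3 simple structure
h2     <- rowSums(Lambda^2)                       # communalities
u2     <- 1 - h2                                  # uniquenesses
Rpop   <- tcrossprod(Lambda) + diag(u2)           # error-free model

# One hypothetical minor factor that absorbs 10% of each uniqueness
w      <- sqrt(0.10 * u2) * rep(c(1, -1), length.out = p)
RpopME <- tcrossprod(Lambda) + tcrossprod(w) + diag(0.90 * u2)

# Residual-based indices (assumed convention: SRMR includes the diagonal,
# CRMR uses only the off-diagonal correlations)
Res  <- RpopME - Rpop
SRMR <- sqrt(sum(Res[lower.tri(Res, diag = TRUE)]^2) / (p * (p + 1) / 2))
CRMR <- sqrt(sum(Res[lower.tri(Res)]^2) / (p * (p - 1) / 2))

# ML discrepancy: F(S, Sigma) = log|Sigma| - log|S| + tr(S Sigma^-1) - p
Fml <- function(S, Sigma) {
  as.numeric(determinant(Sigma)$modulus - determinant(S)$modulus) +
    sum(diag(S %*% solve(Sigma))) - nrow(S)
}
Fm  <- Fml(RpopME, Rpop)           # population target model
Fb  <- Fml(RpopME, diag(p))        # population baseline (independence) model
DFm <- ((p - k)^2 - (p + k)) / 2   # assumed: df of an exploratory k-factor model

RMSEA <- sqrt(Fm / DFm)
CFI   <- 1 - Fm / Fb
round(c(SRMR = SRMR, CRMR = CRMR, RMSEA = RMSEA, CFI = CFI), 3)
```

The "thetahat" versions would replace Rpop with the correlation matrix implied by an exploratory factor analysis of RpopME, which this sketch does not attempt.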
CovMatrices
A list containing:
CovMajor
The model implied
covariances from the major factors.
CovMinor
The model implied
covariances from the minor factors.
CovUnique
The model implied
variances from the uniqueness factors.
Bifactor
A list containing:
loadingsHier
Factor loadings of the
1st order solution of a hierarchical
bifactor model.
PhiHier
Factor correlations of the
1st order solution of a hierarchical bifactor
model.
Scores
A list containing:
FactorScores
Factor scores for the
common and uniqueness factors.
FacInd
Factor indeterminacy indices
for the error free population model.
FacIndME
Factor score indeterminacy
indices for the population model with model
error.
ObservedScores
A matrix of model-implied ObservedScores. If Thresholds
were supplied under keyword FactorScores, ObservedScores
will be transformed into Likert scores.
Monte
A list containing output from the
Monte Carlo simulations if generated.
IRT
Factor loadings expressed in the normal
ogive IRT metric. If Thresholds were given
then IRT difficulty values will also be returned.
Seed
The initial seed for the random
number generator.
call
A copy of the function call.
cn
A list of all active and nonactive
function arguments.
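As a rough guide to how the loadings, Phi, h2, and Rpop components relate, the following base R sketch applies the standard common factor identities to a made-up two-factor pattern. The matrices are hypothetical and the code does not call simFA.

```r
# Hypothetical 6 x 2 loadings matrix and factor correlation matrix
Lambda <- rbind(c(.7, 0), c(.6, 0), c(.5, 0),
                c(0, .7), c(0, .6), c(0, .5))
Phi <- matrix(c(1, .3,
                .3, 1), 2, 2)

# Common part of the correlation matrix: Lambda Phi Lambda'
Common <- Lambda %*% Phi %*% t(Lambda)

# Communalities are its diagonal elements
h2 <- diag(Common)

# The model-implied correlation matrix adds the uniquenesses (1 - h2) to the
# diagonal, which gives a unit diagonal
Rpop <- Common
diag(Rpop) <- 1

round(h2, 2)
round(Rpop, 3)
```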
Model
(list)
NFac
(scalar) Number of common or group factors; defaults to NFac = 3.
NItemPerFac
(scalar) All factors have the same number of primary loadings.
(vector) A vector of length NFac specifying the number of
primary loadings for each factor; defaults to NItemPerFac = 3.
Model
(character) "orthogonal" or "oblique"; defaults to
Model = "orthogonal".
Loadings
(list)
FacPattern
(NULL or matrix).
FacPattern = M, where M is a user-defined factor pattern matrix.
FacPattern = NULL: simFA will generate a factor pattern based on
the arguments specified under other keywords
(e.g., Model, CrossLoadings, etc.);
defaults to FacPattern = NULL.
FacLoadDist
(character) Specifies the sampling distribution for the common
factor loadings. Possible values are "runif", "rnorm",
"sequential", and "fixed"; defaults to FacLoadDist = "runif".
FacLoadRange
(vector of length NFac, 2, or 1); defaults to
FacLoadRange = c(.3, .7).
If FacLoadDist = "runif", the vector defines the bounds of the
uniform distribution.
If FacLoadDist = "rnorm", the vector defines the mean and
standard deviation of the normal distribution from which
loadings are sampled.
If FacLoadDist = "sequential", the vector specifies the lower
and upper bound of the loadings sequence.
If FacLoadDist = "fixed" and FacLoadRange is a vector of length 1,
then all common loadings will equal the constant specified in
FacLoadRange. If FacLoadDist = "fixed" and FacLoadRange is a
vector of length NFac, then each factor will have fixed loadings
as specified by the associated element in FacLoadRange.
h2
(vector) An optional vector of communalities used to constrain
the population communalities to user-defined values; defaults
to h2 = NULL.
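The four FacLoadDist options can be pictured with a small base R helper. The function `draw_loadings` is hypothetical and only mirrors the descriptions above; in particular, treating "sequential" as an evenly spaced ascending sequence is an assumption about how simFA orders the values.

```r
# Hypothetical helper: draw n primary loadings for a single factor
draw_loadings <- function(n, dist, range) {
  switch(dist,
    runif      = runif(n, min = range[1], max = range[2]),    # bounds
    rnorm      = rnorm(n, mean = range[1], sd = range[2]),    # mean, sd
    sequential = seq(range[1], range[2], length.out = n),     # lower to upper
    fixed      = rep(range[1], n),                            # constant
    stop("unknown distribution: ", dist)
  )
}

set.seed(1)
draw_loadings(3, "runif",      c(.3, .7))
draw_loadings(3, "rnorm",      c(.5, .05))
draw_loadings(3, "sequential", c(.3, .7))
draw_loadings(3, "fixed",      .5)
```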
CrossLoadings
(list)
ProbCrossLoad
(scalar) A value in the [0,1] interval that determines the
probability that a cross loading will be present in elements of
the loadings matrix that do not have salient (primary) factor
loadings. If set to ProbCrossLoad = 1, a single cross loading
will be added to each factor; defaults to ProbCrossLoad = 0.
CrossLoadRange
(vector of length 2) Controls the size of the cross loadings;
defaults to CrossLoadRange = c(.20, .25).
CrossLoadPositions
(matrix) Specifies the row and column positions of (optional)
cross loadings; defaults to CrossLoadPositions = NULL.
CrossLoadValues
(vector) If CrossLoadPositions is specified, then
CrossLoadValues is a vector of user-supplied cross loadings;
defaults to CrossLoadValues = NULL.
CrudFactor
(scalar) Controls the size of tertiary factor loadings.
If CrudFactor != 0, then elements of the loadings matrix with
neither primary nor secondary (i.e., cross) loadings will be
sampled from a uniform distribution on
[-CrudFactor, CrudFactor]; defaults to CrudFactor = 0.
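A minimal base R sketch of user-positioned cross loadings and crud loadings follows. It assumes that the positions are given as a two-column matrix of (row, column) pairs, which is an illustrative guess at the layout rather than a statement of simFA's required input. The position and value objects are hypothetical.

```r
set.seed(1)

# 9 x 3 simple structure pattern with primary loadings of .6
L <- kronecker(diag(3), matrix(.6, 3, 1))

# Hypothetical cross loading positions (row, column) and their values
pos  <- rbind(c(1, 2), c(4, 3), c(7, 1))
vals <- c(.25, .20, .22)
L[pos] <- vals                      # two-column matrix indexing

# Crud loadings: fill the remaining empty cells from U(-crud, crud)
crud  <- 0.10
empty <- L == 0
L[empty] <- runif(sum(empty), min = -crud, max = crud)

round(L, 2)
```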
Phi
(list)
MaxAbsPhi
(scalar) Upper (absolute) bound on factor correlations;
defaults to MaxAbsPhi = .5.
EigenValPower
(scalar) Controls the skewness of the eigenvalues of Phi.
Larger values of EigenValPower result in a Phi spectrum that is
more right-skewed (and thus closer to a unidimensional model);
defaults to EigenValPower = 2.
PhiType
(character); defaults to PhiType = "free".
If PhiType = "free", factor correlations will be randomly
generated under the constraints of MaxAbsPhi and EigenValPower.
If PhiType = "fixed", all factor correlations will equal the
value specified in MaxAbsPhi. A fatal error will be produced if
Phi is not positive semidefinite.
If PhiType = "user", the factor correlations are defined by the
matrix specified in UserPhi (see below).
UserPhi
(matrix) A positive semidefinite (PSD) matrix of user-defined
factor correlations; defaults to UserPhi = NULL.
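The positive semidefiniteness requirement can be checked before a matrix is passed as UserPhi or before choosing a fixed correlation. The helpers `is_psd` and `fixed_phi` below are hypothetical base R functions written for this illustration; they are not part of simFA.

```r
# Hypothetical helper: TRUE if the smallest eigenvalue is (numerically) >= 0
is_psd <- function(R, tol = 1e-8) {
  min(eigen(R, symmetric = TRUE, only.values = TRUE)$values) >= -tol
}

# Hypothetical helper: equicorrelated Phi, as PhiType = "fixed" describes
fixed_phi <- function(nfac, r) {
  Phi <- matrix(r, nfac, nfac)
  diag(Phi) <- 1
  Phi
}

# Three factors all correlated .5: eigenvalues are 2, .5, .5
is_psd(fixed_phi(3, .5))

# Three factors all correlated -.6: smallest eigenvalue is 1 + 2 * (-.6) = -.2,
# so no such correlation matrix exists
is_psd(fixed_phi(3, -.6))
```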
ModelError
(list)
ModelError
(logical) If ModelError = TRUE, model error will be introduced
into the factor pattern via the method described by Tucker,
Koopman, and Linn (TKL, 1969); defaults to ModelError = FALSE.
W
(matrix) An optional user-supplied factor loading matrix for the
NMinorFac minor common factors; defaults to W = NULL.
NMinorFac
(scalar) Number of minor factors in the TKL model; defaults to
NMinorFac = 150.
ModelErrorType
(character) If ModelErrorType = "U", then ModelErrorVar is the
proportion of uniqueness variance that is due to model error.
If ModelErrorType = "V", then ModelErrorVar is the proportion of
total variance that is due to model error; defaults to
ModelErrorType = "U".
ModelErrorVar
(scalar [0,1]) The proportion of uniqueness (U) or total (V)
variance that is due to model error; defaults to
ModelErrorVar = .10.
epsTKL
(scalar [0,1]) Controls the size of the factor loadings in
successive minor factors; defaults to epsTKL = .20.
Wattempts
(scalar > 0) Maximum number of tries when attempting to generate
a suitable W matrix. Default Wattempts = 10000.
WmaxLoading
(scalar > 0) Threshold value for NWmaxLoading.
Default WmaxLoading = .30.
NWmaxLoading
(scalar >= 0) Maximum number of absolute loadings >= WmaxLoading
in any column of W (matrix of model approximation error factor
loadings). Default NWmaxLoading = 2.
Under the defaults, no column of W will have 3 or more loadings
with an absolute value >= .30.
PrintW
(logical) If PrintW = TRUE, then simFA will print the attempt
history when searching for a suitable W matrix given the
constraints defined in WmaxLoading and NWmaxLoading.
Default PrintW = FALSE.
RSpecific
(matrix) Optional correlation matrix for specific factors;
defaults to RSpecific = NULL.
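The roles of NMinorFac, epsTKL, ModelErrorType, ModelErrorVar, WmaxLoading, and NWmaxLoading can be sketched in base R as follows. This is a simplified reading of the TKL idea (random minor loadings whose columns shrink geometrically, then a row rescaling to hit the target variance) and should not be taken as simFA's actual algorithm. All object names are hypothetical.

```r
set.seed(1)
p       <- 9                 # number of items
n_minor <- 150               # NMinorFac
eps     <- 0.20              # epsTKL
me_var  <- 0.10              # ModelErrorVar
h2      <- rep(.36, p)       # communalities from the major factors

# Random minor loadings; column j is shrunk by (1 - eps)^(j - 1) (assumed form)
W <- matrix(rnorm(p * n_minor), p, n_minor)
W <- sweep(W, 2, (1 - eps)^(0:(n_minor - 1)), "*")

# Target minor-factor variance per item
target <- me_var * (1 - h2)        # ModelErrorType = "U": share of uniqueness
# target <- rep(me_var, p)         # ModelErrorType = "V": share of total variance

# Rescale each row so that diag(W W') equals the target
W <- W * sqrt(target / rowSums(W^2))

# Acceptance check in the spirit of WmaxLoading / NWmaxLoading
w_max_loading  <- 0.30
nw_max_loading <- 2
ok <- max(colSums(abs(W) >= w_max_loading)) <= nw_max_loading
ok
```

In this toy setting the check passes trivially because no single loading can exceed the square root of the row target; with larger model error variances a search over repeated draws (compare Wattempts) would be needed.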
Bifactor
(list)
Bifactor
(logical) If Bifactor = TRUE, parameters for the bifactor model
will be generated; defaults to Bifactor = FALSE.
Hierarchical
(logical) If Hierarchical = TRUE, then a hierarchical
Schmid-Leiman (1957) bifactor model will be generated;
defaults to Hierarchical = FALSE.
F1FactorDist
(character) Specifies the sampling distribution for the general
factor loadings. Possible values are "runif", "rnorm",
"sequential", and "fixed"; defaults to
F1FactorDist = "sequential".
F1FactorRange
(vector of length 1 or 2) Controls the sizes of the general
factor loadings in non-hierarchical bifactor models; defaults to
F1FactorRange = c(.4, .7).
If F1FactorDist = "runif", the vector of length 2 defines the
bounds of the uniform distribution, c(lower, upper).
If F1FactorDist = "rnorm", the vector defines the mean and
standard deviation of the normal distribution from which
loadings are sampled, c(MN, SD).
If F1FactorDist = "sequential", the vector specifies the lower
and upper bound of the loadings sequence, c(lower, upper).
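For the hierarchical case, the textbook Schmid-Leiman transformation maps a first-order oblique solution onto a general factor plus orthogonal group factors. The base R sketch below uses made-up first-order loadings and second-order loadings (`gamma`, a hypothetical name) and shows the standard transformation only; it makes no claim about how simFA parameterizes it internally.

```r
# First-order pattern: 9 items, 3 factors, primary loadings of .7
L1 <- kronecker(diag(3), matrix(.7, 3, 1))

# Hypothetical loadings of the 3 first-order factors on one second-order factor
gamma <- c(.8, .7, .6)

# First-order factor correlations implied by the second-order model
Phi1 <- tcrossprod(gamma)
diag(Phi1) <- 1

# Schmid-Leiman: general = L1 * gamma, group = L1 * sqrt(1 - gamma^2)
general <- L1 %*% gamma
group   <- L1 %*% diag(sqrt(1 - gamma^2))
B <- cbind(general, group)          # orthogonal bifactor loadings
colnames(B) <- c("G", "F1", "F2", "F3")

round(B, 3)
```

The transformation leaves the communalities unchanged: the row sums of squares of B equal the diagonal of L1 Phi1 L1'.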
MonteCarlo
(list)
NSamples
(integer) Defines the number of Monte Carlo samples; defaults to
NSamples = 0.
SampleSize
(integer) Sample size for each Monte Carlo sample; defaults to
SampleSize = 250.
Raw
(logical) If Raw = TRUE, simulated data sets will contain raw
data. If Raw = FALSE, simulated data sets will contain
correlation matrices; defaults to Raw = FALSE.
Thresholds
(list) List elements contain thresholds for each item.
Thresholds are required when generating Likert variables.
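The difference between Raw = TRUE and Raw = FALSE can be pictured with a base R sketch that draws multivariate normal samples from a population correlation matrix. The function `draw_sample` and the other object names are hypothetical, and the normality of the data is an assumption of this illustration.

```r
set.seed(1)

# Population correlation matrix from a 9 x 3 simple structure model
Lambda <- kronecker(diag(3), matrix(.6, 3, 1))
Rpop <- tcrossprod(Lambda)
diag(Rpop) <- 1

n_samples   <- 5      # compare NSamples
sample_size <- 250    # compare SampleSize

# Hypothetical helper: one Monte Carlo sample, raw data or correlation matrix
draw_sample <- function(raw) {
  Z <- matrix(rnorm(sample_size * ncol(Rpop)), sample_size)
  X <- Z %*% chol(Rpop)             # rows are draws with covariance Rpop
  if (raw) X else cor(X)
}

mc_cor <- replicate(n_samples, draw_sample(raw = FALSE), simplify = FALSE)
mc_raw <- replicate(n_samples, draw_sample(raw = TRUE),  simplify = FALSE)

dim(mc_cor[[1]])   # 9 x 9 correlation matrix
dim(mc_raw[[1]])   # 250 x 9 data matrix
```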
FactorScores
(list)
FS
(logical) If FS = TRUE, (true) factor scores will be simulated;
defaults to FS = FALSE.
CFSeed
(integer) Optional starting seed for the common factor scores;
defaults to CFSeed = NULL, in which case a random seed is used.
MCFSeed
(integer) Optional starting seed for the minor common factor
scores; defaults to MCFSeed = NULL.
SFSeed
(integer) Optional starting seed for the specific factor scores;
defaults to SFSeed = NULL, in which case a random seed is used.
EFSeed
(integer) Optional starting seed for the error factor scores;
defaults to EFSeed = NULL, in which case a random seed is used.
Note that CFSeed, MCFSeed, SFSeed, and EFSeed must be different
numbers (a fatal error is produced when two or more seeds are
specified as equal).
VarRel
(vector) A vector of manifest variable reliabilities. The
specific factor variance for variable i will equal
\(VarRel[i] - h^2[i]\) (the manifest variable reliability minus
its communality). By default, \(VarRel = h^2\) (resulting in
uniformly zero specific factor variances).
Population
(logical) If Population = TRUE, factor scores will fit the
correlational constraints of the factor model exactly (e.g., the
common factors will be orthogonal to the unique factors);
defaults to Population = FALSE.
NFacScores
(scalar) Sample size for the factor scores; defaults to
NFacScores = 250.
Thresholds
(list) A list of quantiles used to polychotomize the observed
data that will be generated from the factor scores.
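How factor scores, observed scores, and Thresholds fit together can be sketched in base R. The example assumes orthogonal factors, normally distributed scores, and one vector of cut points per item; the names and cut points are hypothetical, and the code does not reproduce simFA's own score generation.

```r
set.seed(1)
n <- 250                                          # compare NFacScores

# 6 items, 2 orthogonal factors, primary loadings of .7
Lambda <- kronecker(diag(2), matrix(.7, 3, 1))
u      <- sqrt(1 - rowSums(Lambda^2))             # unique factor weights

Fs <- matrix(rnorm(n * 2), n, 2)                  # common factor scores
E  <- matrix(rnorm(n * 6), n, 6)                  # unique factor scores

# Observed scores: common part plus weighted unique part
X <- Fs %*% t(Lambda) + sweep(E, 2, u, "*")

# Hypothetical thresholds: three cut points per item give four categories
thresholds <- rep(list(c(-1, 0, 1)), 6)
likert <- sapply(seq_len(6), function(j) {
  findInterval(X[, j], thresholds[[j]]) + 1L
})

table(likert[, 1])
```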
Missing
(list)
Missing
(logical) If Missing = TRUE, all data sets will contain missing
values; defaults to Missing = FALSE.
Mechanism
(character) Specifies the missing data mechanism. Currently, the
program only supports missing completely at random (MCAR):
Mechanism = "MCAR".
MSProb
(scalar or vector of length NVar) Specifies the probability of
missingness for each variable; defaults to MSProb = 0.
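Under MCAR with per-variable probabilities, each entry of a variable is deleted independently of the data. A minimal base R sketch, with hypothetical object names and not taken from simFA:

```r
set.seed(1)
X <- matrix(rnorm(250 * 4), 250, 4)

# Per-variable missingness probabilities (compare MSProb)
ms_prob <- c(0, .10, .20, .30)

# One independent Bernoulli draw per cell; column j uses ms_prob[j]
mask <- sapply(ms_prob, function(pr) runif(nrow(X)) < pr)

X_miss <- X
X_miss[mask] <- NA

# Observed proportion of missing values per variable
colMeans(is.na(X_miss))
```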
Control
(list)
IRT
(logical) If IRT = TRUE, then user-supplied thresholds will be
interpreted as item intercepts; defaults to IRT = FALSE.
Dparam
(scalar) If Dparam = 1, then item intercepts should be scaled in
the logistic metric. If Dparam = 1.702, then intercepts should
be scaled in the probit metric.
Maxh2
(scalar) Rows of the loadings matrix will be rescaled to have a
maximum communality of Maxh2; defaults to Maxh2 = .98.
Reflect
(logical) If Reflect = TRUE, loadings on the common factors will
be randomly reflected; defaults to Reflect = FALSE.
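One way to picture the Maxh2 control is a proportional row rescaling, sketched here in base R for orthogonal factors. Whether simFA rescales rows in exactly this way is an assumption; the matrix and names are hypothetical.

```r
# Hypothetical orthogonal loadings; the first row has a communality above 1
L <- rbind(c(.9, .5),
           c(.6, .2),
           c(.5, .5))
max_h2 <- .98

h2 <- rowSums(L^2)                       # communalities under orthogonality

# Shrink only the offending rows so their communality equals max_h2
scale_by   <- ifelse(h2 > max_h2, sqrt(max_h2 / h2), 1)
L_rescaled <- L * scale_by               # element [i, j] is scaled by scale_by[i]

round(cbind(before = h2, after = rowSums(L_rescaled^2)), 3)
```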
Seed
(integer) Starting seed for the random number generator;
defaults to Seed = NULL. When no seed is specified by the user,
the program will generate a random seed.
Niels G. Waller with contributions by Hoang V. Nguyen
For a complete description of simFA's capabilities, users are
encouraged to consult the simFABook
at http://users.cla.umn.edu/~nwaller/simFA/simFABook.pdf.
simFA is a program for exploring factor analysis
models via simulation studies.
After calling simFA, all relevant output can be saved
for further processing by calling one or more of the following
object names.
Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107(2), 238--246.
Hu, L.-T. & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6(1), 1--55.
Marsh, H. W., Hau, K.-T., & Grayson, D. (2005). Goodness of fit in structural equation models. In A. Maydeu-Olivares & J. J. McArdle (Eds.), Multivariate applications book series. Contemporary psychometrics: A festschrift for Roderick P. McDonald (pp. 275--340). Lawrence Erlbaum Associates Publishers.
Schmid, J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1), 53--61.
Steiger, J. H. (2016). Notes on the Steiger–Lind (1980) handout. Structural Equation Modeling: A Multidisciplinary Journal, 23(6), 777--781.
Tucker, L. R., Koopman, R. F., and Linn, R. L. (1969). Evaluation of factor analytic research procedures by means of simulated correlation matrices. Psychometrika, 34(4), 421--459.
## Not run:
# Ex 1. Three Factor Simple Structure Model (default settings:
# no cross loadings and ideal nonsalient loadings)
out <- simFA(Seed = 1)
print( round( out$loadings, 2 ) )
# Ex 2. Non Hierarchical bifactor model 3 group factors
# with constant loadings on the general factor
out <- simFA(Bifactor = list(Bifactor = TRUE,
                             Hierarchical = FALSE,
                             F1FactorRange = c(.4, .4),
                             F1FactorDist = "runif"),
             Seed = 1)
print( round( out$loadings, 2 ) )
# Ex 3. Model Fit Statistics for Population Data with
# Model Approximation Error. Three Factor model.
out <- simFA(Loadings = list(FacLoadDist = "fixed",
                             FacLoadRange = .5),
             ModelError = list(ModelError = TRUE,
                               NMinorFac = 150,
                               ModelErrorType = "V",
                               ModelErrorVar = .1,
                               Wattempts = 10000,
                               epsTKL = .2),
             Seed = 1)
print( out$loadings )
print( out$ModelErrorFitStats[seq(2,8,2)] )
## End(Not run)