- nRep
Number of replications. If any of the n
, pmMCAR
, or pmMAR
arguments are specified as lists, the number of replications will default to the length of the list(s), and nRep
need not be specified (can be set NULL
). When completeRep > 0
(see description below), then nRep
is the target number of replications for which no convergence issues are detected.
- model
There are four options for this argument: 1. SimSem
object created by model
, 2. lavaan
script, lavaan
parameter table, fitted lavaan
object matching the analysis model, or a list that contains all arguments that users would pass to lavaan
(including cfa
, sem
, lavaan
), 3. MxModel
object from the OpenMx
package, or 4. a function that takes a data set and returns a list of coef
, se
, and converged
(see details below). For the SimSem
object, if the generate
argument is not specified, then the object in the model
argument will be used for both data generation and analysis. If generate
is specified, then the model
argument will be used for data analysis only.
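As an illustration of the fourth option for model, the following is a minimal sketch of an analysis function, not the package's own code. It assumes that coef and se should be named numeric vectors and that converged should be a logical; the name analyzeFun and the variables x and y are hypothetical.

```r
# Hypothetical analysis function: takes a data set and returns a list
# with the elements coef, se, and converged.
analyzeFun <- function(data) {
  fit <- lm(y ~ x, data = data)
  est <- summary(fit)$coefficients
  list(
    coef = est[, "Estimate"],    # named vector of parameter estimates
    se = est[, "Std. Error"],    # named vector of standard errors
    converged = TRUE             # lm() has no iterative convergence step
  )
}

# Try the function on one small artificial data set.
set.seed(1)
dat <- data.frame(x = rnorm(50))
dat$y <- 0.5 * dat$x + rnorm(50)
out <- analyzeFun(dat)
```

A function of this shape could then be supplied as the model argument, with the data coming from generate or rawData.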
- n
Sample size(s). In single-group models, either a single integer
, or a vector of integers to vary sample size across replications. In multigroup models, either a list
of single integers (for constant group sizes across replications) or a list
of vectors (to vary group sizes across replications).
Any non-integers will be rounded.
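The four shapes that n can take may be easier to see written out. This sketch only builds the objects and assumes that each list element corresponds to one group; all object names are hypothetical.

```r
# Single-group model, same sample size in every replication
nSingle <- 200

# Single-group model, sample size varying across replications
nVarying <- seq(50, 500, by = 50)

# Two-group model, constant group sizes across replications
nGroupsFixed <- list(100, 150)

# Two-group model, group sizes varying across replications
nGroupsVarying <- list(seq(50, 500, by = 50), seq(100, 550, by = 50))
```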
- generate
There are four options for this argument: 1. SimSem
object created by model
, 2. lavaan
script, lavaan
parameter table (for data generation; see simulateData
), fitted lavaan
object that estimated all nonzero population parameters, or a list that contains all arguments that users would pass to simulateData
, 3. MxModel
object with population parameters specified in the starting values of all matrices in the model, or 4. a function that takes a single sample-size argument (an integer for a single-group model or a vector of integers for a multiple-group model). The generate
argument cannot be specified at the same time as the rawData
argument.
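For the function option of generate, a single-group data-generating function might look like the sketch below. It assumes the function should return a data frame with n rows; the name generateFun and the population model (y regressed on x with slope 0.5) are hypothetical.

```r
# Hypothetical data-generating function with one sample-size argument.
generateFun <- function(n) {
  x <- rnorm(n)
  y <- 0.5 * x + rnorm(n)   # assumed population slope of 0.5
  data.frame(x = x, y = y)
}

# Draw one data set of size 30 to check the shape of the result.
set.seed(123)
simData <- generateFun(30)
```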
- rawData
There are two options for this argument: 1. a list of data frames to be used in simulations or 2. a population data frame. If a list of data frames is specified, the nRep
and n
arguments must not be specified. If a population data frame is specified, the nRep
and n
arguments are required.
- miss
A missing data template created using the miss
function.
- datafun
A function to be applied to each generated data set across replications.
- lavaanfun
A character string naming the function used to run the lavaan model ("cfa"
, "sem"
, "growth"
, "lavaan"
). This argument is required only when a lavaan script or a list of arguments is specified in the model
argument.
- outfun
A function to be applied to the lavaan-class
output at each replication. Output from this function in each replication will be saved in the simulation output (SimResult
), and can be obtained using the getExtraOutput
function.
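A function for outfun takes the fitted lavaan object as its argument. The sketch below is one possible extractor that pulls R-squared values with lavaan's lavInspect; the name extractRsquare is hypothetical, and the function is only defined here, not run.

```r
# Hypothetical extractor; `out` is assumed to be a fitted lavaan object.
extractRsquare <- function(out) {
  lavaan::lavInspect(out, "rsquare")
}
```

Supplying outfun = extractRsquare would save the returned values for each replication, and getExtraOutput would retrieve them from the SimResult object.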
- outfundata
A function to be applied to the lavaan-class
output and the generated data at each replication. Users can get the characteristics of the generated data and also compare the characteristics with the generated output. The output from this function in each replication will be saved in the simulation output (SimResult
), and can be obtained using the getExtraOutput
function.
- pmMCAR
The proportion of data missing completely at random (0 <= pmMCAR < 1). Either a single value or a vector of values in order to vary pmMCAR across replications (with length equal to nRep or a divisor of nRep). The miss=
argument is only required when specifying more complex missing value data generation.
- pmMAR
The proportion of data missing at random (0 <= pmMAR < 1). Either a single value or a vector of values in order to vary pmMAR across replications (with length equal to nRep or a divisor of nRep). The miss=
argument is only required when specifying more complex missing value data generation.
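The range and length requirements for a vector of missingness values, which apply to both pmMCAR and pmMAR, can be checked before running a simulation. This is only a sketch of those two checks; the object names are hypothetical, and how the package maps the values onto replications is not shown.

```r
nRep <- 100
pmValues <- c(0, 0.1, 0.2, 0.3)

# Each value must satisfy 0 <= pm < 1.
validRange <- all(pmValues >= 0 & pmValues < 1)

# The vector length must equal nRep or be a divisor of nRep.
validLength <- nRep %% length(pmValues) == 0
```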
- facDist
Factor distributions. Either a list of SimDataDist
objects or a single SimDataDist
object to give all factors the same distribution. Use when sequential
is TRUE
.
- indDist
Indicator distributions. Either a list of SimDataDist
objects or a single SimDataDist
object to give all indicators the same distribution. Use when sequential
is FALSE
.
- errorDist
An object or list of objects of type SimDataDist
indicating the distribution of errors. If a single SimDataDist
is specified, each error will be generated with that distribution.
- sequential
If TRUE
, a sequential method is used to generate data in which factor data is generated first, and is subsequently applied to a set of equations to obtain the indicator data. If FALSE
, data is generated directly from model-implied mean and covariance of the indicators.
- saveLatentVar
If TRUE
, the generated latent variable scores and measurement error scores are also provided as attributes of the generated data. Users can use the outfundata
argument to compare the latent variable scores with the estimated output. The sequential
argument must be TRUE
in order to use this option.
- modelBoot
When specified, a model-based bootstrap is used for data generation (for use with the realData
argument). See draw
for further information.
- realData
A data.frame containing real data. Generated data will follow the distribution of this data set.
- covData
A data.frame containing covariate data, which can have any distribution. This argument is required when users specify GA
or KA
matrices in the model template (SimSem
).
- maxDraw
The maximum number of attempts to draw a valid set of parameters (i.e., one with no negative error variances and no standardized coefficients over 1).
- misfitType
Character vector indicating the fit measure used to assess the misfit of a set of parameters. Can be "f0", "rmsea", "srmr", or "all".
- misfitBounds
Vector that contains upper and lower bounds of the misfit measure. Sets of parameters drawn that are not within these bounds are rejected.
- averageNumMisspec
If TRUE
, the provided fit will be divided by the number of misspecified parameters.
- optMisfit
Either "min" or "max", indicating whether the parameter set with the minimum or maximum misfit is selected. If not NULL, the set of parameters, out of the number of draws given in optDraws, that has the minimum or maximum misfit of the given misfit type will be returned.
- optDraws
Number of parameter sets to draw if optMisfit is not null. The set of parameters with the maximum or minimum misfit will be returned.
- createOrder
The order of 1) applying equality/inequality constraints, 2) applying misspecification, and 3) filling unspecified parameters (e.g., residual variances when total variances are specified). Specify this argument as a vector giving an ordering of 1 (constraint), 2 (misspecification), and 3 (filling parameters). For example, c(1, 2, 3)
applies constraints first, then adds the misspecification, and finally fills all parameters. See the example of how to use it in the draw
function.
- aux
The names of auxiliary variables saved in a vector.
- group
The name of the group variable. This argument is used when lavaan
script or MxModel
is used in the model
argument only. When generating data from a multigroup population model, the grouping variable in each generated data set will be named "group", so when additionally using a multigroup analysis model, users must specify this argument as group="group"
.
- mxFit
A logical indicating whether to compute an extensive list of fit measures (which will be slower). This argument is applicable when MxModel
is used in the model
argument only.
- mxMixture
A logical indicating whether the analysis model is a mixture model. This argument is applicable when MxModel
is used in the model
argument only.
- citype
Type of confidence interval. For the current version, this argument will be forwarded to the "boot.ci.type"
argument in the parameterEstimates
function from the lavaan
package. This argument is not active when the OpenMx
package is used.
- cilevel
Confidence level. For the current version, this argument will be forwarded to the "level"
argument in the parameterEstimates
function from the lavaan
package. This argument is not active when the OpenMx
package is used.
- seed
Random number seed. Note that the seed number is always fixed in simsem
so that users can always replicate the same simulation and can be confident that the same data sets are generated. Reproducibility across multiple cores or clusters is ensured using the L'Ecuyer random number generator.
- silent
If TRUE
, suppress warnings.
- multicore
Users may put TRUE
or FALSE
. If TRUE
, multiple processors within a computer will be utilized. The default value is FALSE
. Users may permanently change the default value by assigning the following line: options('simsem.multicore' = TRUE)
- numProc
Number of processors to use when running on multiple processors. If it is NULL
, the package will find the maximum number of processors.
- paramOnly
If TRUE
, only the parameters from each replication will be returned.
- dataOnly
If TRUE
, only the raw data generated from each replication will be returned.
- smartStart
Defaults to FALSE. If TRUE, population parameter values that are real numbers will be used as starting values. When tested in small models, setting up population values as starting values took longer than the time it saved during analysis, and convergence rates were not affected.
- previousSim
A result object to which the results of the current simulation will be added.
- completeRep
Nonnegative integer
indicating how many samples are allowed to be drawn in order to obtain at least nRep
results without convergence issues (including Heywood cases or no standard errors). Ignored unless completeRep > nRep
. Can also be logical
, where FALSE
(or 0
, the default) indicates only nRep
samples may be drawn. If TRUE
, up to 10% additional samples may be drawn by default (i.e., completeRep = as.integer(nRep*1.1)
).
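The default cap implied by completeRep = TRUE follows directly from the expression above. A worked example with a hypothetical nRep of 200:

```r
nRep <- 200

# Maximum number of samples that may be drawn when completeRep = TRUE
maxSamples <- as.integer(nRep * 1.1)

# Additional samples allowed beyond nRep
extraSamples <- maxSamples - nRep
```

Here up to 220 samples may be drawn, which is 20 more than nRep.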
- stopOnError
If TRUE
, stop running the simulation when an error occurs during data analysis in any replication.
- ...
Additional arguments to be passed to lavaan
. See also lavOptions