This function creates data from a set of parameters drawn by draw, called a paramSet. It is used internally to create data and is exported for accessibility and debugging.
createData(paramSet, n, indDist=NULL, sequential=FALSE, facDist=NULL,
errorDist=NULL, saveLatentVar = FALSE, indLab=NULL, facLab = NULL,
modelBoot=FALSE, realData=NULL, covData=NULL, empirical = FALSE)
paramSet: Set of drawn parameters from draw.
n: Integer of desired sample size.
sequential: If TRUE, use a sequential method to create data: factor scores are generated first and then passed through the set of model equations to obtain the indicator data. If FALSE, create data directly from the model-implied mean and covariance of the indicators.
errorDist: An object or list of objects of type SimDataDist indicating the distribution of errors. If a single SimDataDist is specified, each error will be generated with that distribution.
saveLatentVar: If TRUE, the total latent variable scores, residual latent variable scores, and measurement error scores are also provided as the "latentVar" attribute of the generated data, which can be retrieved with attr(generatedData, "latentVar"). The sequential argument must be TRUE in order to use this option.
indLab: A vector of indicator labels. When not specified, the variable names are x1, x2, ... xN.
facLab: A vector of factor labels. When not specified, the variable names are f1, f2, ... fN.
modelBoot: When specified, a model-based bootstrap is used for data generation. See details for further information. This argument requires real data to be passed to realData.
realData: A data.frame containing real data. The data generated will follow the distribution of this data set.
empirical: Logical. If TRUE, the specified parameters are treated as sample statistics and data are created so that they reproduce those sample statistics. This argument applies only when a multivariate normal distribution is specified.
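The two routes selected by the sequential argument can be sketched in base R plus MASS. This is only an illustration of the idea, not the code createData runs internally. The names LY, PS, and TE and all numeric values below are assumptions chosen for the example.

```r
library(MASS)  # mvrnorm() ships with the recommended MASS package

set.seed(123)
n <- 500

# Hypothetical two-factor CFA population values (assumed for illustration).
LY <- matrix(0, 6, 2)
LY[1:3, 1] <- 0.7
LY[4:6, 2] <- 0.7
PS <- matrix(c(1, 0.5, 0.5, 1), 2, 2)  # factor covariance matrix
TE <- diag(1 - 0.7^2, 6)               # measurement error covariance matrix

# Sequential route: factor scores first, then the measurement equations.
eta <- mvrnorm(n, mu = rep(0, 2), Sigma = PS)
eps <- mvrnorm(n, mu = rep(0, 6), Sigma = TE)
y_seq <- eta %*% t(LY) + eps

# Direct route: draw indicators from the model-implied covariance matrix.
Sigma0 <- LY %*% PS %*% t(LY) + TE
y_dir <- mvrnorm(n, mu = rep(0, 6), Sigma = Sigma0)

colnames(y_seq) <- colnames(y_dir) <- paste0("x", 1:6)
```

Both y_seq and y_dir target the same population covariance matrix Sigma0. Only the sequential route produces factor and error scores (eta and eps here) as by-products, which is presumably why saving latent variable scores requires it.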
A data.frame containing simulated data from the data generation template. A variable "group" is appended indicating group membership.
If no data distribution object (SimDataDist) is specified, this function uses a modified version of the mvrnorm function (from the MASS package), written by Paul E. Johnson, to create data from the model-implied covariance matrix. The modification is small: data generated with sample sizes of n and n + k (where k > 0) are identical in the first n rows.
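One way to obtain this kind of replicability is to fill the matrix of standard normal draws row by row, so that a longer sample extends a shorter one. The sketch below is not Johnson's code. rmvnorm_rows is a hypothetical helper, and it uses a Cholesky factor for brevity where mvrnorm uses an eigendecomposition.

```r
# Hypothetical helper: multivariate normal draws whose first n rows do not
# depend on how many further rows are requested (given the same seed).
rmvnorm_rows <- function(n, mu, Sigma) {
  p <- length(mu)
  # byrow = TRUE: row i always consumes random numbers (i - 1) * p + 1, ..., i * p.
  z <- matrix(rnorm(n * p), nrow = n, ncol = p, byrow = TRUE)
  # chol() returns upper-triangular R with t(R) %*% R == Sigma.
  sweep(z %*% chol(Sigma), 2, mu, "+")
}

Sigma <- matrix(c(1, 0.5, 0.5, 1), 2, 2)

set.seed(42)
small <- rmvnorm_rows(5, mu = c(0, 0), Sigma = Sigma)
set.seed(42)
large <- rmvnorm_rows(8, mu = c(0, 0), Sigma = Sigma)

# The first 5 rows of the larger sample match the smaller sample.
all.equal(small, large[1:5, ])
```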
If the data distribution object is specified, either the copula model or Vale and Maurelli's method is used. For the copula approach, if the copula argument is not specified in the data distribution object, the naive Gaussian copula is used. The correlation matrix is applied directly to the multivariate Gaussian copula. The correlation matrix will be equivalent to the Spearman's correlation (rank correlation) of the resulting data. If the copula argument is specified, such as ellipCopula, normalCopula, or archmCopula, the data-transformation method from Mair, Satorra, and Bentler (2012) is used. In brief, the data (\(X\)) are created from the multivariate copula. The covariance matrix of the generated data (\(S\)) is used as the starting point. Then, the target data (\(Y\)), whose covariance matrix equals the model-implied covariance matrix (\(\Sigma_0\)), can be created:
$$ Y = XS^{-1/2}\Sigma^{1/2}_0. $$
See bindDist for further details. For Vale and Maurelli's (1983) method, the code is taken from the lavaan package.
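The transformation \(Y = XS^{-1/2}\Sigma^{1/2}_0\) can be sketched in base R plus MASS. This is a minimal illustration, not the package's implementation. mat_pow is a hypothetical helper, the chi-square margins and the values in Sigma0 are assumptions, and the copula draw is done by hand rather than with the copula package.

```r
library(MASS)

set.seed(2024)
n <- 1000

# Hypothetical helper: symmetric matrix power via the eigendecomposition.
mat_pow <- function(A, power) {
  e <- eigen(A, symmetric = TRUE)
  e$vectors %*% diag(e$values^power, nrow = length(e$values)) %*% t(e$vectors)
}

# Target (model-implied) covariance matrix; values are illustrative.
Sigma0 <- matrix(c(1.0, 0.4, 0.3,
                   0.4, 1.0, 0.5,
                   0.3, 0.5, 1.0), 3, 3)

# Step 1: X from a Gaussian copula with skewed (chi-square, df = 3) margins.
z <- mvrnorm(n, mu = rep(0, 3), Sigma = cov2cor(Sigma0))
X <- qchisq(pnorm(z), df = 3)

# Step 2: S is the covariance matrix of the generated data.
S <- cov(X)

# Step 3: Y = X S^{-1/2} Sigma0^{1/2}
Y <- X %*% mat_pow(S, -0.5) %*% mat_pow(Sigma0, 0.5)

# The sample covariance of Y now equals Sigma0 up to rounding error.
all.equal(cov(Y), Sigma0, check.attributes = FALSE)
```

Because the transformation is linear, each column of Y is a mixture of the columns of X, so the margins of Y are no longer exactly the chi-square margins that X started with.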
For the model-based bootstrap, the transformation proposed by Yung & Bentler (1996) is used. This procedure extends the Bollen and Stine (1992) bootstrap to include a mean structure. If misspec is specified, the model-implied mean vector and covariance matrix with trivial misspecification are used in the model-based bootstrap. See page 133 of Bollen and Stine (1992) for a reference.
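As a rough sketch of the idea behind this bootstrap, the real data can be centered, rescaled so that their sample covariance equals the model-implied covariance matrix, and shifted to the model-implied mean, after which rows are resampled with replacement. The code below is one reading of that transformation under stated assumptions, not the code used by the package. realData here is simulated stand-in data, and mu0, Sigma0, and mat_pow are hypothetical.

```r
set.seed(99)

# Hypothetical helper: symmetric matrix power via the eigendecomposition.
mat_pow <- function(A, power) {
  e <- eigen(A, symmetric = TRUE)
  e$vectors %*% diag(e$values^power, nrow = length(e$values)) %*% t(e$vectors)
}

# Stand-in for a real data set (skewed on purpose).
realData <- matrix(rexp(200 * 3), nrow = 200, ncol = 3)

# Hypothetical model-implied mean vector and covariance matrix.
mu0 <- c(1, 2, 3)
Sigma0 <- matrix(c(1.0, 0.4, 0.3,
                   0.4, 1.0, 0.5,
                   0.3, 0.5, 1.0), 3, 3)

# Center the data, swap the sample covariance S for Sigma0, then add mu0.
S <- cov(realData)
centered <- scale(realData, center = TRUE, scale = FALSE)
Z <- sweep(centered %*% mat_pow(S, -0.5) %*% mat_pow(Sigma0, 0.5), 2, mu0, "+")

# One bootstrap sample: rows of the transformed data drawn with replacement.
bootSample <- Z[sample(nrow(Z), replace = TRUE), ]
```

After the transformation, the sample mean and covariance of Z match mu0 and Sigma0, so the model holds exactly in the population being resampled while the shape of the real data's distribution is carried over.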
Internally, parameters are first drawn, and data are then created from these parameters. These two steps are available via the draw and createData functions, respectively.
Bollen, K. A., & Stine, R. A. (1992). Bootstrapping goodness-of-fit measures in structural equation models. Sociological Methods and Research, 21, 205-229.
Mair, P., Satorra, A., & Bentler, P. M. (2012). Generating nonnormal multivariate data using copulas: Applications to SEM. Multivariate Behavioral Research, 47, 547-565.
Vale, C. D., & Maurelli, V. A. (1983). Simulating multivariate nonnormal distributions. Psychometrika, 48, 465-471.
Yung, Y.-F., & Bentler, P. M. (1996). Bootstrapping techniques in analysis of mean and covariance structures. In G. A. Marcoulides & R. E. Schumacker (Eds.), Advanced structural equation modeling: Issues and techniques (pp. 195-226). Mahwah, NJ: Erlbaum.
# NOT RUN {
loading <- matrix(0, 6, 2)
loading[1:3, 1] <- NA
loading[4:6, 2] <- NA
LY <- bind(loading, 0.7)
latent.cor <- matrix(NA, 2, 2)
diag(latent.cor) <- 1
RPS <- binds(latent.cor, 0.5)
RTE <- binds(diag(6))
VY <- bind(rep(NA,6),2)
CFA.Model <- model(LY = LY, RPS = RPS, RTE = RTE, modelType = "CFA")
# Draw a parameter set for data generation.
param <- draw(CFA.Model)
# Generate data from the first group in the paramList.
dat <- createData(param[[1]], n = 200)
# }