permuteMeasEq: Permutation Randomization Tests of Measurement Equivalence and Differential Item Functioning (DIF)

Description

The function permuteMeasEq provides tests of hypotheses involving measurement equivalence, in one of two frameworks: multigroup CFA or MIMIC models.

Usage

permuteMeasEq(nPermute, modelType = c("mgcfa", "mimic"), con, uncon = NULL,
  null = NULL, param = NULL, freeParam = NULL, covariates = NULL,
  AFIs = NULL, moreAFIs = NULL, maxSparse = 10, maxNonconv = 10,
  showProgress = TRUE, warn = -1, datafun, extra,
  parallelType = c("none", "multicore", "snow"), ncpus = NULL, cl = NULL,
  iseed = 12345)

Value

The permuteMeasEq object representing the results of testing measurement equivalence (the multiparameter omnibus test) and DIF (modification indices), as well as diagnostics and any extra output.

Arguments

nPermute: An integer indicating the number of random permutations used to form empirical distributions under the null hypothesis.
modelType: A character string indicating type of model employed: multiple-group CFA ("mgcfa") or MIMIC ("mimic").
con: The constrained lavaan object, in which the parameters specified in param are constrained to equality across all groups when modelType = "mgcfa", or in which regression paths are fixed to zero when modelType = "mimic". In the case of testing configural invariance when modelType = "mgcfa", con is the configural model (implicitly, the unconstrained model is the saturated model, so use the defaults uncon = NULL and param = NULL). When modelType = "mimic", con is the MIMIC model in which the covariate predicts the latent construct(s) but no indicators (unless they have already been identified as DIF items).
uncon: Optional. The unconstrained lavaan object, in which the parameters specified in param are freely estimated in all groups. When modelType = "mgcfa", only in the case of testing configural invariance should uncon = NULL. When modelType = "mimic", any non-NULL uncon is silently set to NULL.
null: Optional. A lavaan object, in which an alternative null model is fit (besides the default independence model specified by lavaan) for the calculation of incremental fit indices. See Widaman & Thompson (2003) for details. If NULL, lavaan's default independence model is used.
param: An optional character vector or list of character vectors indicating which parameters the user would test for DIF following a rejection of the omnibus null hypothesis tested using (more)AFIs. Note that param does not guarantee certain parameters are constrained in con; that is for the user to specify when fitting the model. If users have any "anchor items" that they would never intend to free across groups (or levels of a covariate), these should be excluded from param; exceptions to a type of parameter can be specified in freeParam. When modelType = "mgcfa", param indicates which parameters of interest are constrained across groups in con and are unconstrained in uncon. Parameter names must match those returned by names(coef(con)), but omitting any group-specific suffixes (e.g., "f1~1" rather than "f1~1.g2") or user-specified labels (that is, the parameter names must follow the rules of lavaan's model.syntax). Alternatively (or additionally), to test all constraints of a certain type (or multiple types) of parameter in con, param may take any combination of the following values: "loadings", "intercepts", "thresholds", "residuals", "residual.covariances", "means", "lv.variances", and/or "lv.covariances". When modelType = "mimic", param must be a vector of individual parameters or a list of character strings to be passed one-at-a-time to lavTestScore(object = con, add = param[i]), indicating which (sets of) regression paths fixed to zero in con that the user would consider freeing (i.e., exclude anchor items). If modelType = "mimic" and param is a list of character strings, the multivariate test statistic will be saved for each list element instead of 1-df modification indices for each individual parameter, and names(param) will name the rows of the MI.obs slot (see permuteMeasEq). Set param = NULL (default) to avoid collecting modification indices for any follow-up tests.
freeParam: An optional character vector, silently ignored when modelType = "mimic". If param includes a type of parameter (e.g., "loadings"), freeParam indicates exceptions (i.e., anchor items) that the user would not intend to free across groups and should therefore be ignored when calculating p values adjusted for the number of follow-up tests. Parameter types that are already unconstrained across groups in the fitted con model (i.e., a partial invariance model) will automatically be ignored, so they do not need to be specified in freeParam. Parameter names must match those returned by names(coef(con)), but omitting any group-specific suffixes (e.g., "f1~1" rather than "f1~1.g2") or user-specified labels (that is, the parameter names must follow the rules of lavaan model.syntax).
covariates: An optional character vector, only applicable when modelType = "mimic". The observed data are partitioned into columns indicated by covariates, and the rows are permuted simultaneously for the entire set before being merged with the remaining data. Thus, the covariance structure is preserved among the covariates, which is necessary when (e.g.) multiple dummy codes are used to represent a discrete covariate or when covariates interact. If covariates = NULL when modelType = "mimic", the value of covariates is inferred by searching param for predictors (i.e., variables appearing after the "~" operator).
AFIs: A character vector indicating which alternative fit indices (or chi-squared itself) are to be used to test the multiparameter omnibus null hypothesis that the constraints specified in con hold in the population. Any fit measures returned by fitMeasures may be specified (including constants like "df", which would be nonsensical). If both AFIs and moreAFIs are NULL, only "chisq" will be returned.
moreAFIs: Optional. A character vector indicating which (if any) alternative fit indices returned by moreFitIndices are to be used to test the multiparameter omnibus null hypothesis that the constraints specified in con hold in the population.
maxSparse: Only applicable when modelType = "mgcfa" and at least one indicator is ordered. An integer indicating the maximum number of consecutive times that randomly permuted group assignment can yield a sample in which at least one category (of an ordered indicator) is unobserved in at least one group, such that the same set of parameters cannot be estimated in each group. If such a sample occurs, group assignment is randomly permuted again, repeatedly until a sample is obtained with all categories observed in all groups. If maxSparse is exceeded, NA will be returned for that iteration of the permutation distribution.
maxNonconv: An integer indicating the maximum number of consecutive times that a random permutation can yield a sample for which the model does not converge on a solution. If such a sample occurs, permutation is attempted repeatedly until a sample is obtained for which the model does converge. If maxNonconv is exceeded, NA will be returned for that iteration of the permutation distribution, and a warning will be printed when using show or summary.
showProgress: Logical indicating whether to display a progress bar while permuting. Silently set to FALSE when using parallel options.
warn: Sets the handling of warning messages when fitting model(s) to permuted data sets. See options.
datafun: An optional function that can be applied to the data (extracted from con) after each permutation, but before fitting the model(s) to each permutation. The datafun function must have an argument named data that accepts a data.frame, and it must return a data.frame containing the same column names. The column order may differ, the values of those columns may differ (so be careful!), and any additional columns will be ignored when fitting the model, but an error will result if any column names required by the model syntax do not appear in the transformed data set. Although available for any modelType, datafun may be useful when using the MIMIC method to test for nonuniform DIF (metric/weak invariance) by using product indicators for a latent factor representing the interaction between a factor and one of the covariates, in which case the product indicators would need to be recalculated after each permutation of the covariates. To access other R objects used within permuteMeasEq, the arguments to datafun may also contain any subset of the following: "con", "uncon", "null", "param", "freeParam", "covariates", "AFIs", "moreAFIs", "maxSparse", "maxNonconv", and/or "iseed". The values for those arguments will be the same as the values supplied to permuteMeasEq.
extra: An optional function that can be applied to any (or all) of the fitted lavaan objects (con, uncon, and/or null). This function will also be applied after fitting the model(s) to each permuted data set. To access the R objects used within permuteMeasEq, the arguments to extra must be any subset of the following: "con", "uncon", "null", "param", "freeParam", "covariates", "AFIs", "moreAFIs", "maxSparse", "maxNonconv", and/or "iseed". The values for those arguments will be the same as the values supplied to permuteMeasEq. The extra function must return a named numeric vector or a named list of scalars (i.e., a list of numeric vectors of length == 1). Any unnamed elements (e.g., "" or NULL) of the returned object will result in an error.
parallelType: The type of parallel operation to be used (if any). The default is "none". Forking is not possible on Windows, so if "multicore" is requested on a Windows machine, the request will be changed to "snow" with a message.
ncpus: Integer: number of processes to be used in parallel operation. If NULL (the default) and parallelType %in% c("multicore","snow"), the default is one less than the maximum number of processors detected by detectCores. This default is also silently set if the user specifies more than the number of processors detected.
cl: An optional parallel or snow cluster for use when parallelType = "snow". If NULL, a "PSOCK" cluster on the local machine is created for the duration of the permuteMeasEq call. If a valid makeCluster object is supplied, parallelType is silently set to "snow", and ncpus is silently set to length(cl).
iseed: Integer: Only used to set the states of the RNG when using parallel options, in which case RNGkind is set to "L'Ecuyer-CMRG" with a message. See clusterSetRNGStream and Section 6 of vignette("parallel", "parallel") for more details. If the user supplies an invalid value, iseed is silently set to the default (12345). To set the state of the RNG when not using parallel options, call set.seed before calling permuteMeasEq.
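
As a rough illustration of the seeding described for iseed (a sketch using base R and the parallel package, not the internals of permuteMeasEq), the "L'Ecuyer-CMRG" generator lets a single integer seed define a reproducible sequence of independent random-number streams, one per worker. The value 12345 mirrors the iseed default; the object names are illustrative only.

```r
# Switch to the generator designed for parallel streams; remember the old kind
old_kind <- RNGkind("L'Ecuyer-CMRG")
set.seed(12345)                       # plays the role of iseed

# The first worker would start from the current state; each further worker
# would start from the next stream derived from the previous one
stream1 <- .Random.seed
stream2 <- parallel::nextRNGStream(stream1)

# Re-seeding with the same integer reproduces the same starting stream
set.seed(12345)
same_again <- identical(.Random.seed, stream1)

RNGkind(old_kind[1])                  # restore the previous generator
```

Because every stream is derived from the one integer seed, rerunning with the same seed and the same number of workers should give the same permutations.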

Author

Terrence D. Jorgensen (University of Amsterdam; TJorgensen314@gmail.com)

Details

The function permuteMeasEq provides tests of hypotheses involving measurement equivalence, in one of two frameworks:

1. For multiple-group CFA models, provide a pair of nested lavaan objects, the less constrained of which (uncon) freely estimates a set of measurement parameters (e.g., factor loadings, intercepts, or thresholds; specified in param) in all groups, and the more constrained of which (con) constrains those measurement parameters to equality across groups. Group assignment is repeatedly permuted and the models are fit to each permutation, in order to produce an empirical distribution under the null hypothesis of no group differences, both for (a) changes in user-specified fit measures (see AFIs and moreAFIs) and for (b) the maximum modification index among the user-specified equality constraints. Configural invariance can also be tested by providing that fitted lavaan object to con and leaving uncon = NULL, in which case param must be NULL as well.
2. In MIMIC models, one or a set of continuous and/or discrete covariates can be permuted, and a constrained model is fit to each permutation in order to provide a distribution of any fit measures (namely, the maximum modification index among fixed parameters in param) under the null hypothesis of measurement equivalence across levels of those covariates.
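
The joint permutation of a covariate set in the MIMIC framework (see the covariates argument) can be sketched in base R as follows. This is a minimal illustration under assumed data, not the package's implementation: the data frame, the dummy codes d1 and d2, and all object names are hypothetical. The point is that the covariate columns are shuffled as one block of rows, so relationships among the covariates survive while their link to the indicators is broken.

```r
set.seed(1)

# Hypothetical data: two indicators plus two dummy codes that together
# represent one three-level covariate (at most one dummy equals 1 per row)
n <- 12
level <- sample(c("a", "b", "c"), n, replace = TRUE)
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n),
                  d1 = as.integer(level == "b"),
                  d2 = as.integer(level == "c"))
covariates <- c("d1", "d2")

# Draw ONE row order and apply it to the whole covariate block, then merge
# the shuffled block back with the untouched indicator columns
shuffled <- sample(nrow(dat))
perm <- dat
perm[covariates] <- dat[shuffled, covariates, drop = FALSE]
```

Shuffling d1 and d2 independently could produce rows in which both dummies equal 1, which is why the set is permuted together.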

In either framework, modification indices for equality constraints or fixed parameters specified in param are calculated from the constrained model (con) using the function lavTestScore.

For multiple-group CFA models, the multiparameter omnibus null hypothesis of measurement equivalence/invariance is that there are no group differences in any measurement parameters (of a particular type). This can be tested using the anova method on nested lavaan objects, as seen in the output of measurementInvariance, or by inspecting the change in alternative fit indices (AFIs) such as the CFI. The permutation randomization method employed by permuteMeasEq generates an empirical distribution of any AFIs under the null hypothesis, so the user is not restricted to using fixed cutoffs proposed by Cheung & Rensvold (2002), Chen (2007), or Meade, Johnson, & Braddy (2008).

If the multiparameter omnibus null hypothesis is rejected, partial invariance can still be established by freeing invalid equality constraints, as long as equality constraints are valid for at least two indicators per factor. Modification indices can be calculated from the constrained model (con), but multiple testing leads to inflation of Type I error rates. The permutation randomization method employed by permuteMeasEq creates a distribution of the maximum modification index if the null hypothesis is true, which allows the user to control the familywise Type I error rate in a manner similar to Tukey's q (studentized range) distribution for the Honestly Significant Difference (HSD) post hoc test.
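
The logic of comparing each follow-up statistic to the permutation distribution of the maximum can be sketched without fitting any SEM. The example below is only an analogy under stated assumptions: it uses squared two-sample t statistics on simulated items as a stand-in for 1-df modification indices, and all data and object names are hypothetical.

```r
set.seed(2024)

# Hypothetical two-group data with four items; only x1 truly differs
n <- 100
group <- rep(c("g1", "g2"), each = n / 2)
items <- matrix(rnorm(n * 4), ncol = 4,
                dimnames = list(NULL, paste0("x", 1:4)))
items[group == "g2", "x1"] <- items[group == "g2", "x1"] + 1

# One statistic per item: squared t statistic for the group difference
item_stats <- function(y, g) {
  apply(y, 2, function(col) unname(t.test(col ~ g)$statistic)^2)
}

obs <- item_stats(items, group)

# Permute group assignment and keep only the LARGEST statistic each time
max_perm <- replicate(200, max(item_stats(items, sample(group))))

# Familywise-adjusted p value per item: the proportion of permuted maxima
# that are at least as large as that item's observed statistic
p_adj <- vapply(obs, function(s) mean(max_perm >= s), numeric(1))
```

Because every observed statistic is referred to the same distribution of maxima, the chance of at least one false rejection across all items is held near the nominal alpha, regardless of how many items are tested.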

For MIMIC models, DIF can be tested by comparing modification indices of regression paths to the permutation distribution of the maximum modification index, which controls the familywise Type I error rate. The MIMIC approach could also be applied with multiple-group models, but the grouping variable would not be permuted; rather, the covariates would be permuted separately within each group to preserve between-group differences. So whether parameters are constrained or unconstrained across groups, the MIMIC approach is only for testing null hypotheses about the effects of covariates on indicators, controlling for common factors.
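
Permuting a covariate separately within each group, as described for the MIMIC approach applied to multiple-group models, might look like the following base R sketch. The data frame and variable names are hypothetical, and this is not presented as the package's own code.

```r
set.seed(7)

# Hypothetical data: two groups with clearly different age ranges
dat <- data.frame(group = rep(c("g1", "g2"), times = c(5, 7)),
                  age   = c(11:15, 21:27),
                  x1    = rnorm(12))

# Shuffle row indices inside each group only; indexing by sample.int()
# avoids the sample(<single number>) pitfall for one-row groups
idx <- ave(seq_len(nrow(dat)), dat$group,
           FUN = function(i) i[sample.int(length(i))])

perm <- dat
perm$age <- dat$age[idx]   # group membership and indicators stay in place
```

Each group keeps its own set of age values, so between-group differences in the covariate are preserved while its association with the indicators within groups is broken.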

In either framework, lavaan's group.label argument is used to preserve the order of groups seen in con when permuting the data.

References

Papers about permutation tests of measurement equivalence:

Jorgensen, T. D., Kite, B. A., Chen, P.-Y., & Short, S. D. (2018). Permutation randomization methods for testing measurement equivalence and detecting differential item functioning in multiple-group confirmatory factor analysis. Psychological Methods, 23(4), 708–728. doi:10.1037/met0000152

Kite, B. A., Jorgensen, T. D., & Chen, P.-Y. (2018). Random permutation testing applied to measurement invariance testing with ordered-categorical indicators. Structural Equation Modeling, 25(4), 573–587. doi:10.1080/10705511.2017.1421467

Jorgensen, T. D. (2017). Applying permutation tests and multivariate modification indices to configurally invariant models that need respecification. Frontiers in Psychology, 8(1455). doi:10.3389/fpsyg.2017.01455

Additional reading:

Chen, F. F. (2007). Sensitivity of goodness of fit indexes to lack of measurement invariance. Structural Equation Modeling, 14(3), 464–504. doi:10.1080/10705510701301834

Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling, 9(2), 233–255. doi:10.1207/S15328007SEM0902_5

Meade, A. W., Johnson, E. C., & Braddy, P. W. (2008). Power and sensitivity of alternative fit indices in tests of measurement invariance. Journal of Applied Psychology, 93(3), 568–592. doi:10.1037/0021-9010.93.3.568

Widaman, K. F., & Thompson, J. S. (2003). On specifying the null model for incremental fit indices in structural equation modeling. Psychological Methods, 8(1), 16–37. doi:10.1037/1082-989X.8.1.16

Examples

if (FALSE) {

########################
## Multiple-Group CFA ##
########################

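## load packages: cfa() and lavTestScore() come from lavaan;
## permuteMeasEq() and indProd() come from semTools
library(lavaan)
library(semTools)

## create 3-group data from the lavaan example(cfa) data
HS <- lavaan::HolzingerSwineford1939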
HS$ageGroup <- ifelse(HS$ageyr < 13, "preteen",
                      ifelse(HS$ageyr > 13, "teen", "thirteen"))

## specify and fit an appropriate null model for incremental fit indices
mod.null <- c(paste0("x", 1:9, " ~ c(T", 1:9, ", T", 1:9, ", T", 1:9, ")*1"),
              paste0("x", 1:9, " ~~ c(L", 1:9, ", L", 1:9, ", L", 1:9, ")*x", 1:9))
fit.null <- cfa(mod.null, data = HS, group = "ageGroup")

## fit target model with varying levels of measurement equivalence
mod.config <- '
visual  =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed   =~ x7 + x8 + x9
'
fit.config <- cfa(mod.config, data = HS, std.lv = TRUE, group = "ageGroup")
fit.metric <- cfa(mod.config, data = HS, std.lv = TRUE, group = "ageGroup",
                  group.equal = "loadings")
fit.scalar <- cfa(mod.config, data = HS, std.lv = TRUE, group = "ageGroup",
                  group.equal = c("loadings","intercepts"))


####################### Permutation Method

## fit indices of interest for multiparameter omnibus test
myAFIs <- c("chisq","cfi","rmsea","mfi","aic")
moreAFIs <- c("gammaHat","adjGammaHat")

## Use only 20 permutations for a demo.  In practice,
## use > 1000 to reduce sampling variability of estimated p values

## test configural invariance
set.seed(12345)
out.config <- permuteMeasEq(nPermute = 20, con = fit.config)
out.config

## test metric equivalence
set.seed(12345) # same permutations
out.metric <- permuteMeasEq(nPermute = 20, uncon = fit.config, con = fit.metric,
                            param = "loadings", AFIs = myAFIs,
                            moreAFIs = moreAFIs, null = fit.null)
summary(out.metric, nd = 4)

## test scalar equivalence
set.seed(12345) # same permutations
out.scalar <- permuteMeasEq(nPermute = 20, uncon = fit.metric, con = fit.scalar,
                            param = "intercepts", AFIs = myAFIs,
                            moreAFIs = moreAFIs, null = fit.null)
summary(out.scalar)

## Not much to see without significant DIF.
## Try using an absurdly high alpha level for illustration.
outsum <- summary(out.scalar, alpha = .50)

## notice that the returned object is the table of DIF tests
outsum

## visualize permutation distribution
hist(out.config, AFI = "chisq")
hist(out.metric, AFI = "chisq", nd = 2, alpha = .01,
     legendArgs = list(x = "topright"))
hist(out.scalar, AFI = "cfi", printLegend = FALSE)


####################### Extra Output

## function to calculate expected change of Group-2 and -3 latent means if
## each intercept constraint were released
extra <- function(con) {
  output <- list()
  output["x1.vis2"] <- lavTestScore(con, release = 19:20, univariate = FALSE,
                                    epc = TRUE, warn = FALSE)$epc$epc[70]
  output["x1.vis3"] <- lavTestScore(con, release = 19:20, univariate = FALSE,
                                    epc = TRUE, warn = FALSE)$epc$epc[106]
  output["x2.vis2"] <- lavTestScore(con, release = 21:22, univariate = FALSE,
                                    epc = TRUE, warn = FALSE)$epc$epc[70]
  output["x2.vis3"] <- lavTestScore(con, release = 21:22, univariate = FALSE,
                                    epc = TRUE, warn = FALSE)$epc$epc[106]
  output["x3.vis2"] <- lavTestScore(con, release = 23:24, univariate = FALSE,
                                    epc = TRUE, warn = FALSE)$epc$epc[70]
  output["x3.vis3"] <- lavTestScore(con, release = 23:24, univariate = FALSE,
                                    epc = TRUE, warn = FALSE)$epc$epc[106]
  output["x4.txt2"] <- lavTestScore(con, release = 25:26, univariate = FALSE,
                                    epc = TRUE, warn = FALSE)$epc$epc[71]
  output["x4.txt3"] <- lavTestScore(con, release = 25:26, univariate = FALSE,
                                    epc = TRUE, warn = FALSE)$epc$epc[107]
  output["x5.txt2"] <- lavTestScore(con, release = 27:28, univariate = FALSE,
                                    epc = TRUE, warn = FALSE)$epc$epc[71]
  output["x5.txt3"] <- lavTestScore(con, release = 27:28, univariate = FALSE,
                                    epc = TRUE, warn = FALSE)$epc$epc[107]
  output["x6.txt2"] <- lavTestScore(con, release = 29:30, univariate = FALSE,
                                    epc = TRUE, warn = FALSE)$epc$epc[71]
  output["x6.txt3"] <- lavTestScore(con, release = 29:30, univariate = FALSE,
                                    epc = TRUE, warn = FALSE)$epc$epc[107]
  output["x7.spd2"] <- lavTestScore(con, release = 31:32, univariate = FALSE,
                                    epc = TRUE, warn = FALSE)$epc$epc[72]
  output["x7.spd3"] <- lavTestScore(con, release = 31:32, univariate = FALSE,
                                    epc = TRUE, warn = FALSE)$epc$epc[108]
  output["x8.spd2"] <- lavTestScore(con, release = 33:34, univariate = FALSE,
                                    epc = TRUE, warn = FALSE)$epc$epc[72]
  output["x8.spd3"] <- lavTestScore(con, release = 33:34, univariate = FALSE,
                                    epc = TRUE, warn = FALSE)$epc$epc[108]
  output["x9.spd2"] <- lavTestScore(con, release = 35:36, univariate = FALSE,
                                    epc = TRUE, warn = FALSE)$epc$epc[72]
  output["x9.spd3"] <- lavTestScore(con, release = 35:36, univariate = FALSE,
                                    epc = TRUE, warn = FALSE)$epc$epc[108]
  output
}

## observed EPC
extra(fit.scalar)

## permutation results, including extra output
set.seed(12345) # same permutations
out.scalar <- permuteMeasEq(nPermute = 20, uncon = fit.metric, con = fit.scalar,
                            param = "intercepts", AFIs = myAFIs,
                            moreAFIs = moreAFIs, null = fit.null, extra = extra)
## summarize extra output
summary(out.scalar, extra = TRUE)


###########
## MIMIC ##
###########

## Specify Restricted Factor Analysis (RFA) model, equivalent to MIMIC, but
## the factor covaries with the covariate instead of being regressed on it.
## The covariate defines a single-indicator construct, and the
## double-mean-centered products of the indicators define a latent
## interaction between the factor and the covariate.
mod.mimic <- '
visual  =~ x1 + x2 + x3
age =~ ageyr
age.by.vis =~ x1.ageyr + x2.ageyr + x3.ageyr

x1 ~~ x1.ageyr
x2 ~~ x2.ageyr
x3 ~~ x3.ageyr
'

HS.orth <- indProd(var1 = paste0("x", 1:3), var2 = "ageyr", match = FALSE,
                   data = HS[ , c("ageyr", paste0("x", 1:3))] )
fit.mimic <- cfa(mod.mimic, data = HS.orth, meanstructure = TRUE)
summary(fit.mimic, stand = TRUE)

## Whereas MIMIC models specify direct effects of the covariate on an indicator,
## DIF can be tested in RFA models by specifying free loadings of an indicator
## on the covariate's construct (uniform DIF, scalar invariance) and the
## interaction construct (nonuniform DIF, metric invariance).
param <- as.list(paste0("age + age.by.vis =~ x", 1:3))
names(param) <- paste0("x", 1:3)
# param <- as.list(paste0("x", 1:3, " ~ age + age.by.vis")) # equivalent

## test both parameters simultaneously for each indicator
do.call(rbind, lapply(param, function(x) lavTestScore(fit.mimic, add = x)$test))
## or test each parameter individually
lavTestScore(fit.mimic, add = as.character(param))


####################### Permutation Method

## function to recalculate interaction terms after permuting the covariate
datafun <- function(data) {
  d <- data[, c(paste0("x", 1:3), "ageyr")]
  indProd(var1 = paste0("x", 1:3), var2 = "ageyr", match = FALSE, data = d)
}

set.seed(12345)
perm.mimic <- permuteMeasEq(nPermute = 20, modelType = "mimic",
                            con = fit.mimic, param = param,
                            covariates = "ageyr", datafun = datafun)
summary(perm.mimic)

}
