multilevel.cor: Within-Group and Between-Group Correlation Matrix

Description

This function computes the within-group and between-group correlation matrix by calling the sem function in the R package lavaan. It provides standard errors, z test statistics, and significance values (p-values) for testing the hypothesis H0: \(\rho\) = 0 for all pairs of variables within and between groups.

Usage

multilevel.cor(..., data = NULL, cluster, within = NULL, between = NULL,
               estimator = c("ML", "MLR"), optim.method = c("nlminb", "em"),
               missing = c("listwise", "fiml"), sig = FALSE, alpha = 0.05,
               print = c("all", "cor", "se", "stat", "p"), split = FALSE,
               order = FALSE, tri = c("both", "lower", "upper"), tri.lower = TRUE,
               p.adj = c("none", "bonferroni", "holm", "hochberg", "hommel",
                         "BH", "BY", "fdr"), digits = 2, p.digits = 3,
               as.na = NULL, write = NULL, append = TRUE, check = TRUE,
               output = TRUE)
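
A minimal usage sketch, assuming the misty and lavaan packages are installed. The simulated data frame dat and its variable names (x1, x2, x3, cluster) are hypothetical and not part of the package. The call passes a data frame for ... and a vector for cluster, as described under Arguments.

```r
# Simulate hypothetical two-level data: 50 clusters with 10 observations each
set.seed(123)
n.cluster <- 50
n.per <- 10
cluster <- rep(seq_len(n.cluster), each = n.per)

# Each variable = cluster-level part + individual-level part
u <- rnorm(n.cluster)[cluster]      # component shared at the cluster level
e <- rnorm(n.cluster * n.per)       # component shared at the individual level
dat <- data.frame(
  cluster = cluster,
  x1 = u + e + rnorm(n.cluster * n.per),
  x2 = 0.5 * u + 0.5 * e + rnorm(n.cluster * n.per),
  x3 = rnorm(n.cluster)[cluster] + rnorm(n.cluster * n.per)
)

# Run only if the packages are available (assumption: misty and lavaan installed)
if (requireNamespace("misty", quietly = TRUE) &&
    requireNamespace("lavaan", quietly = TRUE)) {
  res <- misty::multilevel.cor(dat[, c("x1", "x2", "x3")],
                               cluster = dat$cluster, output = FALSE)
  # Result tables are stored in the 'result' entry of the returned list
  print(names(res$result))
}
```

Setting output = FALSE suppresses the console output so that the returned object can be inspected directly.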

Value

Returns an object of class misty.object, which is a list with the following entries:

call: function call
type: type of analysis
data: data frame specified in the argument ..., including the group variable specified in cluster
args: specification of function arguments
model.fit: fitted lavaan object (mod.fit)
result: list with result tables, i.e., summary for the specification of the estimation method and missing data handling in lavaan, wb.cor for the within- and between-group correlations, wb.se for the standard error of the within- and between-group correlations, wb.stat for the test statistic of within- and between-group correlations, wb.p for the significance value of the within- and between-group correlations, with.cor for the within-group correlations, with.se for the standard error of the within-group correlations, with.stat for the test statistic of within-group correlations, with.p for the significance value of the within-group correlations, betw.cor for the between-group correlations, betw.se for the standard error of the between-group correlations, betw.stat for the test statistic of between-group correlations, betw.p for the significance value of the between-group correlations
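
How the four kinds of result tables (correlation, standard error, test statistic, p-value) relate to each other can be sketched in base R. The sketch assumes that the test statistic is a Wald-type z ratio, i.e., the estimate divided by its standard error, with a two-sided p-value from the standard normal distribution. The values of r and se are made up for illustration and are not output of multilevel.cor.

```r
r  <- 0.30   # hypothetical correlation estimate
se <- 0.12   # hypothetical standard error

z <- r / se                 # z test statistic for H0: rho = 0
p <- 2 * pnorm(-abs(z))     # two-sided p-value under the standard normal

round(c(cor = r, se = se, stat = z, p = p), 3)
```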

Arguments

...: a matrix or data frame. Alternatively, an expression indicating the variable names in data e.g., multilevel.cor(x1, x2, x3, data = dat). Note that the operators ., +, -, ~, :, ::, and ! can also be used to select variables, see 'Details' in the df.subset function.
data: a data frame when specifying one or more variables in the argument .... Note that the argument is NULL when specifying a matrix or data frame for the argument ....
cluster: either a character string indicating the variable name of the cluster variable in ... or data, or a vector representing the nested grouping structure (i.e., group or cluster variable).
within: a character vector representing variables that are measured on the within level and modeled only on the within level. Variables not mentioned in within or between are measured on the within level and will be modeled on both the within and between level.
between: a character vector representing variables that are measured on the between level and modeled only on the between level. Variables not mentioned in within or between are measured on the within level and will be modeled on both the within and between level.
estimator: a character string indicating the estimator to be used: "ML" (default) for maximum likelihood with conventional standard errors and "MLR" for maximum likelihood with Huber-White robust standard errors. Note that by default, the full information maximum likelihood (FIML) method is used to deal with missing data when using "ML" (missing = "fiml"), whereas incomplete cases are removed listwise (i.e., missing = "listwise") when using "MLR".
optim.method: a character string indicating the optimizer, i.e., "nlminb" (default) for the unconstrained and bounds-constrained quasi-Newton method optimizer and "em" for the Expectation Maximization (EM) algorithm.
missing: a character string indicating how to deal with missing data, i.e., "listwise" for listwise deletion or "fiml" (default) for the full information maximum likelihood (FIML) method. Note that the FIML method is only available when estimator = "ML". Model estimation takes longer with FIML, and FIML might cause convergence issues, which might be resolved by switching to listwise deletion.
sig: logical: if TRUE, statistically significant correlation coefficients are shown in boldface on the console.
alpha: a numeric value between 0 and 1 indicating the significance level at which correlation coefficients are printed boldface when sig = TRUE.
print: a character string or character vector indicating which results to show on the console, i.e. "all" for all results, "cor" for correlation coefficients, "se" for standard errors, "stat" for z test statistics, and "p" for p-values.
split: logical: if TRUE, the output table is split into a within-group and a between-group correlation matrix.
order: logical: if TRUE, variables in the output table are ordered, so that variables specified in the argument between are shown first.
tri: a character string indicating which triangular of the matrix to show on the console when split = TRUE, i.e., "both" for the upper and lower triangular, "lower" for the lower triangular, and "upper" for the upper triangular.
tri.lower: logical: if TRUE (default) and split = FALSE (default), within-group correlations are shown in the lower triangular and between-group correlations are shown in the upper triangular.
p.adj: a character string indicating an adjustment method for multiple testing based on p.adjust, i.e., none (default), bonferroni, holm, hochberg, hommel, BH, BY, or fdr.
digits: an integer value indicating the number of decimal places to be used for displaying correlation coefficients.
p.digits: an integer value indicating the number of decimal places to be used for displaying p-values.
as.na: a numeric vector indicating user-defined missing values, i.e., these values are converted to NA before conducting the analysis. Note that the as.na() function is applied only to the variables specified in ..., not to cluster.
write: a character string naming a file for writing the output into either a text file with file extension ".txt" (e.g., "Output.txt") or Excel file with file extension ".xlsx" (e.g., "Output.xlsx"). If the file name does not contain any file extension, an Excel file will be written.
append: logical: if TRUE (default), output will be appended to an existing text file with extension .txt specified in write; if FALSE, the existing text file will be overwritten.
check: logical: if TRUE (default), argument specification is checked.
output: logical: if TRUE (default), output is shown on the console.

Author

Takuya Yanagida takuya.yanagida@univie.ac.at

Details

The specification of the within-group and between-group variables is in line with the syntax in Mplus. That is, the within argument is used to identify the variables in the matrix or data frame specified in ... that are measured on the individual level and modeled only on the within level. They are specified to have no variance in the between part of the model. The between argument is used to identify the variables in the matrix or data frame specified in ... that are measured on the cluster level and modeled only on the between level. Variables not mentioned in the arguments within or between are measured on the individual level and will be modeled on both the within and between level.
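
The distinction between the two levels can be illustrated with a base R sketch that splits each variable into cluster means (between part) and deviations from the cluster means (within part). This observed-mean decomposition is only an approximation for intuition: the function itself estimates both matrices jointly with lavaan, so its estimates will generally differ. The data and variable names are hypothetical.

```r
set.seed(1)
cluster <- rep(1:40, each = 8)

# Hypothetical cluster-level parts, correlated across the two variables
u1 <- rnorm(40)
u2 <- 0.6 * u1 + rnorm(40)

# Observed scores = cluster-level part + individual-level noise
x1 <- u1[cluster] + rnorm(320)
x2 <- u2[cluster] + rnorm(320)

# Between part: cluster means; within part: deviations from cluster means
x1.b <- ave(x1, cluster); x1.w <- x1 - x1.b
x2.b <- ave(x2, cluster); x2.w <- x2 - x2.b

# Within-group correlation from the cluster-mean-centered scores
cor.within <- cor(x1.w, x2.w)

# Between-group correlation from one mean per cluster
cor.between <- cor(tapply(x1, cluster, mean), tapply(x2, cluster, mean))

round(c(within = cor.within, between = cor.between), 2)
```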

By default, the function uses maximum likelihood estimation with conventional standard errors (estimator = "ML"), which are not robust against non-normality, and the full information maximum likelihood (FIML) method (missing = "fiml") to deal with missing data. The FIML method cannot be used when within-group variables have no variance within some clusters. In this case, the function will switch to listwise deletion. Note that the current lavaan version 0.6-11 supports the FIML method only for maximum likelihood estimation with conventional standard errors (estimator = "ML") in multilevel models. Maximum likelihood estimation with Huber-White robust standard errors (estimator = "MLR") uses listwise deletion to deal with missing data. When using the FIML method, there might be issues in model convergence, which might be resolved by switching to listwise deletion (missing = "listwise").
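
Whether a variable has no variance within some clusters can be checked before the analysis. The following base R sketch uses a small hypothetical data frame; it is not the check performed internally by the function.

```r
# Hypothetical data: cluster 2 has identical values of x, so no within variance
dat <- data.frame(cluster = rep(1:3, each = 4),
                  x = c(1, 2, 3, 4,  5, 5, 5, 5,  2, 4, 6, 8))

# Variance of x within each cluster
v <- tapply(dat$x, dat$cluster, var)

# Labels of the clusters in which x does not vary
names(v)[!is.na(v) & v == 0]
```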

The lavaan package uses a quasi-Newton optimization method ("nlminb") by default. If the optimizer does not converge, model estimation will switch to the Expectation Maximization (EM) algorithm.

Statistically significant correlation coefficients can be shown in boldface on the console when specifying sig = TRUE. However, this option is not supported when using R Markdown, i.e., the argument sig will switch to FALSE.

The adjustment for multiple testing specified in the argument p.adj is applied separately to the within-group and the between-group correlation matrix.
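
What separate adjustment means can be sketched with p.adjust from base R. The p-values below are hypothetical; the joint adjustment is shown only for contrast and is not what the function does.

```r
# Hypothetical p-values for three within-group and three between-group correlations
p.within  <- c(0.010, 0.020, 0.030)
p.between <- c(0.004, 0.040, 0.200)

# Separate adjustment: each matrix is treated as its own family of 3 tests
adj.within  <- p.adjust(p.within,  method = "bonferroni")
adj.between <- p.adjust(p.between, method = "bonferroni")

# Joint adjustment over all 6 tests, for contrast only
adj.joint <- p.adjust(c(p.within, p.between), method = "bonferroni")

adj.within       # 0.03 0.06 0.09
adj.joint[1:3]   # 0.06 0.12 0.18
```

With the Bonferroni method, the separately adjusted within-group p-values are multiplied by 3, whereas a joint adjustment would multiply them by 6.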

References

Hox, J., Moerbeek, M., & van de Schoot, R. (2018). Multilevel analysis: Techniques and applications (3rd ed.). Routledge.

Snijders, T. A. B., & Bosker, R. J. (2012). Multilevel analysis: An introduction to basic and advanced multilevel modeling (2nd ed.). Sage Publishers.