mplus.bayes: Mplus Summary Measures, Convergence and Efficiency Diagnostics

Description

This function uses the h5file function in the hdf5r package to read a Mplus GH5 file that is requested by the command PLOT: TYPE IS PLOT2 in Mplus to compute point estimates (i.e., mean, median, and MAP), measures of dispersion (i.e., standard deviation and mean absolute deviation), measures of shape (i.e., skewness and kurtosis), credible intervals (i.e., equal-tailed intervals and highest density interval), convergence and efficiency diagnostics (i.e., potential scale reduction factor R-hat, effective sample size, and Monte Carlo standard error), probability of direction, and probability of being in the region of practical equivalence for the posterior distribution for each parameter. By default, the function computes the maximum of rank-normalized split-R-hat and rank normalized folded-split-R-hat, Bulk effective sample size (Bulk-ESS) for rank-normalized values using split chains, tail effective sample size (Tail-ESS) defined as the minimum of the effective sample size for 0.025 and 0.975 quantiles, the Bulk Monte Carlo standard error (Bulk-MCSE) for the median and Tail Monte Carlo standard error (Tail-MCSE) defined as the maximum of the MCSE for 0.025 and 0.975 quantiles.

Usage

mplus.bayes(x,
            print = c("all", "default", "m", "med", "map", "sd", "mad",
                      "skew", "kurt", "eti", "hdi",
                      "rhat", "b.ess", "t.ess", "b.mcse", "t.mcse"),
            param = c("all", "on", "by", "with", "inter", "var", "r2", "new"),
            std = c("all", "none", "stdyx", "stdy", "std"),
            m.bulk = FALSE, split = TRUE, rank = TRUE, fold = TRUE,
            pd = FALSE, null = 0, rope = NULL,
            ess.tail = c(0.025, 0.975), mcse.tail = c(0.025, 0.975),
            alternative = c("two.sided", "less", "greater"),
            conf.level = 0.95, digits = 2, r.digits = 3, ess.digits = 0,
            mcse.digits = 3, p.digits = 3, write = NULL, append = TRUE,
            check = TRUE, output = TRUE)

Value

Returns an object of class misty.object, which is a list with following entries:

call: function call
type: type of analysis
x: Mplus GH5 file
args: specification of function arguments
data: three-dimensional array parameter x iteration x chain of the posterior
result: result table with summary measures, convergence, and efficiency diagnostics

Arguments

x: a character string indicating the name of the Mplus GH5 file (HDF5 format) with or without the file extension .gh5, e.g., "Mplus_Plot.gh5" or "Mplus_Plot".
print: a character vector indicating which summary measures, convergence, and efficiency diagnostics to be printed on the console, i.e. "all" for all summary measures, convergence, and efficiency diagnostics, "m" for the mean, "med" for the median, "MAP" for the maximum a posteriori probability estimate, "med" for the standard deviation, "mad" for the mean absolute deviation, "skew" for the skewness, "kurt" for the kurtosis, "eti" for the equal-tailed credible interval, "hdi" for the highest density credible interval, "rhat" for the potential scale reduction (PSR) factor R-hat convergence diagnostic, "b.ess" for the bulk effective sample size (ESS), "t.ess" for the tail ESS, "b.mcse" for the bulk Monte Carlo standard error (MCSE), and "t.mcse" for the tail MCSE. The default setting is print = c("med", "sd", "skew", "kurt", "eti", "rhat", "b.ess", "t.ess", "b.mcse", "t.mcse").
param: character vector indicating which parameters to print for the summary measures, convergence, and efficiency diagnostics, i.e., "all" for all parameters, "on" (default), for regression slopes, "by" for factor loadings, "with" for covariances, "inter" for intercepts and thresholds, "var" for (residual) variances, "r2" for r-square, and "new" for parameters not in the analysis model specified in the NEW option. The default setting is "on" if regression slopes are available. Otherwise, the default setting switches to "by" and to "with" if factor loadings are not available.
std: a character vector indicating the standardized parameters to print for the summary measures, convergence, and efficiency diagnostics, i.e., "all" for all standardized parameters, "none" (default) for not printing any standardized parameters, "stdyx" for StdYX standardized parameters, "stdy" for StdY standardized parameters, and "std" for StdX standardized parameters.
m.bulk: logical: if TRUE the Monte Carlo standard error for the mean is computed. The default setting is m.bulk = FALSE, i.e., the Monte Carlo standard error for the median is computed.
split: logical: if TRUE (default), each MCMC chain is split in half before computing R-hat. Note that the argument split is always set to FALSE when computing ESS.
rank: logical: if TRUE (default), rank-normalization is applied to the posterior draws before computing R-hat and ESS. Note that the argument rank is always set to FALSE when computing MCSE.
fold: logical: if TRUE (default), the maximum of rank-normalized split-R-hat and rank normalized folded-split-R-hat is computed. Note that the arguments split and rank are always set to TRUE when specifying fold = TRUE.
pd: logical: if TRUE, the probability of direction is printed on the console.
null: a numeric value considered as a null effect for the probability of direction (default is 0). Note that the value specified in the argument null applies to all parameters which might not be sensible for all parameters.
rope: a numeric vector with two elements indicating the ROPE's lower and upper bounds. ROPE is also depending on the argument alternative, e.g., if rope = c(-0.1, 0.1), then the actual ROPE is [-0.1, 0.1] given alternative = "two.sided} (default), \code{[-Inf, 0.1]} given \code{alternative = "greater, and [-0.1, Inf] given alternative = "less". Note that the interval specified in the argument rope applies to all parameters which might not be sensible for all parameters.
ess.tail: a numeric vector with two elements to specify the quantiles for computing the tail ESS. The default setting is tail = c(0.025, 0.975), i.e., tail ESS is the minimum of effective sample sizes for 5% and 95% quantiles.
mcse.tail: a numeric vector with two elements to specify the quantiles for computing the tail MCSE. The default setting is tail = c(0.025, 0.975), i.e., tail MCSE is the maximum of Monte Carlo standard error for 5% and 95% quantiles.
alternative: a character string specifying the alternative hypothesis for the credible intervals, must be one of "two.sided" (default), "greater" or "less".
conf.level: a numeric value between 0 and 1 indicating the confidence level of the credible interval. The default setting is conf.level = 0.95.
digits: an integer value indicating the number of decimal places to be used for displaying point estimates, measures of dispersion, and credible intervals.
r.digits: an integer value indicating the number of decimal places to be used for displaying R-hat values.
ess.digits: an integer value indicating the number of decimal places to be used for displaying effective sample sizes.
mcse.digits: an integer value indicating the number of decimal places to be used for displaying Monte Carlo standard errors.
p.digits: an integer value indicating the number of decimal places to be used for displaying the probability of direction and the probability of being in the region of practical equivalence (ROPE).
write: a character string naming a file for writing the output into either a text file with file extension ".txt" (e.g., "Output.txt") or Excel file with file extension ".xlsx" (e.g., "Output.xlsx"). If the file name does not contain any file extension, an Excel file will be written.
append: logical: if TRUE (default), output will be appended to an existing text file with extension .txt specified in write, if FALSE existing text file will be overwritten.
check: logical: if TRUE (default), argument specification is checked.
output: logical: if TRUE (default), output is shown on the console by using the function mplus.print().

Author

Takuya Yanagida

Details

Convergence and Efficiency Diagnostics for Markov Chains

Convergence and efficiency diagnostics for Markov chains is based on following numeric measures:

Potential Scale Reduction (PSR) factor R-hat: The PSR factor R-hat compares the between- and within-chain variance for a model parameter, i.e., R-hat larger than 1 indicates that the between-chain variance is greater than the within-chain variance and chains have not mixed well. According to the default setting, the function computes the improved R-hat as recommended by Vehtari et al. (2020) based on rank-normalizing (i.e., rank = TRUE) and folding (i.e., fold = TRUE) the posterior draws after splitting each MCMC chain in half (i.e., split = TRUE). The traditional R-hat used in Mplus can be requested by specifying split = FALSE, rank = FALSE, and fold = FALSE. Note that the traditional R-hat can catch many problems of poor convergence, but fails if the chains have different variances with the same mean parameter or if the chains have infinite variance with one of the chains having a different location parameter to the others (Vehtari et al., 2020). According to Gelman et al. (2014) a R-hat value of 1.1 or smaller for all parameters can be considered evidence for convergence. The Stan Development Team (2024) recommends running at least four chains and a convergence criterion of less than 1.05 for the maximum of rank normalized split-R-hat and rank normalized folded-split-R-hat. Vehtari et al. (2020), however, recommended to only use the posterior samples if R-hat is less than 1.01 because the R-hat can fall below 1.1 well before convergence in some scenarios (Brooks & Gelman, 1998; Vats & Knudon, 2018).
Effective Sample Size (ESS): The ESS is the estimated number of independent samples from the posterior distribution that would lead to the same precision as the autocorrelated samples at hand. According to the default setting, the function computes the ESS based on rank-normalized split-R-hat and within-chain autocorrelation. The function provides the estimated Bulk-ESS (B.ESS) and the Tail-ESS (T.ESS). The Bulk-ESS is a useful measure for sampling efficiency in the bulk of the distribution (i.e, efficiency of the posterior mean), and the Tail-ESS is useful measure for sampling efficiency in the tails of the distribution (e.g., efficiency of tail quantile estimates). Note that by default, the Tail-ESS is the minimum of the effective sample sizes for 5% and 95% quantiles (tail = c(0.025, 0.975)). According to Kruschke (2015), a rank-normalized ESS greater than 400 is usually sufficient to get a stable estimate of the Monte Carlo standard error. However, a ESS of at least 1000 is considered optimal (Zitzmann & Hecht, 2019).
Monte Carlo Standard Error (MCSE): The MCSE is defined as the standard deviation of the chains divided by their effective sample size and reflects uncertainty due to the stochastic algorithm of the Markov Chain Monte Carlo method. The function provides the estimated Bulk-MCSE (B.MCSE) for the margin of error when using the MCMC samples to estimate the posterior mean and the Tail-ESS (T.MCSE) for the margin of error when using the MCMC samples for interval estimation.

References

Brooks, S. P. and Gelman, A. (1998). General Methods for Monitoring Convergence of Iterative Simulations. Journal of Computational and Graphical Statistics, 7(4): 434–455. MR1665662.

Gelman, A., & Rubin, D.B. (1992). Inference from iterative simulation using multiple sequences. Statistical Science, 7, 457-472. https://doi.org/10.1214/ss/1177011136

Kruschke, J. (2015). Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan. Academic Press.

Makowski, D., Ben-Shachar, M., & Lüdecke, D. (2019). bayestestR: Describing effects and their uncertainty, existence and significance within the Bayesian framework. Journal of Open Source Software, 4(40), 1541. https://doi.org/10.21105/joss.01541

Stan Development Team (2024). RStan: the R interface to Stan. R package version 2.32.6. https://mc-stan.org/.

Vats, D. and Knudson, C. (2018). Revisiting the Gelman-Rubin Diagnostic. arXiv:1812.09384.

Vehtari, A., Gelman, A., Simpson, D., Carpenter, B., & Bürkner, P.-C. (2020). Rank-normalization, folding, and localization: An improved R-hat for assessing convergence of MCMC. Bayesian analysis, 16(2), 667-718. https://doi.org/110.1214/20-BA1221

Zitzmann, S., & Hecht, M. (2019). Going beyond convergence in Bayesian estimation: Why precision matters too and how to assess it. Structural Equation Modeling: A Multidisciplinary Journal, 26(4), 646–661. https://doi.org/10.1080/10705511.2018.1545232

Examples

Run this code

if (FALSE) {
#----------------------------------------------------------------------------
# Mplus Example 3.18: Moderated Mediation with a Plot of the Indirect Effect

# Example 1: Default setting
mplus.bayes("ex3.18.gh5")

# Example 2: Print all parameters
mplus.bayes("ex3.18.gh5", param = "all")

# Example 3: Print parameters not in the analysis model
mplus.bayes("ex3.18.gh5", param = "new")

# Example 4a: Print all summary measures, convergence, and efficiency diagnostics
mplus.bayes("ex3.18.gh5", print = "all")

# Example 4a: Print default measures plus MAP
mplus.bayes("ex3.18.gh5", print = c("default", "map"))

# Example 5: Print traditional R-hat in line with Mplus
mplus.bayes("ex3.18.gh5", split = FALSE, rank = FALSE, fold = FALSE)

# Example 6: Print probability of direction and the probability of
# being ROPE [-0.1, 0.1]
mplus.bayes("ex3.18.gh5", pd = TRUE, rope = c(-0.1, 0.1))

# Example 7: Write Results into a text file
mplus.bayes("ex3.18.gh5", write = "Bayes_Summary.txt")

# Example 8b: Write Results into a Excel file
mplus.bayes("ex3.18.gh5", write = "Bayes_Summary.xlsx")
}

Run the code above in your browser using DataLab