hdi: Compute statistics for MCMC samples and Stan models

Description

hdi() computes the highest density interval for values from MCMC samples, while cred_int() computes the credible interval (or uncertainty interval). rope() calculates the proportion of a posterior distribution that lies within a region of practical equivalence. equi_test() combines these two functions and performs a "HDI+ROPE decision rule" (Test for Practical Equivalence) (Kruschke 2018) to check whether parameter values should be accepted or rejected against the background of a formulated null hypothesis. n_eff() calculates the the number of effective samples (effective sample size). mcse() returns the Monte Carlo standard error. mediation() is a short summary for multivariate-response mediation-models.

Usage

hdi(x, ...)
# S3 method for stanreg
hdi(x, prob = 0.9, trans = NULL, type = c("fixed",
  "random", "all"), ...)
# S3 method for brmsfit
hdi(x, prob = 0.9, trans = NULL, type = c("fixed",
  "random", "all"), ...)
cred_int(x, ...)
# S3 method for stanreg
cred_int(x, prob = 0.9, trans = NULL,
  type = c("fixed", "random", "all"), ...)
# S3 method for brmsfit
cred_int(x, prob = 0.9, trans = NULL,
  type = c("fixed", "random", "all"), ...)
equi_test(x, ...)
# S3 method for stanreg
equi_test(x, rope, eff_size, out = c("txt", "viewer",
  "browser", "plot"), ...)
# S3 method for brmsfit
equi_test(x, rope, eff_size, out = c("txt", "viewer",
  "browser", "plot"), ...)
mcse(x, ...)
# S3 method for brmsfit
mcse(x, type = c("fixed", "random", "all"), ...)
# S3 method for stanreg
mcse(x, type = c("fixed", "random", "all"), ...)
mediation(x, ...)
# S3 method for brmsfit
mediation(x, treatment, mediator, prob = 0.9,
  typical = "median", ...)
n_eff(x, ...)
# S3 method for stanreg
n_eff(x, type = c("fixed", "random", "all"), ...)
# S3 method for brmsfit
n_eff(x, type = c("fixed", "random", "all"), ...)
rope(x, rope, ...)
# S3 method for stanreg
rope(x, rope, trans = NULL, type = c("fixed",
  "random", "all"), ...)
# S3 method for brmsfit
rope(x, rope, trans = NULL, type = c("fixed",
  "random", "all"), ...)

Arguments

A stanreg, stanfit, or brmsfit object. For hdi() and rope(), may also be a data frame or a vector of values from a probability distribution (e.g., posterior probabilities from MCMC sampling).

...

Further arguments passed down to equi_test() when plot = TRUE:

colors: Color of the density regions for the 95% distribution of the posterior samples.
rope.color and rope.alpha: Fill color and alpha-value of the ROPE (region of practical equivalence).
x.title: Title for the x-axis of the plot.
legend.title: Title for the plot legend.
labels: Character vector of same length as terms plotted on the y-axis, to give axis labels user-defined labels.

prob

Vector of scalars between 0 and 1, indicating the mass within the credible interval that is to be estimated. See hdi.

trans

Name of a function or character vector naming a function, used to apply transformations on the returned HDI-values resp. (for rope()) on the values of the posterior distribution, before calculating the rope based on the boundaries given in rope. Note that the values in rope are not transformed.

type

For mixed effects models, specify the type of effects that should be returned. type = "fixed" returns fixed effects only, type = "random" the random effects and type = "all" returns both fixed and random effects.

rope

Vector of length two, indicating the lower and upper limit of a range around zero, which indicates the region of practical equivalence. Values of the posterior distribution within this range are considered as being "practically equivalent to zero".

eff_size

A scalar indicating the effect size (the size of an negligible effect) that is used to calculate the limits of the ROPE for the test of practical equivalence. If not specified, an effect size of .1 is used for linear models, as suggested by Kruschke 2018 (see 'Details'). If rope is specified, this argument will be ignored.

out

Character vector, indicating whether the results should be printed to console (out = "txt") or as HTML-table in the viewer-pane (out = "viewer") or browser (out = "browser"), of if the results should be plotted (out = "plot", only applies to certain functions). May be abbreviated.

treatment

Character, name of the treatment variable (or direct effect) in a (multivariate response) mediator-model. If missing, mediation() tries to find the treatment variable automatically, however, this may fail.

mediator

Character, name of the mediator variable in a (multivariate response) mediator-model. If missing, mediation() tries to find the treatment variable automatically, however, this may fail.

typical

The typical value that will represent the Bayesian point estimate. By default, the posterior median is returned. See typical_value for possible values for this argument.

Value

For hdi(), if x is a vector, returns a vector of length two with the lower and upper limit of the HDI; if x is a stanreg, stanfit or brmsfit object, returns a tibble with lower and upper HDI-limits for each predictor. To distinguish multiple HDI values, column names for the HDI get a suffix when prob has more than one element. For rope(), returns a tibble with two columns: the proportion of values from x that are within and outside the boundaries of rope. equi_test() returns a tibble with a column decision that indicates whether or not a parameter value is accepted/rejected; inside.rope, which indicates the proportion of the whole posterior distribution that lies inside the ROPE (not just the proportion of values from the 95% HDI); and the lower and upper interval from the 95%-HDI. mcse() and n_eff() return a tibble with two columns: one with the term names and one with the related statistic resp. effective sample size. mediation() returns a data frame with direct, indirect, mediator and total effect of a multivariate-response mediation-model, as well as the proportion mediated. The effect sizes are mean values of the posterior samples.

Details

HDI

Computation for HDI is based on the code from Kruschke 2015, pp. 727f. For default sampling in Stan (4000 samples), the 90% intervals for HDI are more stable than, for instance, 95% intervals. An effective sample size (see nsamples) of at least 10.000 is recommended if 95% intervals should be computed (see Kruschke 2015, p. 183ff).

Credible Intervals

Credible intervals (or uncertainty intervals) are simply the quantiles for a given probability of the posterior draws. See posterior_interval for more details.

MCSE

The Monte Carlo Standard Error is another useful measure of accuracy of the chains. It is defined as standard deviation of the chains divided by their effective sample size (the formula for mcse() is from Kruschke 2015, p. 187). The MCSE “provides a quantitative suggestion of how big the estimation noise is”.

Number of Effective Samples

The effective sample size divides the actual sample size by the amount of autocorrelation. The effective sample size is a measure of “how much independent information there is in autocorrelated chains”, or: “What would be the sample size of a completely non-autocorrelated chain that yielded the same information?” (Kruschke 2015, p182-3). The ratio of effective number of samples and total number of samples (provided in tidy_stan()) ranges from 0 to 1, and should be close to 1. The closer this ratio comes to zero means that the chains may be inefficient, but possibly still okay.

ROPE

There are no fixed rules to set the limits for the region of practical equivalence. However, there are some conventions described by Kruschke (2018) how to specify the limits of the rope. One convention for linear models is to set the limits about .1 SD of the dependent variable around zero (i.e. 0 +/- .1 * sd(y)), where .1 stands for half of a small effect size. Another, more conservative convention to set the ROPE limits is a range of half a standard deviation around zero (see Norman et al. 2003), which indicates a clinical relevant effect (i.e. 0 +/- .25 * sd(y) or even 0 +/- .5 * sd(y))

Test for Practical Equivalence

equi_test() computes the 95%-HDI for x and checks if a model predictor's HDI lies completely outside, completely inside or partially inside the ROPE. If the HDI is completely outside the ROPE, the "null hypothesis" for this parameter is "rejected". If the ROPE completely covers the HDI, i.e. all most credible values of a parameter are inside the region of practical equivalence, the null hypothesis is accepted. Else, it's undecided whether to accept or reject the null hypothesis. In short, desirable results are low proportions inside the ROPE (the closer to zero the better) and the H0 should be rejected.

If neither the rope nor eff_size argument are specified, the effect size (the size of an negligible effect) will be set to 0.1 and the ROPE is 0 +/- .1 * sd(y) for linear models. This is the suggested way to specify the ROPE limits according to Kruschke (2018). For models with binary outcome, there is no direct way to specify the effect size that defines the ROPE limits. Two examples from Kruschke suggest that a negligible change is about .05 on the logit-scale. In these cases, it is recommended to specify the rope argument, however, if not specified, the ROPE limits are caluclated as suggested by Kruschke: The effect size is the probability of "success" for the outcome, divided by pi. For all other models, 0 +/- .1 * sd(intercept) is used to determine the ROPE limits.

If eff_size is specified, but rope is not, then the same formulas apply, except that .1 is replaced by the value in eff_size. If rope is specified, eff_size will be ignored. See also section ROPE in 'Details'.

The advantage of Bayesian testing for practical equivalence over classical frequentist null hypothesis significance testing is that discrete decisions are avoided, “because such decisions encourage people to ignore the magnitude of the parameter value and its uncertainty” (Kruschke (2018)).

Mediation Analysis

mediation() returns a data frame with information on the direct effect (mean value of posterior samples from treatment of the outcome model), mediator effect (mean value of posterior samples from mediator of the outcome model), indirect effect (mean value of the multiplication of the posterior samples from mediator of the outcome model and the posterior samples from treatment of the mediation model) and the total effect (mean value of sums of posterior samples used for the direct and indirect effect). The proportion mediated is the indirect effect divided by the total effect.

For all values, the 90% HDIs are calculated by default. Use prob to calculate a different interval.

The arguments treatment and mediator do not necessarily need to be specified. If missing, mediation() tries to find the treatment and mediator variable automatically. If this does not work, specify these variables.

References

Kruschke JK. Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan. 2nd edition. Academic Press, 2015

Kruschke JK. Rejecting or Accepting Parameter Values in Bayesian Estimation. Advances in Methods and Practices in Psychological Science. 2018; 10.1177/2515245918771304

Norman GR, Sloan JA, Wyrwich KW. Interpretation of Changes in Health-related Quality of Life: The Remarkable Universality of Half a Standard Deviation. Medical Care. 2003;41: 582-592. 10.1097/01.MLR.0000062554.74615.4C

Examples

Run this code

# NOT RUN {
if (require("rstanarm")) {
  fit <- stan_glm(mpg ~ wt + am, data = mtcars, chains = 1)
  hdi(fit)

  # return multiple intervals
  hdi(fit, prob = c(.5, .7, .9))

  # fit logistic regression model
  fit <- stan_glm(
    vs ~ wt + am,
    data = mtcars,
    family = binomial("logit"),
    chains = 1
  )
  # compute hdi, transform on "odds ratio scale"
  hdi(fit, trans = exp)

  # compute rope, on scale of linear predictor. finds proportion
  # of posterior distribution values between -1 and 1.
  rope(fit, rope = c(-1, 1))

  # compute rope, boundaries as "odds ratios". finds proportion of
  # posterior distribution values, which - after being exponentiated -
  # are between .8 and 1.25 (about -.22 and .22 on linear scale)
  rope(fit, rope = c(.8, 1.25), trans = exp)

  # Test for Practical Equivalence
  equi_test(fit)
  equi_test(fit, out = "plot")
}
# }
# NOT RUN {
# }

Run the code above in your browser using DataLab