hdi()
computes the highest density interval for values from
MCMC samples, while cred_int()
computes the credible interval (or
uncertainty interval). rope()
calculates the proportion of a posterior
distribution that lies within a region of practical equivalence.
equi_test()
combines these two functions and performs a
"HDI+ROPE decision rule" (Test for Practical Equivalence) (Kruschke 2018)
to check whether parameter values should be accepted or rejected against
the background of a formulated null hypothesis. n_eff()
calculates
the the number of effective samples (effective sample size). mcse()
returns the Monte Carlo standard error. mediation()
is a short
summary for multivariate-response mediation-models.
hdi(x, ...)# S3 method for stanreg
hdi(x, prob = 0.9, trans = NULL, type = c("fixed",
"random", "all"), ...)
# S3 method for brmsfit
hdi(x, prob = 0.9, trans = NULL, type = c("fixed",
"random", "all"), ...)
cred_int(x, ...)
# S3 method for stanreg
cred_int(x, prob = 0.9, trans = NULL,
type = c("fixed", "random", "all"), ...)
# S3 method for brmsfit
cred_int(x, prob = 0.9, trans = NULL,
type = c("fixed", "random", "all"), ...)
equi_test(x, ...)
# S3 method for stanreg
equi_test(x, rope, eff_size, out = c("txt", "viewer",
"browser", "plot"), ...)
# S3 method for brmsfit
equi_test(x, rope, eff_size, out = c("txt", "viewer",
"browser", "plot"), ...)
mcse(x, ...)
# S3 method for brmsfit
mcse(x, type = c("fixed", "random", "all"), ...)
# S3 method for stanreg
mcse(x, type = c("fixed", "random", "all"), ...)
mediation(x, ...)
# S3 method for brmsfit
mediation(x, treatment, mediator, prob = 0.9,
typical = "median", ...)
n_eff(x, ...)
# S3 method for stanreg
n_eff(x, type = c("fixed", "random", "all"), ...)
# S3 method for brmsfit
n_eff(x, type = c("fixed", "random", "all"), ...)
rope(x, rope, ...)
# S3 method for stanreg
rope(x, rope, trans = NULL, type = c("fixed",
"random", "all"), ...)
# S3 method for brmsfit
rope(x, rope, trans = NULL, type = c("fixed",
"random", "all"), ...)
A stanreg
, stanfit
, or brmsfit
object. For
hdi()
and rope()
, may also be a data frame or a vector
of values from a probability distribution (e.g., posterior probabilities
from MCMC sampling).
Further arguments passed down to equi_test()
when
plot = TRUE
:
colors
:
Color of the density regions for the 95% distribution of the posterior
samples.
rope.color
and rope.alpha
:
Fill color and alpha-value of the ROPE (region of practical equivalence).
x.title
:
Title for the x-axis of the plot.
legend.title
:
Title for the plot legend.
labels
:
Character vector of same length as terms plotted on the y-axis, to
give axis labels user-defined labels.
Vector of scalars between 0 and 1, indicating the mass within
the credible interval that is to be estimated. See hdi
.
Name of a function or character vector naming a function, used
to apply transformations on the returned HDI-values resp.
(for rope()
) on the values of the posterior distribution, before
calculating the rope based on the boundaries given in rope
. Note
that the values in rope
are not transformed.
For mixed effects models, specify the type of effects that should
be returned. type = "fixed"
returns fixed effects only,
type = "random"
the random effects and type = "all"
returns
both fixed and random effects.
Vector of length two, indicating the lower and upper limit of a range around zero, which indicates the region of practical equivalence. Values of the posterior distribution within this range are considered as being "practically equivalent to zero".
A scalar indicating the effect size (the size of an negligible
effect) that is used to calculate the limits of the ROPE for the test of
practical equivalence. If not specified, an effect size of .1 is used for
linear models, as suggested by Kruschke 2018 (see 'Details').
If rope
is specified, this argument will be ignored.
Character vector, indicating whether the results should be printed
to console (out = "txt"
) or as HTML-table in the viewer-pane
(out = "viewer"
) or browser (out = "browser"
), of if the
results should be plotted (out = "plot"
, only applies to certain
functions). May be abbreviated.
Character, name of the treatment variable (or direct effect)
in a (multivariate response) mediator-model. If missing, mediation()
tries to find the treatment variable automatically, however, this may fail.
Character, name of the mediator variable in a (multivariate
response) mediator-model. If missing, mediation()
tries to find the
treatment variable automatically, however, this may fail.
The typical value that will represent the Bayesian point estimate.
By default, the posterior median is returned. See typical_value
for possible values for this argument.
For hdi()
, if x
is a vector, returns a vector of length
two with the lower and upper limit of the HDI; if x
is a
stanreg
, stanfit
or brmsfit
object, returns a
tibble with lower and upper HDI-limits for each predictor. To distinguish
multiple HDI values, column names for the HDI get a suffix when prob
has more than one element.
For rope()
, returns a tibble with two columns: the proportion of
values from x
that are within and outside the boundaries of
rope
.
equi_test()
returns a tibble with a column decision
that
indicates whether or not a parameter value is accepted/rejected;
inside.rope
, which indicates the proportion of the whole posterior
distribution that lies inside the ROPE (not just the proportion of
values from the 95% HDI); and the lower and upper interval from the 95%-HDI.
mcse()
and n_eff()
return a tibble with two columns: one
with the term names and one with the related statistic resp. effective
sample size.
mediation()
returns a data frame with direct, indirect, mediator and
total effect of a multivariate-response mediation-model, as well as the
proportion mediated. The effect sizes are mean values of the posterior
samples.
Computation for HDI is based on the code from Kruschke 2015, pp. 727f.
For default sampling in Stan (4000 samples), the 90% intervals for HDI are
more stable than, for instance, 95% intervals. An effective sample size
(see nsamples
) of at least 10.000 is recommended if
95% intervals should be computed (see Kruschke 2015, p. 183ff).
Credible intervals (or uncertainty intervals) are simply the quantiles
for a given probability of the posterior draws. See
posterior_interval
for more details.
The Monte Carlo Standard Error is another useful measure of accuracy of
the chains. It is defined as standard deviation of the chains divided by
their effective sample size (the formula for mcse()
is from
Kruschke 2015, p. 187). The MCSE “provides a quantitative suggestion
of how big the estimation noise is”.
The effective sample size divides the actual sample size by the amount
of autocorrelation. The effective sample size is a measure of “how
much independent information there is in autocorrelated chains”, or:
“What would be the sample size of a completely non-autocorrelated chain
that yielded the same information?” (Kruschke 2015, p182-3).
The ratio of effective number of samples and total number of samples
(provided in tidy_stan()
) ranges from 0 to 1, and should be close
to 1. The closer this ratio comes to zero means that the chains may be
inefficient, but possibly still okay.
There are no fixed rules to set the limits for the region of practical
equivalence. However, there are some conventions described by
Kruschke (2018) how to specify the limits of the rope. One
convention for linear models is to set the limits about .1 SD of the
dependent variable around zero (i.e. 0 +/- .1 * sd(y)
), where
.1 stands for half of a small effect size. Another, more conservative
convention to set the ROPE limits is a range of half a standard
deviation around zero (see Norman et al. 2003), which indicates
a clinical relevant effect (i.e. 0 +/- .25 * sd(y)
or even
0 +/- .5 * sd(y)
)
equi_test()
computes the 95%-HDI for x
and checks if a
model predictor's HDI lies completely outside, completely inside or
partially inside the ROPE. If the HDI is completely outside the ROPE,
the "null hypothesis" for this parameter is "rejected". If the ROPE
completely covers the HDI, i.e. all most credible values of a parameter
are inside the region of practical equivalence, the null hypothesis
is accepted. Else, it's undecided whether to accept or reject the
null hypothesis. In short, desirable results are low proportions inside
the ROPE (the closer to zero the better) and the H0 should be rejected.
If neither the rope
nor eff_size
argument are specified,
the effect size (the size of an negligible effect) will be set to 0.1
and the ROPE is 0 +/- .1 * sd(y)
for linear models. This is the
suggested way to specify the ROPE limits according to Kruschke (2018).
For models with binary outcome, there is no direct way to specify the
effect size that defines the ROPE limits. Two examples from Kruschke
suggest that a negligible change is about .05 on the logit-scale.
In these cases, it is recommended to specify the rope
argument,
however, if not specified, the ROPE limits are caluclated as suggested
by Kruschke: The effect size is the probability of "success" for the
outcome, divided by pi
. For all other models,
0 +/- .1 * sd(intercept)
is used to determine the ROPE limits.
If eff_size
is specified, but rope
is not, then
the same formulas apply, except that .1
is replaced by the
value in eff_size
. If rope
is specified, eff_size
will be ignored. See also section ROPE in 'Details'.
The advantage of Bayesian testing for practical equivalence over classical frequentist null hypothesis significance testing is that discrete decisions are avoided, “because such decisions encourage people to ignore the magnitude of the parameter value and its uncertainty” (Kruschke (2018)).
mediation()
returns a data frame with information on the
direct effect (mean value of posterior samples from treatment
of the outcome model), mediator effect (mean value of posterior
samples from mediator
of the outcome model), indirect effect
(mean value of the multiplication of the posterior samples from
mediator
of the outcome model and the posterior samples from
treatment
of the mediation model) and the total effect (mean
value of sums of posterior samples used for the direct and indirect
effect). The proportion mediated is the indirect effect divided
by the total effect.
For all values, the 90% HDIs are calculated by default. Use prob
to calculate a different interval.
The arguments treatment
and mediator
do not necessarily
need to be specified. If missing, mediation()
tries to find the
treatment and mediator variable automatically. If this does not work,
specify these variables.
Kruschke JK. Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan. 2nd edition. Academic Press, 2015
Kruschke JK. Rejecting or Accepting Parameter Values in Bayesian Estimation. Advances in Methods and Practices in Psychological Science. 2018; 10.1177/2515245918771304
Norman GR, Sloan JA, Wyrwich KW. Interpretation of Changes in Health-related Quality of Life: The Remarkable Universality of Half a Standard Deviation. Medical Care. 2003;41: 582-592. 10.1097/01.MLR.0000062554.74615.4C
# NOT RUN {
if (require("rstanarm")) {
fit <- stan_glm(mpg ~ wt + am, data = mtcars, chains = 1)
hdi(fit)
# return multiple intervals
hdi(fit, prob = c(.5, .7, .9))
# fit logistic regression model
fit <- stan_glm(
vs ~ wt + am,
data = mtcars,
family = binomial("logit"),
chains = 1
)
# compute hdi, transform on "odds ratio scale"
hdi(fit, trans = exp)
# compute rope, on scale of linear predictor. finds proportion
# of posterior distribution values between -1 and 1.
rope(fit, rope = c(-1, 1))
# compute rope, boundaries as "odds ratios". finds proportion of
# posterior distribution values, which - after being exponentiated -
# are between .8 and 1.25 (about -.22 and .22 on linear scale)
rope(fit, rope = c(.8, 1.25), trans = exp)
# Test for Practical Equivalence
equi_test(fit)
equi_test(fit, out = "plot")
}
# }
# NOT RUN {
# }
Run the code above in your browser using DataLab