Learn R Programming

⚠️There's a newer version (0.15.2) of this package.Take me there.

bayestestR

Become a Bayesian master you will

bayestestR is a lightweight package providing utilities to describe posterior distributions and Bayesian models.

Installation

Run the following:

install.packages("devtools")
devtools::install_github("easystats/bayestestR")
library("bayestestR")

Documentation

Click on the buttons above to access the package documentation and the easystats blog, and check-out these vignettes:

Tutorials

Articles

Features

describe_posterior() is the master function with which you can compute all of the indices cited below at once.

describe_posterior(rnorm(1000))

Point-estimates

MAP Estimate

map_estimate() find the Highest Maximum A Posteriori (MAP) estimate of a posterior, i.e., the most probable value.

map_estimate(rnorm(1000, 1, 1))

Uncertainty

Highest Density Interval (HDI) - The Credible Interval (CI)

hdi() computes the Highest Density Interval (HDI) of a posterior distribution, i.e., the interval which contains all points within the interval have a higher probability density than points outside the interval. The HDI can be used in the context of Bayesian posterior characterisation as Credible Interval (CI).

Unlike equal-tailed intervals (see ci) that typically exclude 2.5% from each tail of the distribution, the HDI is not equal-tailed and therefore always includes the mode(s) of posterior distributions.

By default, hdi() returns the 89% intervals (ci = 0.89), deemed to be more stable than, for instance, 95% intervals (Kruschke, 2014). An effective sample size of at least 10.000 is recommended if 95% intervals should be computed (Kruschke 2014, p. 183ff). Moreover, 89 is the highest prime number that does not exceed the already unstable 95% threshold (McElreath, 2015).

hdi(rnorm(1000), ci = .89)

Null-Hypothesis Significance Testing (NHST)

ROPE

rope() computes the proportion (in percentage) of the HDI (default to the 89% HDI) of a posterior distribution that lies within a region of practical equivalence.

Statistically, the probability of a posterior distribution of being different from 0 does not make much sense (the probability of it being different from a single point being infinite). Therefore, the idea underlining ROPE is to let the user define an area around the null value enclosing values that are equivalent to the null value for practical purposes (Kruschke 2010, 2011, 2014).

Kruschke (2018) suggests that such null value could be set, by default, to the -0.1 to 0.1 range of a standardized parameter (negligible effect size according to Cohen, 1988). This could be generalized: For instance, for linear models, the ROPE could be set as 0 +/- .1 * sd(y). This ROPE range can be automatically computed for models using the rope_range function.

Kruschke (2010, 2011, 2014) suggests using the proportion of the 95% (or 90%, considered more stable) HDI that falls within the ROPE as an index for “null-hypothesis” testing (as understood under the Bayesian framework, see equivalence_test).

rope(rnorm(1000, 1, 1), range = c(-0.1, 0.1))

Equivalence test

equivalence_test() a Test for Practical Equivalence based on the “HDI+ROPE decision rule” (Kruschke, 2018) to check whether parameter values should be accepted or rejected against an explicitly formulated “null hypothesis” (i.e., a ROPE).

equivalence_test(rnorm(1000, 1, 1), range = c(-0.1, 0.1))

Probability of Direction (pd)

p_direction() computes the Probability of Direction (pd, also known as the Maximum Probability of Effect - MPE). It varies between 50% and 100% and can be interpreted as the probability (expressed in percentage) that a parameter (described by its posterior distribution) is strictly positive or negative (whichever is the most probable). It is mathematically defined as the proportion of the posterior distribution that is of the median’s sign. Although differently expressed, this index is fairly similar (i.e., is strongly correlated) to the frequentist p-value.

Relationship with the p-value: In most cases, it seems that the pd corresponds to the frequentist one-sided p-value through the formula p-value = (1-pd/100) and to the two-sided p-value (the most commonly reported) through the formula p-value = 2*(1-pd/100). Thus, a pd of 95%, 97.5% 99.5% and 99.95% corresponds approximately to a two-sided p-value of respectively .1, .05, .01 and .001. See the reporting guidelines.

p_direction(rnorm(1000, mean = 1, sd = 1))

Bayes Factor

bayesfactor_savagedickey() computes the ratio between the density of a single value (typically the null) in two distributions, typically the posterior vs. the prior distributions. This method is used to examine if the hypothesis value is less or more likely given the observed data.

prior <- rnorm(1000, mean = 0, sd = 1)
posterior <- rnorm(1000, mean = 1, sd = 0.7)

bayesfactor_savagedickey(posterior, prior, direction = "two-sided", hypothesis = 0)

MAP-based p-value

p_map() computes a Bayesian equivalent of the p-value, related to the odds that a parameter (described by its posterior distribution) has against the null hypothesis (h0) using Mills’ (2014, 2017) Objective Bayesian Hypothesis Testing framework. It is mathematically based on the density at the Maximum A Priori (MAP) and corresponds to the density value at 0 divided by the density of the MAP estimate.

p_map(rnorm(1000, 1, 1))

Utilities

Find ROPE’s appropriate range

rope_range(): This function attempts at automatically finding suitable “default” values for the Region Of Practical Equivalence (ROPE). Kruschke (2018) suggests that such null value could be set, by default, to a range from -0.1 to 0.1 of a standardized parameter (negligible effect size according to Cohen, 1988), which can be generalised for linear models to -0.1 * sd(y), 0.1 * sd(y). For logistic models, the parameters expressed in log odds ratio can be converted to standardized difference through the formula sqrt(3)/pi, resulting in a range of -0.05 to -0.05.

rope_range(model)

Density Estimation

estimate_density(): This function is a wrapper over different methods of density estimation. By default, it uses the base R density with by default uses a different smoothing bandwidth ("SJ") from the legacy default implemented the base R density function ("nrd0"). However, Deng & Wickham suggest that method = "KernSmooth" is the fastest and the most accurate.

Perfect Distributions

distribution(): Generate a sample of size n with near-perfect distributions.

distribution(n = 10)

Probability of a Value

density_at(): Compute the density of a given point of a distribution.

density_at(rnorm(1000, 1, 1), 1)

Credits

You can cite the package as following:

Copy Link

Version

Install

install.packages('bayestestR')

Monthly Downloads

85,027

Version

0.2.0

License

GPL-3

Issues

Pull Requests

Stars

Forks

Maintainer

Dominique Makowski

Last Published

May 29th, 2019

Functions in bayestestR (0.2.0)

equivalence_test

Test for Practical Equivalence
estimate_density

Density Estimation
bayesfactor_inclusion

Inclusion Bayes Factors for effects across Bayesian models
p_map

Bayesian p-value based on the density at the Maximum A Priori (MAP)
p_rope

ROPE-based p-value
as.data.frame.density

Coerce to a Data Frame
hdi

Highest Density Interval (HDI)
point_estimate

Point-estimates of posterior distributions
rope

Region of Practical Equivalence (ROPE)
bayesfactor_models

Bayes Factors (BF) for model comparison
map_estimate

Maximum A Posteriori (MAP) Estimate
describe_prior

Describe Priors
rope_range

Find Default Equivalence (ROPE) Region Bounds
diagnostic_posterior

Posteriors Sampling Diagnostic
density_at

Probability of a Given Point
update.bayesfactor_models

Update bayesfactor_models
describe_posterior

Describe Posterior Distributions
.select_nums

select numerics columns
p_direction

Probability of Direction (pd)
effective_sample

Effective Sample Size (ESS)
mcse

Monte-Carlo Standard Error (MCSE)
as.numeric.p_direction

Convert to Numeric
bayesfactor

Bayes Factors (BF)
ci

Confidence/Credible Interval
bayesfactor_savagedickey

Savage-Dickey density ratio Bayes Factor (BF)
area_under_curve

Area under the Curve (AUC)
distribution

Empirical Distributions
.flatten_list

Flatten a list