BESTmcmc: Generate MCMC samples for posterior distributions

Description

This function is the core of the BEST package. It calls JAGS and passes a description of the model, priors, and data, then retrieves and returns the MCMC samples for the parameters.

Usage

BESTmcmc(y1, y2 = NULL, priors = NULL, doPriorsOnly = FALSE,
  numSavedSteps = 1e+05, thinSteps = 1, burnInSteps = 1000,
	verbose=TRUE, rnd.seed=NULL, parallel=NULL)

Arguments

a numeric vector of data values.

a vector of values for a second group, or NULL if there is only one group of observations.

priors

an optional list of values controlling the priors, see Details.

doPriorsOnly

if TRUE, BESTmcmc returns MCMC chains representing the prior distributions, not the posterior distributions for your data set.

numSavedSteps

the number of MCMC observations to be returned.

thinSteps

thinning rate. If set to n > 1, n steps of the MCMC chain are calculated for each one returned. This is useful if autocorrelation is high and you need to run long chains.

burnInSteps

number of steps to discard as burn-in at the beginning of the chain.

verbose

if FALSE, output to the R Console is suppressed. If chains are run in parallel, the output from JAGS is not displayed in the Console, even if verbose = TRUE.

rnd.seed

a positive integer (or NULL): the seed for the random number generator, used to obtain reproducible samples if required. Values generated in different versions of BEST or different versions of JAGS may differ, even with the same seed.

parallel

if NULL or TRUE and > 3 cores are available, the MCMC chains are run in parallel. (If TRUE and < 4 cores are available, a warning is given.)

Value

An object of class BEST inheriting from data.frame. If two samples are compared, the output has the following columns:

mu1, mu2

simulated observations of center for each population

sigma1, sigma2

simulated observations of scale for each population

simulated observations of normality parameter

while for a single sample, the columns are mu, sigma, nu.

The output has the following attributes:

call

the call to the function.

Rhat

the 'potential scale reduction factor'.

n.eff

sample size adjusted for autocorrelation.

data

a list with elements y1 and y2 containing the original data; y2 may be NULL.

priors

a list with the priors used, if the priors argument is not NULL.

doPriorsOnly

logical, the value of the doPriorsOnly argument.

The package provides print, plot and summary methods for BEST objects.

Details

The function uses a t-distribution to model each sample, and generates vectors of random draws from the posterior distribution of the center (\(\mu\)) and spread or scale (\(\sigma\)) of the distribution, as well as a measure of normality (\(\nu\)). The procedure uses a Bayesian MCMC process implemented in JAGS (Plummer 2003).

\(\mu\) is the population mean, except when \(\nu\) = 1 (which is the Cauchy distribution) or lower, when the mean is undefined.

\(\sigma\) is a good approximation to the standard deviation (SD) for values of \(\nu\) > 20. More exactly the SD is \(\sigma\) * sqrt(\(\nu\)/(\(\nu\) - 2)). For a normal distribution (with \(\nu = \infty\)), SD = \(\sigma\) is exact. The SD is undefined when \(\nu\) = 2 or less.

If priors = NULL, broad priors as described by Kruschke (2013) are used. For \(\mu\), Normal(mean(y), 1000 * sd(y)); for \(\sigma\), Uniform(sd(y)/1000, sd(y) * 1000); for \(\nu\), Exponential(1/29) + 1, with the constraint that nu >= 1. Here y = c(y1, y2). Note that priors = NULL is not equivalent to priors = list().

Alternatively, priors can be a list with elements specifying the priors for one or more parameters: \(\mu\) : population centers have separate normal priors, with mean muM and standard deviation muSD; if not included in the list, default values of muM = mean(y), muSD = sd(y)*5 are used; \(\sigma\) : population scales have separate gamma priors, with mode sigmaMode and standard deviation sigmaSD; defaults are sigmaMode = sd(y), sigmaSD = sd(y)*5; \(\nu\) : the normality parameter has a gamma prior with mean nuMean and standard deviation nuSD; defaults are nuMean = 30, nuSD = 30; versions before 0.4.0 constrained \(\nu\) to be >1.

If there are 2 groups of observations, muM, muSD, sigmaMode, sigmaSD may be vectors of length 2 or scalar; if scalar, the same value is used for each population.

The model is shown in the diagram below.

Derived parameters, including the differences in means or standard deviations, and effect sizes can be obtained from the results of the BESTmcmc run.

The output from BESTmcmc has class BEST, which has print, plot and summary methods. These permit the extraction and display of credible intervals and proportions of the posterior mass above or below values of interest.

References

Kruschke, J K. 2013. Bayesian estimation supersedes the t test. Journal of Experimental Psychology: General 142(2):573-603. doi: 10.1037/a0029146

For the informative priors, see Kruschke's blog post at http://doingbayesiandataanalysis.blogspot.com/2015/04/informed-priors-for-bayesian-comparison.html

For the constraint on \(\nu\), see the blog post at http://doingbayesiandataanalysis.blogspot.com/2015/12/prior-on-df-normality-parameter-in-t.html

Plummer, Martyn (2003). JAGS: A Program for Analysis of Bayesian Graphical Models Using Gibbs Sampling, Proceedings of the 3rd International Workshop on Distributed Statistical Computing (DSC 2003), March 20-22, Vienna, Austria. ISSN 1609-395X

Examples

Run this code

# NOT RUN {
## See examples in BEST-package help.
# }

Run the code above in your browser using DataLab