metrop: Metropolis Algorithm

Description

Markov chain Monte Carlo for continuous random vector using a Metropolis algorithm.

Usage

metrop(obj, initial, nbatch, blen = 1, nspac = 1, scale = 1, outfun,
    debug = FALSE, ...)
# S3 method for function
metrop(obj, initial, nbatch, blen = 1, nspac = 1,
    scale = 1, outfun, debug = FALSE, ...)
# S3 method for metropolis
metrop(obj, initial, nbatch, blen = 1, nspac = 1,
    scale = 1, outfun, debug = FALSE, ...)

Value

an object of class "mcmc", subclass "metropolis", which is a list containing at least the following components:

accept: fraction of Metropolis proposals accepted.
batch: nbatch by p matrix, the batch means, where p is the dimension of the result of outfun if outfun is a function, otherwise the dimension of state[outfun] if that makes sense, and the dimension of state when outfun is missing.
accept.batch: a vector of length nbatch, the batch means of the acceptances.
initial: value of argument initial.
final: final state of Markov chain.
initial.seed: value of .Random.seed before the run.
final.seed: value of .Random.seed after the run.
time: running time of Markov chain from system.time().
lud: the function used to calculate log unnormalized density, either obj or obj$lud from a previous run.
nbatch: the argument nbatch or obj$nbatch.
blen: the argument blen or obj$blen.
nspac: the argument nspac or obj$nspac.
outfun: the argument outfun or obj$outfun.

Description of additional output when debug = TRUE can be found in the vignette debug (../doc/debug.pdf).

Arguments

obj

Either an R function or an object of class "metropolis" from a previous invocation of this function.

If a function, it evaluates the log unnormalized probability density of the desired equilibrium distribution of the Markov chain. Its first argument is the state vector of the Markov chain. Other arguments arbitrary and taken from the ... arguments of this function. It should return -Inf for points of the state space having probability zero under the desired equilibrium distribution. See also Details and Warning.

If an object of class "metropolis", any missing arguments (including the log unnormalized density function) are taken from this object. Also initial is ignored and the initial state of the Markov chain is the final state from the run recorded in obj.

initial

a real vector, the initial state of the Markov chain. Must be feasible, see Details. Ignored if obj is of class "metropolis".

nbatch

the number of batches.

blen

the length of batches.

nspac

the spacing of iterations that contribute to batches.

scale

controls the proposal step size. If scalar or vector, the proposal is x + scale * z where x is the current state and z is a standard normal random vector. If matrix, the proposal is x + scale %*% z.

outfun

controls the output. If a function, then the batch means of outfun(state, ...) are returned. If a numeric or logical vector, then the batch means of state[outfun] (if this makes sense). If missing, the the batch means of state are returned.

debug

if TRUE extra output useful for testing.

...

additional arguments for obj or outfun.

Warning

If outfun is missing or not a function, then the log unnormalized density can be defined without a ... argument and that works fine. One can define it starting ludfun <- function(state) and that works or ludfun <- function(state, foo, bar), where foo and bar are supplied as additional arguments to metrop.

If outfun is a function, then both it and the log unnormalized density function can be defined without ... arguments if they have exactly the same arguments list and that works fine. Otherwise it doesn't work. Define these functions by


ludfun <- function(state, foo)
outfun <- function(state, bar)

and you get an error about unused arguments. Instead define these functions by


ludfun <- function(state, foo, \ldots)
outfun <- function(state, bar, \ldots)

and supply foo and bar as additional arguments to metrop, and that works fine.

In short, the log unnormalized density function and outfun need to have ... in their arguments list to be safe. Sometimes it works when ... is left out and sometimes it doesn't.

Of course, one can avoid this whole issue by always defining the log unnormalized density function and outfun to have only one argument state and use global variables (objects in the R global environment) to specify any other information these functions need to use. That too follows the R way. But some people consider that bad programming practice.

A third option is to define either or both of these functions using a function factory. This is demonstrated in the vignette for this package named demo, which is shown by vignette("demo", "mcmc").

Philosophy of MCMC

This function follows the philosophy of MCMC explained the introductory chapter of the Handbook of Markov Chain Monte Carlo (Geyer, 2011).

This function automatically does batch means in order to reduce the size of output and to enable easy calculation of Monte Carlo standard errors (MCSE), which measure error due to the Monte Carlo sampling (not error due to statistical sampling --- MCSE gets smaller when you run the computer longer, but statistical sampling variability only gets smaller when you get a larger data set). All of this is explained in the package vignette vignette("demo", "mcmc") and in Section 1.10 of Geyer (2011).

This function does not apparently do “burn-in” because this concept does not actually help with MCMC (Geyer, 2011, Section 1.11.4) but the re-entrant property of this function does allow one to do “burn-in” if one wants. Assuming ludfun, start.value, scale have been already defined and are, respectively, an R function coding the log unnormalized density of the target distribution, a valid state of the Markov chain, and a useful scale factor,


out <- metrop(ludfun, start.value, nbatch = 1, blen = 1e5, scale = scale)
out <- metrop(out, nbatch = 100, blen = 1000)

throws away a run of 100 thousand iterations before doing another run of 100 thousand iterations that is actually useful for analysis, for example,


apply(out$batch, 2, mean)
apply(out$batch, 2, sd) / sqrt(out$nbatch)

give estimates of posterior means and their MCSE assuming the batch length (here 1000) was long enough to contain almost all of the significant autocorrelation (see Geyer, 2011, Section 1.10, for more on MCSE). The re-entrant property of this function (the second run starts where the first one stops) assures that this is really “burn-in”.

The re-entrant property allows one to do very long runs without having to do them in one invocation of this function.


out2 <- metrop(out)
out3 <- metrop(out2)
batch <- rbind(out$batch, out2$batch, out3$batch)

produces a result as if the first run had been three times as long.

Tuning

The scale argument must be adjusted so that the acceptance rate is not too low or too high to get reasonable performance. The rule of thumb is that the acceptance rate should be about 25%. But this recommendation (Gelman, et al., 1996) is justified by analysis of a toy problem (simulating a spherical multivariate normal distribution) for which MCMC is unnecessary. There is no reason to believe this is optimal for all problems (if it were optimal, a stronger theorem could be proved). Nevertheless, it is clear that at very low acceptance rates the chain makes little progress (because in most iterations it does not move) and that at very high acceptance rates the chain also makes little progress (because unless the log unnormalized density is nearly constant, very high acceptance rates can only be achieved by very small values of scale so the steps the chain takes are also very small).

Even in the Gelman, et al. (1996) result, the optimal rate for spherical multivariate normal depends on dimension. It is 44% for $d = 1$ and 23% for $d = \infty$. Geyer and Thompson (1995) have an example, admittedly for simulated tempering (see temper) rather than random-walk Metropolis, in which no acceptance rate less than 70% produces an ergodic Markov chain. Thus 25% is merely a rule of thumb. We only know we don't want too high or too low. Probably 1% or 99% is very inefficient.

Details

Runs a “random-walk” Metropolis algorithm, terminology introduced by Tierney (1994), with multivariate normal proposal producing a Markov chain with equilibrium distribution having a specified unnormalized density. Distribution must be continuous. Support of the distribution is the support of the density specified by argument obj. The initial state must satisfy obj(state, ...) > -Inf. Description of a complete MCMC analysis (Bayesian logistic regression) using this function can be found in the vignette vignette("demo", "mcmc").

Suppose the function coded by the log unnormalized function (either obj or obj$lud) is actually a log unnormalized density, that is, if $w$ denotes that function, then $e^w$ integrates to some value strictly between zero and infinity. Then the metrop function always simulates a reversible, Harris ergodic Markov chain having the equilibrium distribution with this log unnormalized density. The chain is not guaranteed to be geometrically ergodic. In fact it cannot be geometrically ergodic if the tails of the log unnormalized density are sufficiently heavy. The morph.metrop function deals with this situation.

References

Gelman, A., Roberts, G. O., and Gilks, W. R. (1996) Efficient Metropolis jumping rules. In Bayesian Statistics 5: Proceedings of the Fifth Valencia International Meeting. Edited by J. M. Bernardo, J. O. Berger, A. P. Dawid, and A. F. M. Smith. Oxford University Press, Oxford, pp. 599--607.

Geyer, C. J. (2011) Introduction to MCMC. In Handbook of Markov Chain Monte Carlo. Edited by S. P. Brooks, A. E. Gelman, G. L. Jones, and X. L. Meng. Chapman & Hall/CRC, Boca Raton, FL, pp. 3--48.

Geyer, C. J. and Thompson, E. A. (1995) Annealing Markov chain Monte Carlo with applications to ancestral inference. Journal of the American Statistical Association 90 909--920.

Tierney, L. (1994) Markov chains for exploring posterior distributions (with discussion). Annals of Statistics 22 1701--1762.

Examples

Run this code

h <- function(x) if (all(x >= 0) && sum(x) <= 1) return(1) else return(-Inf)
out <- metrop(h, rep(0, 5), 1000)
out$accept
# acceptance rate too low
out <- metrop(out, scale = 0.1)
out$accept
t.test(out$accept.batch)$conf.int
# acceptance rate o. k. (about 25 percent)
plot(out$batch[ , 1])
# but run length too short (few excursions from end to end of range)
out <- metrop(out, nbatch = 1e4)
out$accept
plot(out$batch[ , 1])
hist(out$batch[ , 1])
acf(out$batch[ , 1], lag.max = 250)
# looks like batch length of 250 is perhaps OK
out <- metrop(out, blen = 250, nbatch = 100)
apply(out$batch, 2, mean) # Monte Carlo estimates of means
apply(out$batch, 2, sd) / sqrt(out$nbatch) # Monte Carlo standard errors
t.test(out$accept.batch)$conf.int
acf(out$batch[ , 1]) # appears that blen is long enough

Run the code above in your browser using DataLab