Markov chain Monte Carlo for continuous random vector using a Metropolis algorithm.
metrop(obj, initial, nbatch, blen = 1, nspac = 1, scale = 1, outfun,
debug = FALSE, ...)
# S3 method for function
metrop(obj, initial, nbatch, blen = 1, nspac = 1,
scale = 1, outfun, debug = FALSE, ...)
# S3 method for metropolis
metrop(obj, initial, nbatch, blen = 1, nspac = 1,
scale = 1, outfun, debug = FALSE, ...)
an object of class "mcmc"
, subclass "metropolis"
,
which is a list containing at least the following components:
fraction of Metropolis proposals accepted.
nbatch
by p
matrix, the batch means, where
p
is the dimension of the result of outfun
if outfun
is a function, otherwise the dimension of
state[outfun]
if that makes sense, and the dimension
of state
when outfun
is missing.
a vector of length nbatch
, the batch means
of the acceptances.
value of argument initial
.
final state of Markov chain.
value of .Random.seed
before the run.
value of .Random.seed
after the run.
running time of Markov chain from system.time()
.
the function used to calculate log unnormalized density,
either obj
or obj$lud
from a previous run.
the argument nbatch
or obj$nbatch
.
the argument blen
or obj$blen
.
the argument nspac
or obj$nspac
.
the argument outfun
or obj$outfun
.
Description of additional output when debug = TRUE
can be
found in the vignette debug
(../doc/debug.pdf).
Either an R function or an object of class "metropolis"
from a previous invocation of this function.
If a function, it evaluates the log unnormalized probability
density of the desired equilibrium distribution of the Markov chain.
Its first argument is the state vector of the Markov chain. Other
arguments arbitrary and taken from the ...
arguments of this
function.
It should return -Inf
for points of the state space having
probability zero under the desired equilibrium distribution.
See also Details and Warning.
If an object of class "metropolis"
, any missing arguments
(including the log unnormalized density function) are taken from
this object. Also initial
is ignored and the initial state
of the Markov chain is the final state from the run recorded in
obj
.
a real vector, the initial state of the Markov chain.
Must be feasible, see Details. Ignored if obj
is of
class "metropolis"
.
the number of batches.
the length of batches.
the spacing of iterations that contribute to batches.
controls the proposal step size. If scalar or
vector, the proposal is x + scale * z
where x
is
the current state and z
is a standard normal random vector.
If matrix, the proposal is x + scale %*% z
.
controls the output. If a function, then the batch means
of outfun(state, ...)
are returned. If a numeric
or logical vector, then the batch means of state[outfun]
(if this makes sense). If missing, the the batch means
of state
are returned.
if TRUE
extra output useful for testing.
additional arguments for obj
or outfun
.
If outfun
is missing or not a function, then the log unnormalized
density can be defined without a ... argument and that works fine.
One can define it starting ludfun <- function(state)
and that works
or ludfun <- function(state, foo, bar)
, where foo
and bar
are supplied as additional arguments to metrop
.
If outfun
is a function, then both it and the log unnormalized
density function can be defined without ... arguments if they
have exactly the same arguments list and that works fine. Otherwise it
doesn't work. Define these functions by
ludfun <- function(state, foo)
outfun <- function(state, bar)
and you get an error about unused arguments. Instead define these functions by
ludfun <- function(state, foo, \ldots)
outfun <- function(state, bar, \ldots)
and supply foo
and bar
as additional arguments to metrop
,
and that works fine.
In short, the log unnormalized density function and outfun
need
to have ... in their arguments list to be safe. Sometimes it works
when ... is left out and sometimes it doesn't.
Of course, one can avoid this whole issue by always defining the log
unnormalized density function and outfun
to have only one argument
state
and use global variables (objects in the R global environment) to
specify any other information these functions need to use. That too
follows the R way. But some people consider that bad programming practice.
A third option is to define either or both of these functions using a function
factory. This is demonstrated in the vignette for this package named
demo
, which is shown by vignette("demo", "mcmc")
.
This function follows the philosophy of MCMC explained the introductory chapter of the Handbook of Markov Chain Monte Carlo (Geyer, 2011).
This function automatically does batch means in order to reduce
the size of output and to enable easy calculation of Monte Carlo standard
errors (MCSE), which measure error due to the Monte Carlo sampling (not
error due to statistical sampling --- MCSE gets smaller when you run the
computer longer, but statistical sampling variability only gets smaller
when you get a larger data set). All of this is explained in the package
vignette vignette("demo", "mcmc")
and in Section 1.10 of Geyer (2011).
This function does not apparently
do “burn-in” because this concept does not actually help with MCMC
(Geyer, 2011, Section 1.11.4) but the re-entrant property of this
function does allow one to do “burn-in” if one wants.
Assuming ludfun
, start.value
, scale
have been already defined
and are, respectively, an R function coding the log unnormalized density
of the target distribution, a valid state of the Markov chain,
and a useful scale factor,
out <- metrop(ludfun, start.value, nbatch = 1, blen = 1e5, scale = scale)
out <- metrop(out, nbatch = 100, blen = 1000)
throws away a run of 100 thousand iterations before doing another run of 100 thousand iterations that is actually useful for analysis, for example,
apply(out$batch, 2, mean)
apply(out$batch, 2, sd) / sqrt(out$nbatch)
give estimates of posterior means and their MCSE assuming the batch length (here 1000) was long enough to contain almost all of the significant autocorrelation (see Geyer, 2011, Section 1.10, for more on MCSE). The re-entrant property of this function (the second run starts where the first one stops) assures that this is really “burn-in”.
The re-entrant property allows one to do very long runs without having to do them in one invocation of this function.
out2 <- metrop(out)
out3 <- metrop(out2)
batch <- rbind(out$batch, out2$batch, out3$batch)
produces a result as if the first run had been three times as long.
The scale
argument must be adjusted so that the acceptance rate
is not too low or too high to get reasonable performance. The rule of
thumb is that the acceptance rate should be about 25%.
But this recommendation (Gelman, et al., 1996) is justified by analysis
of a toy problem (simulating a spherical multivariate normal distribution)
for which MCMC is unnecessary. There is no reason to believe this is optimal
for all problems (if it were optimal, a stronger theorem could be proved).
Nevertheless, it is clear that at very low acceptance rates the chain makes
little progress (because in most iterations it does not move) and that at
very high acceptance rates the chain also makes little progress (because
unless the log unnormalized density is nearly constant, very high acceptance
rates can only be achieved by very small values of scale
so the
steps the chain takes are also very small).
Even in the Gelman, et al. (1996) result, the optimal rate for spherical
multivariate normal depends on dimension. It is 44% for \(d = 1\)
and 23% for \(d = \infty\).
Geyer and Thompson (1995) have an example, admittedly for
simulated tempering (see temper
) rather than random-walk
Metropolis, in which no acceptance rate less than 70% produces an ergodic
Markov chain. Thus 25% is merely a rule of thumb. We only know we don't
want too high or too low. Probably 1% or 99% is very inefficient.
Runs a “random-walk” Metropolis algorithm, terminology introduced
by Tierney (1994), with multivariate normal proposal
producing a Markov chain with equilibrium distribution having a specified
unnormalized density. Distribution must be continuous. Support of the
distribution is the support of the density specified by argument obj
.
The initial state must satisfy obj(state, ...) > -Inf
.
Description of a complete MCMC analysis (Bayesian logistic regression)
using this function can be found in the vignette
vignette("demo", "mcmc")
.
Suppose the function coded by the log unnormalized function (either
obj
or obj$lud
) is actually a log unnormalized density,
that is, if \(w\) denotes that function, then \(e^w\) integrates
to some value strictly between zero and infinity. Then the metrop
function always simulates a reversible, Harris ergodic Markov chain having
the equilibrium distribution with this log unnormalized density.
The chain is not guaranteed to be geometrically ergodic. In fact it cannot
be geometrically ergodic if the tails of the log unnormalized density are
sufficiently heavy. The morph.metrop
function deals with this
situation.
Gelman, A., Roberts, G. O., and Gilks, W. R. (1996) Efficient Metropolis jumping rules. In Bayesian Statistics 5: Proceedings of the Fifth Valencia International Meeting. Edited by J. M. Bernardo, J. O. Berger, A. P. Dawid, and A. F. M. Smith. Oxford University Press, Oxford, pp. 599--607.
Geyer, C. J. (2011) Introduction to MCMC. In Handbook of Markov Chain Monte Carlo. Edited by S. P. Brooks, A. E. Gelman, G. L. Jones, and X. L. Meng. Chapman & Hall/CRC, Boca Raton, FL, pp. 3--48.
Geyer, C. J. and Thompson, E. A. (1995) Annealing Markov chain Monte Carlo with applications to ancestral inference. Journal of the American Statistical Association 90 909--920.
Tierney, L. (1994) Markov chains for exploring posterior distributions (with discussion). Annals of Statistics 22 1701--1762.
morph.metrop
and temper
h <- function(x) if (all(x >= 0) && sum(x) <= 1) return(1) else return(-Inf)
out <- metrop(h, rep(0, 5), 1000)
out$accept
# acceptance rate too low
out <- metrop(out, scale = 0.1)
out$accept
t.test(out$accept.batch)$conf.int
# acceptance rate o. k. (about 25 percent)
plot(out$batch[ , 1])
# but run length too short (few excursions from end to end of range)
out <- metrop(out, nbatch = 1e4)
out$accept
plot(out$batch[ , 1])
hist(out$batch[ , 1])
acf(out$batch[ , 1], lag.max = 250)
# looks like batch length of 250 is perhaps OK
out <- metrop(out, blen = 250, nbatch = 100)
apply(out$batch, 2, mean) # Monte Carlo estimates of means
apply(out$batch, 2, sd) / sqrt(out$nbatch) # Monte Carlo standard errors
t.test(out$accept.batch)$conf.int
acf(out$batch[ , 1]) # appears that blen is long enough
Run the code above in your browser using DataLab