For mcmc(), if verbose = TRUE, the progress bar shows the number of
iterations so far and the expected time to complete the phase of model
fitting (warmup or sampling). Occasionally, a proposed set of parameters
can cause numerical instability (i.e. the log density or its gradient is
NA, Inf or -Inf), usually because the log joint density is so low that it
can't be represented as a floating point number. When this happens, the
progress bar will also display the proportion of proposals so far that
were 'bad' (numerically unstable) and therefore rejected. Some numerical
instability during the warmup phase is normal, but 'bad' samples during
the sampling phase can lead to bias in your posterior sample. If you have
only a few bad samples (<10%), you can usually resolve this with a longer
warmup period or by manually defining starting values to move the sampler
into a more reasonable part of the parameter space. If you have more bad
samples than that, your model may be misspecified. You can often diagnose
this by using calculate() to evaluate the values of greta arrays, given
fixed values of the model parameters, and checking that the results are
what you expect.
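For example, a quick check before re-running mcmc() might look like the
following sketch, in which the parameter names (mu, sigma), the greta
array of interest (mean_y) and the model object (m) are all assumed
placeholders:

    # evaluate a greta array at fixed, plausible parameter values
    calculate(mean_y, values = list(mu = 0, sigma = 1))

    # or nudge the sampler into a sensible region with explicit
    # initial values, one initials object per chain
    inits <- replicate(4, initials(mu = 0, sigma = 0.1), simplify = FALSE)
    draws <- mcmc(m, chains = 4, initial_values = inits)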
greta runs multiple chains simultaneously with a single sampler,
vectorising all operations across the chains. For example, a scalar
addition in your model is computed as an elementwise vector addition
(with vectors of length chains), a vector addition is computed as a
matrix addition, and so on. TensorFlow is able to parallelise these
operations, and this approach reduces computational overheads, so this is
the most efficient way of computing on multiple chains.
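As a conceptual illustration (plain R, not greta internals): with 4
chains, a 'scalar' parameter is carried as a length-4 vector, one element
per chain, so a scalar addition becomes an elementwise vector addition:

    chains <- 4
    a <- rnorm(chains)  # one value of the 'scalar' parameter per chain
    b <- rnorm(chains)
    a + b               # computed for all chains in a single operation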
Multiple mcmc samplers (each of which can simultaneously run multiple
chains) can also be run in parallel by setting the execution plan with
the future package. Only plan(multisession) futures or plan(cluster)
futures that don't use fork clusters are allowed, since forked processes
conflict with TensorFlow's parallelism. Explicitly parallelising chains
on a local machine with plan(multisession) will probably be slower than
running multiple chains simultaneously in a single sampler (with
plan(sequential), the default) because of the overhead required to start
new sessions. However, plan(cluster) can be used to run chains on a
cluster of machines on a local or remote network. See future::cluster()
for details, and the future.batchtools package to set up plans on
clusters with job schedulers.
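A minimal sketch of this, assuming a pre-built greta model m:

    library(future)

    # run separate samplers in parallel R sessions
    plan(multisession)
    draws <- mcmc(m, chains = 4)

    # revert to the default: one sampler running all chains simultaneously
    plan(sequential)
    draws <- mcmc(m, chains = 4)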
If n_cores = NULL and mcmc samplers are being run sequentially, each
sampler will be allowed to use all CPU cores (possibly to compute
multiple chains sequentially). If samplers are being run in parallel with
the future package, n_cores will be set so that
n_cores * future::nbrOfWorkers() is less than the number of CPU cores.
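To override this, n_cores can be set explicitly, e.g. (with m an assumed
model):

    # cap each sampler at two CPU cores
    draws <- mcmc(m, chains = 4, n_cores = 2)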
After carrying out mcmc on all of the model parameters, mcmc() calculates
the values of (i.e. traces) the parameters of interest for each of these
samples, similarly to calculate(). Multiple posterior samples can be
traced simultaneously, though this can require large amounts of memory
for large models. As in calculate(), the argument trace_batch_size can be
modified to trade off speed against memory usage.
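For example (with m an assumed model), lowering trace_batch_size reduces
peak memory use at some cost in speed:

    draws <- mcmc(m, trace_batch_size = 10)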
If the sampler is aborted before finishing (and future parallelism isn't
being used), the samples collected so far can be retrieved with
stashed_samples(). Only samples from the sampling phase will be returned.
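For example, after interrupting a run of mcmc() on an assumed model m:

    draws <- mcmc(m)            # interrupted part-way through sampling
    draws <- stashed_samples()  # recover the sampling-phase draws so far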
Samples returned by mcmc() and stashed_samples() can be added to with
extra_samples(). This continues the chain from the last value of the
previous chain and uses the same sampler and model as was used to
generate the previous samples. It is not possible to change the sampler
or extend the warmup period.
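For example, with draws an assumed result of mcmc():

    # continue the same chains, sampler and model for another 1000 samples
    more_draws <- extra_samples(draws, n_samples = 1000)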
Because opt() acts on a list of greta arrays with possibly varying
dimension, the par and hessian objects returned by opt() are named lists,
rather than a vector (par) and a matrix (hessian), as returned by
stats::optim(). Because greta arrays may not be vectors, the Hessians may
not be matrices, but could be higher-dimensional arrays. To return a
Hessian matrix covering multiple model parameters, you can construct your
model so that all of those parameters are in a vector, then split the
vector up to define the model. The parameter vector can then be passed to
model(). See the example below.
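A minimal sketch of this pattern, using assumed toy data:

    library(greta)

    x <- matrix(rnorm(20), ncol = 2)
    y <- rnorm(10)

    coefs <- variable(dim = 3)   # all parameters in a single vector ...
    intercept <- coefs[1]        # ... then split up to define the model
    slopes <- coefs[2:3]
    distribution(y) <- normal(intercept + x %*% slopes, 1)

    m <- model(coefs)
    o <- opt(m, hessian = TRUE)
    o$par$coefs      # estimates for all three coefficients
    o$hessian$coefs  # a single Hessian covering all three coefficients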