The largest single factor affecting the run-time of an algorithm -- whether it is with IterativeQuadrature, LaplaceApproximation, LaplacesDemon, PMC, or VariationalBayes -- is the time that it takes to evaluate the model specification. This has also been observed in Rosenthal (2007). It is highly recommended that a user of the LaplacesDemon package attempt to reduce the run-time of the model specification, usually by testing alternate forms of code for speed. This is especially true with big data, such as with the BigData function.
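For instance, a minimal sketch of such a speed test follows, assuming that Model, Initial.Values, and MyData have already been created in the usual way; repeated evaluations of the model specification are timed so that alternate codings of the same model may be compared.

    ## Time 1000 evaluations of the current form of the model specification
    system.time(for (i in 1:1000) Model(Initial.Values, MyData))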
Every function in the LaplacesDemon package is byte-compiled, which reduces run-time, thanks to the compiler package of Tierney in base R. The model specification function, however, is written by the user, and should be byte-compiled by the user. The reduction in run-time may range from mild to dramatic, depending on the model. It is highly recommended that users concerned with run-time activate the compiler package and use the cmpfun function, as per the example below.
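A minimal sketch of this, assuming a Model function has already been defined in the usual way:

    library(compiler)
    Model <- cmpfun(Model)  ## byte-compile the user-written model specification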
A model specification function that is optimized for speed and involves many records may result in a model update run-time that is considerably less than that of other popular MCMC-based software algorithms that loop through records, even when those algorithms are coded in C or other fast languages. For a comparison, see the ``Laplace's Demon Tutorial'' vignette. However, if a model specification function in the LaplacesDemon package is not fully vectorized (contains for loops and apply functions), then run-time will typically be slower than with other popular MCMC-based software algorithms.
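As a hedged illustration (not from the package), the following sketch compares a vectorized normal log-likelihood with the same calculation looped over records; the looped form is typically much slower.

    y <- rnorm(1e3); mu <- rep(0, 1e3); sigma <- 1
    ## Vectorized
    system.time(for (i in 1:100) LL <- sum(dnorm(y, mu, sigma, log=TRUE)))
    ## Looped over records
    system.time(for (i in 1:100) {
        LL <- 0
        for (j in 1:length(y)) LL <- LL + dnorm(y[j], mu[j], sigma, log=TRUE)})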
The speed of calculating the model specification function is affected by parameter constraints, such as with the interval function. Parameter constraints may add considerable variability to the speed of this calculation, and usually more variation occurs with initial values that are far from the target distributions.
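For example, here is a minimal sketch of a Model function with a positivity constraint on a scale parameter, modeled on the linear regression examples in the package documentation; the data names (J, X, y) and the prior choices are illustrative assumptions.

    Model <- function(parm, Data) {
        ## Parameters, with sigma constrained to be positive via interval()
        beta <- parm[1:Data$J]
        parm[Data$J+1] <- sigma <- interval(parm[Data$J+1], 1e-100, Inf)
        ## Log-priors (illustrative choices)
        beta.prior <- sum(dnormv(beta, 0, 1000, log=TRUE))
        sigma.prior <- dhalfcauchy(sigma, 25, log=TRUE)
        ## Log-likelihood
        mu <- as.vector(tcrossprod(Data$X, t(beta)))
        LL <- sum(dnorm(Data$y, mu, sigma, log=TRUE))
        ## Log-posterior and required output
        LP <- LL + beta.prior + sigma.prior
        list(LP=LP, Dev=-2*LL, Monitor=LP,
             yhat=rnorm(length(mu), mu, sigma), parm=parm)
    }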
Distributions in the LaplacesDemon package usually have logical checks to ensure correctness. These checks may slow the calculation of the density, for example. If the user is confident these checks are unnecessary for their model, then the user may copy the code to a new function name and comment-out the checks to improve speed.
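As a hedged sketch of this approach (not part of the package), a stripped-down half-Cauchy log-density could be written under a new name, leaving the original dhalfcauchy function intact; the user then becomes responsible for passing valid arguments.

    ## log-density of the half-Cauchy: log(2*scale) - log(pi*(x^2 + scale^2)),
    ## with no argument checks, for x >= 0 and scale > 0
    dhalfcauchy.fast <- function(x, scale=25) log(2*scale) - log(pi*{x*x + scale*scale})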
When speed is of paramount importance, a user is encouraged to experiment with writing the model specification function in another language, such as C++ with the Rcpp package, either calling drop-in replacement functions from within the Model function or re-writing the model function entirely in C++. For an introduction to including C++ in LaplacesDemon, see https://web.archive.org/web/20150227225556/http://www.bayesian-inference.com/softwarearticlescppsugar.
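A hedged sketch of a drop-in replacement, assuming the Rcpp package is installed and a C++ toolchain is available; the function below could replace a call such as sum(dnorm(y, mu, sigma, log=TRUE)) inside the Model function.

    library(Rcpp)
    cppFunction('
    double loglik_norm(NumericVector y, NumericVector mu, double sigma) {
        // Sum of normal log-densities over all records
        double LL = 0.0;
        for (int i = 0; i < y.size(); i++)
            LL += R::dnorm(y[i], mu[i], sigma, 1);
        return LL;
    }')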
When a model specification function is computationally expensive,
another approach to reduce run-time may be for the user to write
parallelized code within the model, splitting up difficult tasks and
assigning each to a separate CPU.
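For instance, here is a hedged sketch (not from the package) of splitting a record-level log-likelihood across cores with the parallel package; mclapply forks processes and is not available on Windows, and the object names (Data$y, mu, sigma) are illustrative.

    library(parallel)
    ## Split record indices into four chunks and sum the chunk log-likelihoods
    chunks <- split(1:length(Data$y), cut(1:length(Data$y), 4, labels=FALSE))
    LL <- sum(unlist(mclapply(chunks, function(idx)
        sum(dnorm(Data$y[idx], mu[idx], sigma, log=TRUE)), mc.cores=4)))

Whether this pays off depends on the cost of each chunk; for inexpensive likelihoods, the overhead of parallelization may exceed the savings.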
Another use for Model.Spec.Time is to allow the user to make an informed decision about which MCMC algorithm to select, given the speed of their model specification. For example, the Adaptive Metropolis-within-Gibbs (AMWG) of Roberts and Rosenthal (2009) is currently the adaptive MCMC algorithm of choice in many cases, but this choice is conditional on run-time. While other MCMC algorithms in LaplacesDemon evaluate the model specification function once per iteration, componentwise algorithms such as those in the MWG family evaluate it once per parameter per iteration, significantly increasing run-time per iteration in large models. The Model.Spec.Time function may forewarn the user of the associated run-time, and may lead the user to select an alternate MCMC algorithm, such as AMM, in which each element of its covariance matrix must stabilize for the chains to become stationary. AMM, for example, will require many more iterations of burn-in (for more information, see the burnin function), but with numerous iterations, allows more thinning. A general recommendation may be to use AMWG when Componentwise.Iters.per.Minute >= 1000, but this is subjective and best determined by each user for each model. A better decision may be made by comparing MCMC algorithms with the Juxtapose function for a particular model.
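A brief sketch of this use, assuming Model, Initial.Values, and MyData already exist, and that the returned object is inspected for the componentwise rate discussed above:

    ## Estimate evaluations per minute before choosing an MCMC algorithm
    MST <- Model.Spec.Time(Model, Initial.Values, MyData)
    MST$Componentwise.Iters.per.Minute  ## a rough guide: >= 1000 favors AMWG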
Following are a few common suggestions for increasing the speed of R code in the model specification function; a short timing sketch follows the list. There are often exceptions to these suggestions, so case-by-case experimentation is also suggested.
- Replace exponents with multiplied terms, such as x^2 with x*x.
- Replace mean(x) with sum(x)/length(x).
- Replace parentheses (when possible) with curly brackets, such as x*(a+b) with x*{a+b}.
- Replace tcrossprod(Data$X, t(beta)) with Data$X %*% beta when there are few predictors, and avoid tcrossprod(beta, Data$X), which is always slowest.
- Vectorize functions and eliminate apply functions and for loops. There are often specialized functions; for example, max.col(X) is faster than apply(X, 1, which.max).
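As a hedged sketch (not from the package), two of these suggestions can be timed directly:

    x <- rnorm(1e6)
    system.time(for (i in 1:200) y <- x^2)                ## exponent
    system.time(for (i in 1:200) y <- x*x)                ## multiplication
    system.time(for (i in 1:2000) m <- mean(x))           ## mean()
    system.time(for (i in 1:2000) m <- sum(x)/length(x))  ## sum()/length()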
When seeking speed, things to consider beyond the LaplacesDemon package are the Basic Linear Algebra Subprograms (BLAS) and hardware. ATLAS (Automatically Tuned Linear Algebra Software) may be worth installing (and there are several alternatives), but this discussion is beyond the scope of this documentation. Lastly, the speed at which the computer can process iterations is limited by its hardware, and more should be considered than merely the CPU (Central Processing Unit). Again, though, this is beyond the scope of this documentation.