This function performs Bayesian estimation for a geostatistical binomial logistic model.
binomial.logistic.Bayes(
formula,
units.m,
coords,
data,
ID.coords = NULL,
control.prior,
control.mcmc,
kappa,
low.rank = FALSE,
knots = NULL,
messages = TRUE,
mesh = NULL,
SPDE = FALSE
)
formula: an object of class formula (or one that can be coerced to that class): a symbolic description of the model to be fitted.
units.m: an object of class formula indicating the binomial denominators.
coords: an object of class formula indicating the geographic coordinates.
data: a data frame containing the variables in the model.
ID.coords: vector of ID values for the unique set of spatial coordinates obtained from create.ID.coords. These must be provided if, for example, spatial random effects are defined at household level but some of the covariates are at individual level. Warning: the household coordinates must all be distinct; otherwise, see jitterDupCoords. Default is NULL.
control.prior: output from control.prior.
control.mcmc: output from control.mcmc.Bayes.
kappa: value for the shape parameter of the Matern covariance function.
low.rank: logical; if low.rank=TRUE a low-rank approximation is used. Default is low.rank=FALSE.
knots: if low.rank=TRUE, knots is a matrix of spatial knots used in the low-rank approximation. Default is knots=NULL.
messages: logical; if messages=TRUE then status messages are printed on the screen (or output device) while the function is running. Default is messages=TRUE.
mesh: an object obtained as a result of a call to the function inla.mesh.2d.
SPDE: logical; if SPDE=TRUE the SPDE approximation for the Gaussian spatial model is used. Default is SPDE=FALSE.
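As a point of reference, here is a minimal sketch of how the arguments map onto a data frame; the data frame df, its column names (positive, examined, x, y) and the objects cp and mcmc (outputs of control.prior and control.mcmc.Bayes, discussed in the Details) are assumptions for illustration only.

# Hypothetical data frame `df` with counts `positive` out of `examined`
# at locations (`x`, `y`); `cp` and `mcmc` are assumed to exist already.
fit <- binomial.logistic.Bayes(formula = positive ~ 1,
                               units.m = ~ examined,
                               coords = ~ x + y,
                               data = df,
                               control.prior = cp,
                               control.mcmc = mcmc,
                               kappa = 0.5)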
An object of class "Bayes.PrevMap". The function summary.Bayes.PrevMap is used to print a summary of the fitted model. The object is a list with the following components:
estimate: matrix of the posterior samples of the model parameters.
S: matrix of the posterior samples for each component of the random effect.
const.sigma2: vector of the values of the multiplicative factor used to adjust the values of sigma2 in the low-rank approximation.
y: binomial observations.
units.m: binomial denominators.
D: matrix of covariates.
coords: matrix of the observed sampling locations.
kappa: shape parameter of the Matern function.
ID.coords: set of ID values defined through the argument ID.coords.
knots: matrix of spatial knots used in the low-rank approximation.
h1: vector of values taken by the tuning parameter h.theta1 at each iteration.
h2: vector of values taken by the tuning parameter h.theta2 at each iteration.
h3: vector of values taken by the tuning parameter h.theta3 at each iteration.
acc.beta.S: empirical acceptance rate for the regression coefficients and random effects (only if SPDE=TRUE).
mesh: the mesh used in the SPDE approximation.
call: the matched call.
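The components above can be inspected directly; a brief sketch, assuming fit is an object returned by binomial.logistic.Bayes as in the call shown earlier:

summary(fit)                               # summary via summary.Bayes.PrevMap
beta.means <- apply(fit$estimate, 2, mean) # posterior means of the model parameters
dim(fit$S)                                 # posterior samples x random-effect components
plot(fit$h1, type = "l")                   # trace of the tuning parameter h.theta1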
This function performs Bayesian estimation for the parameters of the geostatistical binomial logistic model. Conditionally on a zero-mean stationary Gaussian process \(S(x)\) and mutually independent zero-mean Gaussian variables \(Z\) with variance tau2, the linear predictor assumes the form
$$\log(p/(1-p)) = d'\beta + S(x) + Z,$$
where \(d\) is a vector of covariates with associated regression coefficients \(\beta\). The Gaussian process \(S(x)\) has isotropic Matern covariance function (see matern) with variance sigma2, scale parameter phi and shape parameter kappa.
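For concreteness, a one-line illustration of the logistic link above (the numbers are made up): a linear predictor value of -0.9 corresponds to a prevalence of about 0.29.

eta <- 0.3 - 1.2            # made-up value of d'beta + S(x) + Z
p <- 1 / (1 + exp(-eta))    # inverse logit: p is approximately 0.289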
Priors definition. Priors can be defined through the function control.prior. The hierarchical structure of the priors is the following. Let \(\theta\) be the vector of the covariance parameters c(sigma2,phi,tau2); then each component of \(\theta\) has an independent prior freely defined by the user. Uniform and log-normal priors are also available in control.prior as default choices for each of the covariance parameters. To remove the nugget effect \(Z\), no prior should be defined for tau2. Conditionally on sigma2, the vector of regression coefficients beta has a multivariate Gaussian prior with mean beta.mean and covariance matrix sigma2*beta.covar, while in the low-rank approximation the covariance matrix is simply beta.covar.
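A hedged sketch of a prior specification along these lines; the uniform support limits and the prior for beta below are arbitrary choices for illustration, not recommendations from this page.

# Uniform priors on sigma2, phi and tau2, and a Gaussian prior on beta with
# mean 0 and covariance sigma2 * 100 (all numeric values are assumptions).
cp <- control.prior(beta.mean = 0, beta.covar = 100,
                    uniform.sigma2 = c(0, 10),
                    uniform.phi = c(0, 100),
                    uniform.nugget = c(0, 10))
# Omitting any prior for the nugget (e.g. dropping uniform.nugget) removes Z.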
Updating the covariance parameters with a Metropolis-Hastings algorithm. In the MCMC algorithm implemented in binomial.logistic.Bayes, the transformed parameters $$(\theta_{1}, \theta_{2}, \theta_{3})=(\log(\sigma^2)/2,\log(\sigma^2/\phi^{2 \kappa}), \log(\tau^2))$$ are independently updated using a Metropolis-Hastings algorithm. At the \(i\)-th iteration, a new value is proposed for each parameter from a univariate Gaussian distribution with variance \(h_{i}^2\), which is tuned using the following adaptive scheme: $$h_{i} = h_{i-1}+c_{1}i^{-c_{2}}(\alpha_{i}-0.45),$$ where \(\alpha_{i}\) is the acceptance rate at the \(i\)-th iteration, 0.45 is the optimal acceptance rate for a univariate Gaussian distribution, whilst \(c_{1} > 0\) and \(0 < c_{2} < 1\) are pre-defined constants. The starting values \(h_{1}\) for each of the parameters \(\theta_{1}\), \(\theta_{2}\) and \(\theta_{3}\) can be set using the function control.mcmc.Bayes through the arguments h.theta1, h.theta2 and h.theta3. To define values for \(c_{1}\) and \(c_{2}\), see the documentation of control.mcmc.Bayes.
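The adaptive scheme itself is simple to reproduce; below is a plain-R sketch (not PrevMap code) of how the proposal standard deviation drifts towards the 0.45 target acceptance rate, with arbitrary values for c1, c2 and the starting value.

c1 <- 0.01; c2 <- 0.6            # example constants, c1 > 0 and 0 < c2 < 1
h <- 0.05                        # starting value, cf. h.theta1 in control.mcmc.Bayes
for (i in 1:1000) {
  alpha <- runif(1)              # placeholder for the acceptance probability at iteration i
  h <- max(h + c1 * i^(-c2) * (alpha - 0.45), 1e-4)  # positivity floor added only for this toy sketch
}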
Hamiltonian Monte Carlo. The MCMC algorithm in binomial.logistic.Bayes uses a Hamiltonian Monte Carlo (HMC) procedure to update the random effect \(T=d'\beta + S(x) + Z\); see Neal (2011) for an introduction to HMC. HMC makes use of a position vector, say \(t\), representing the random effect \(T\), and a momentum vector, say \(q\), of the same length as the position vector, say \(n\). Hamiltonian dynamics have a physical interpretation in which the state of the system is described by the position of a puck and its momentum (its mass times its velocity). The Hamiltonian function is then defined as a function of \(t\) and \(q\), having the form \(H(t,q) = -\log\{f(t | y, \beta, \theta)\} + q'q/2\), where \(f(t | y, \beta, \theta)\) is the conditional distribution of \(T\) given the data \(y\), the regression parameters \(\beta\) and the covariance parameters \(\theta\). The system of Hamiltonian equations then defines the evolution of the system in time, which can be used to implement an algorithm for simulation from the posterior distribution of \(T\). In order to implement the Hamiltonian dynamics on a computer, the Hamiltonian equations must be discretised; the leapfrog method is used for this purpose and requires two tuning parameters, the stepsize \(\epsilon\) and the number of steps \(L\), which correspond to epsilon.S.lim and L.S.lim in the control.mcmc.Bayes function. It is advisable to let \(\epsilon\) and \(L\) take different random values at each iteration of the HMC algorithm so as to account for the different variances amongst the components of the posterior of \(T\). This can be done in control.mcmc.Bayes by defining epsilon.S.lim and L.S.lim as vectors of two elements giving the lower and upper limits of the uniform distributions used to generate the values of \(\epsilon\) and \(L\), respectively.
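A hedged sketch of the corresponding control.mcmc.Bayes call; all numeric values, the starting values and the number of sampling locations n.obs are illustrative assumptions.

n.obs <- 100                     # assumed number of sampling locations
mcmc <- control.mcmc.Bayes(n.sim = 6000, burnin = 1000, thin = 5,
                           h.theta1 = 0.05, h.theta2 = 0.05, h.theta3 = 0.05,
                           epsilon.S.lim = c(0.01, 0.03),   # limits for the stepsize epsilon
                           L.S.lim = c(10, 30),             # limits for the number of steps L
                           start.beta = 0, start.sigma2 = 1, start.phi = 10,
                           start.nugget = 1, start.S = rep(0, n.obs))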
Using a two-level model to include household-level and individual-level information. When analysing data from household surveys, some of the available information might be at household level (e.g. material of the house, temperature) and some at individual level (e.g. age, gender). In this case, the Gaussian spatial process \(S(x)\) and the nugget effect \(Z\) are defined at household level in order to account for extra-binomial variation between and within households, respectively.
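A brief sketch of this two-level setup, reusing the hypothetical cp and mcmc objects from the earlier sketches; the data frame df, its individual-level covariates age and gender, and the household coordinate columns hh.x and hh.y are assumptions.

ID <- create.ID.coords(data = df, coords = ~ hh.x + hh.y)  # one ID per distinct household location
fit2 <- binomial.logistic.Bayes(formula = y ~ age + gender,
                                units.m = ~ n.trials,       # assumed column, equal to 1 for binary individual outcomes
                                coords = ~ hh.x + hh.y, data = df,
                                ID.coords = ID,
                                control.prior = cp, control.mcmc = mcmc,
                                kappa = 0.5)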
Low-rank approximation. In the case of very large spatial data-sets, a low-rank approximation of the Gaussian spatial process \(S(x)\) might be computationally beneficial. Let \((x_{1},\dots,x_{n})\) and \((t_{1},\dots,t_{m})\) denote the set of sampling locations and a grid of spatial knots covering the area of interest, respectively. Then \(S(x)\) is approximated as \(\sum_{i=1}^m K(\|x-t_{i}\|; \phi, \kappa)U_{i}\), where \(U_{i}\) are zero-mean mutually independent Gaussian variables with variance sigma2 and \(K(.;\phi, \kappa)\) is the isotropic Matern kernel (see matern.kernel). Since the resulting approximation is only approximately a stationary process, sigma2 may take very different values from the actual variance of the Gaussian process being approximated. The function adjust.sigma2 can then be used to (approximately) explore the range of values for sigma2. For example, if the variance of the Gaussian process is 0.5, then an approximate value for sigma2 is 0.5/const.sigma2, where const.sigma2 is the value obtained with adjust.sigma2.
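A hedged sketch of a low-rank fit and of the adjustment described above, reusing the hypothetical df, cp and mcmc objects from the earlier sketches; the knot grid, the values of phi and kappa, and the target variance 0.5 are assumptions, and the knots.dist argument is assumed here to accept the inter-knot distances computed with dist().

knots <- as.matrix(expand.grid(seq(0, 10, length = 15),   # regular 15 x 15 grid of knots
                               seq(0, 10, length = 15)))  # covering an assumed study region
fit.lr <- binomial.logistic.Bayes(formula = positive ~ 1, units.m = ~ examined,
                                  coords = ~ x + y, data = df,
                                  control.prior = cp, control.mcmc = mcmc,
                                  kappa = 0.5, low.rank = TRUE, knots = knots)
const.sigma2 <- adjust.sigma2(knots.dist = dist(knots), phi = 1, kappa = 0.5)
0.5 / const.sigma2   # approximate value of sigma2 matching a process variance of 0.5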
Diggle, P.J., Giorgi, E. (2019). Model-based Geostatistics for Global Public Health. CRC/Chapman & Hall.
Giorgi, E., Diggle, P.J. (2017). PrevMap: an R package for prevalence mapping. Journal of Statistical Software. 78(8), 1-29. doi: 10.18637/jss.v078.i08
Neal, R. M. (2011). MCMC using Hamiltonian Dynamics. In: Handbook of Markov Chain Monte Carlo (Chapter 5), edited by Steve Brooks, Andrew Gelman, Galin Jones and Xiao-Li Meng. Chapman & Hall/CRC Press.
Higdon, D. (1998). A process-convolution approach to modeling temperatures in the North Atlantic Ocean. Environmental and Ecological Statistics 5, 173-190.
control.mcmc.Bayes, control.prior, summary.Bayes.PrevMap, matern, matern.kernel, create.ID.coords.