Learn R Programming

PrevMap (version 1.5.4)

poisson.log.MCML: Monte Carlo Maximum Likelihood estimation for the Poisson model

Description

This function performs Monte Carlo maximum likelihood (MCML) estimation for the geostatistical Poisson model with log link function.

Usage

poisson.log.MCML(
  formula,
  units.m = NULL,
  coords,
  data,
  ID.coords = NULL,
  par0,
  control.mcmc,
  kappa,
  fixed.rel.nugget = NULL,
  start.cov.pars,
  method = "BFGS",
  low.rank = FALSE,
  knots = NULL,
  messages = TRUE,
  plot.correlogram = TRUE
)

Arguments

formula

an object of class formula (or one that can be coerced to that class): a symbolic description of the model to be fitted.

units.m

an object of class formula indicating the multiplicative offset for the mean of the Poisson model; if not specified this is then internally set as 1.

coords

an object of class formula indicating the geographic coordinates.

data

a data frame containing the variables in the model.

ID.coords

vector of ID values for the unique set of spatial coordinates obtained from create.ID.coords. These must be provided if, for example, spatial random effects are defined at location-level but some of the covariates are at individual level. Warning: the spatial coordinates must all be distinct otherwise see jitterDupCoords. Default is NULL.

par0

parameters of the importance sampling distribution: these should be given in the following order c(beta,sigma2,phi,tau2), where beta are the regression coefficients, sigma2 is the variance of the Gaussian process, phi is the scale parameter of the spatial correlation and tau2 is the variance of the nugget effect (if included in the model).

control.mcmc

output from control.mcmc.MCML.

kappa

fixed value for the shape parameter of the Matern covariance function.

fixed.rel.nugget

fixed value for the relative variance of the nugget effect; fixed.rel.nugget=NULL if this should be included in the estimation. Default is fixed.rel.nugget=NULL.

start.cov.pars

a vector of length two with elements corresponding to the starting values of phi and the relative variance of the nugget effect nu2, respectively, that are used in the optimization algorithm. If nu2 is fixed through fixed.rel.nugget, then start.cov.pars represents the starting value for phi only.

method

method of optimization. If method="BFGS" then the maxBFGS function is used; otherwise method="nlminb" to use the nlminb function. Default is method="BFGS".

low.rank

logical; if low.rank=TRUE a low-rank approximation of the Gaussian spatial process is used when fitting the model. Default is low.rank=FALSE.

knots

if low.rank=TRUE, knots is a matrix of spatial knots that are used in the low-rank approximation. Default is knots=NULL.

messages

logical; if messages=TRUE then status messages are printed on the screen (or output device) while the function is running. Default is messages=TRUE.

plot.correlogram

logical; if plot.correlogram=TRUE the autocorrelation plot of the samples of the random effect is displayed after completion of conditional simulation. Default is plot.correlogram=TRUE.

Value

An object of class "PrevMap". The function summary.PrevMap is used to print a summary of the fitted model. The object is a list with the following components:

estimate: estimates of the model parameters; use the function coef.PrevMap to obtain estimates of covariance parameters on the original scale.

covariance: covariance matrix of the MCML estimates.

log.lik: maximum value of the log-likelihood.

y: observations.

units.m: offset.

D: matrix of covariates.

ID.coords: set of ID values defined through the argument ID.coords.

coords: matrix of the observed sampling locations.

method: method of optimization used.

kappa: fixed value of the shape parameter of the Matern function.

knots: matrix of the spatial knots used in the low-rank approximation.

const.sigma2: adjustment factor for sigma2 in the low-rank approximation.

h: vector of the values of the tuning parameter at each iteration of the Langevin-Hastings MCMC algorithm; see Laplace.sampling, or Laplace.sampling.lr if a low-rank approximation is used.

samples: matrix of the random effects samples from the importance sampling distribution used to approximate the likelihood function.

fixed.rel.nugget: fixed value for the relative variance of the nugget effect.

call: the matched call.

Details

This function performs parameter estimation for a geostatistical Poisson model with log link function. Conditionally on a zero-mean stationary Gaussian process \(S(x)\) and mutually independent zero-mean Gaussian variables \(Z\) with variance tau2, the observations y are generated from a Poisson distribution with mean \(m\lambda\), where \(m\) is an offset defined through the argument units.m. A canonical log link is used, thus the linear predictor assumes the form $$\log(\lambda) = d'\beta + S(x) + Z,$$ where \(d\) is a vector of covariates with associated regression coefficients \(\beta\). The Gaussian process \(S(x)\) has isotropic Matern covariance function (see matern) with variance sigma2, scale parameter phi and shape parameter kappa. In the poisson.log.MCML function, the shape parameter is treated as fixed. The relative variance of the nugget effect, nu2=tau2/sigma2, can also be fixed through the argument fixed.rel.nugget; if fixed.rel.nugget=NULL, then the relative variance of the nugget effect is also included in the estimation.

Monte Carlo Maximum likelihood. The Monte Carlo maximum likelihood method uses conditional simulation from the distribution of the random effect \(T(x) = d(x)'\beta+S(x)+Z\) given the data y, in order to approximate the high-dimensiional intractable integral given by the likelihood function. The resulting approximation of the likelihood is then maximized by a numerical optimization algorithm which uses analytic epression for computation of the gradient vector and Hessian matrix. The functions used for numerical optimization are maxBFGS (method="BFGS"), from the maxLik package, and nlminb (method="nlminb").

Low-rank approximation. In the case of very large spatial data-sets, a low-rank approximation of the Gaussian spatial process \(S(x)\) might be computationally beneficial. Let \((x_{1},\dots,x_{m})\) and \((t_{1},\dots,t_{m})\) denote the set of sampling locations and a grid of spatial knots covering the area of interest, respectively. Then \(S(x)\) is approximated as \(\sum_{i=1}^m K(\|x-t_{i}\|; \phi, \kappa)U_{i}\), where \(U_{i}\) are zero-mean mutually independent Gaussian variables with variance sigma2 and \(K(.;\phi, \kappa)\) is the isotropic Matern kernel (see matern.kernel). Since the resulting approximation is no longer a stationary process (but only approximately), the parameter sigma2 is then multiplied by a factor constant.sigma2 so as to obtain a value that is closer to the actual variance of \(S(x)\).

References

Diggle, P.J., Giorgi, E. (2019). Model-based Geostatistics for Global Public Health. CRC/Chapman & Hall.

Giorgi, E., Diggle, P.J. (2017). PrevMap: an R package for prevalence mapping. Journal of Statistical Software. 78(8), 1-29. doi: 10.18637/jss.v078.i08

Christensen, O. F. (2004). Monte carlo maximum likelihood in model-based geostatistics. Journal of Computational and Graphical Statistics 13, 702-718.

Higdon, D. (1998). A process-convolution approach to modeling temperatures in the North Atlantic Ocean. Environmental and Ecological Statistics 5, 173-190.

See Also

Laplace.sampling, Laplace.sampling.lr, summary.PrevMap, coef.PrevMap, matern, matern.kernel, control.mcmc.MCML.