lgcp (version 1.8)

lgcpPredictAggregateSpatialPlusPars: lgcpPredictAggregateSpatialPlusPars function

Description

A function to deliver fully Bayesian inference for the aggregated spatial log-Gaussian Cox process.

Usage

lgcpPredictAggregateSpatialPlusPars(
  formula,
  spdf,
  Zmat = NULL,
  overlayInZmat = FALSE,
  model.priors,
  model.inits = lgcpInits(),
  spatial.covmodel,
  cellwidth = NULL,
  poisson.offset = NULL,
  mcmc.control,
  output.control = setoutput(),
  gradtrunc = Inf,
  ext = 2,
  Nfreq = 101,
  inclusion = "touching",
  overlapping = FALSE,
  pixwts = NULL
)

Arguments

formula

a formula object of the form X ~ var1 + var2 etc. The name of the dependent variable must be "X". Only accepts 'simple' formulae, such as the example given.

spdf

a SpatialPolygonsDataFrame object with variable "X", the event counts per region.

Zmat

design matrix Z (see below) constructed with getZmat

overlayInZmat

if the covariate information in Zmat also comes from spdf, set to TRUE to avoid replicating the overlay operations. Default is FALSE.

model.priors

model priors, set using lgcpPrior

model.inits

model initial values. The default is NULL, in which case lgcp will use the prior mean to initialise eta, and beta will be initialised from an overdispersed glm fit to the data. Otherwise use lgcpInits to specify.

spatial.covmodel

choice of spatial covariance function. See ?CovFunction

cellwidth

the width of computational cells

poisson.offset

A SpatialAtRisk object defining lambda (see below)

mcmc.control

MCMC parameters, see ?mcmcpars

output.control

output choice, see ?setoutput

gradtrunc

truncation for the gradient vector, equal to the H parameter of Moller et al. (1998), p. 473. Default is Inf, which means no gradient truncation; this seems to work in most settings.

ext

integer multiple by which grid should be extended, default is 2. Generally this will not need to be altered, but if the spatial correlation decays slowly, increasing 'ext' may be necessary.

Nfreq

the sampling frequency for the cell counts. Default is every 101 iterations.

inclusion

criterion for including cells in the observation window. Either 'touching' or 'centroid'. The former, the default, includes all cells that touch the observation window; the latter includes all cells whose centroids are inside the observation window.
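The difference between the two criteria can be sketched in base R. This is only an illustration and does not call lgcp: the disc-shaped window, the grid and the radius are assumptions chosen to keep the geometry simple, whereas lgcp itself works with polygonal windows.

```r
# Illustrative grid of square cells (assumed values, not lgcp defaults)
cw <- 1                                   # cell width
cens <- seq(-2.5, 2.5, by = cw)           # cell centroids along each axis
cells <- expand.grid(x = cens, y = cens)

# Assumed observation window: a disc of radius r centred at the origin
r <- 2.2

# 'centroid': the cell centroid lies inside the window
centroid <- cells$x^2 + cells$y^2 <= r^2

# 'touching': the point of the cell nearest to the disc centre lies inside
# the window, so any overlap between cell and window counts
dx <- pmax(abs(cells$x) - cw / 2, 0)
dy <- pmax(abs(cells$y) - cw / 2, 0)
touching <- dx^2 + dy^2 <= r^2

# Every 'centroid' cell is also a 'touching' cell, but not the reverse
print(table(centroid = centroid, touching = touching))
```

In this sketch 'touching' selects a superset of the cells selected by 'centroid', which is why it is the more inclusive choice near the window boundary.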

overlapping

logical: does spdf contain overlapping polygons? Default is FALSE. If set to TRUE, spdf can contain a variable named 'sintens' that gives the sampling intensity for each polygon; the default is to assume that cases are evenly split between overlapping regions.

pixwts

optional matrix of dimension (NM) x (number of regions in spdf), where M and N are the number of cells in the x and y directions (the number of cells on the output grid, not on the Fourier grid). The ith row of this matrix gives the probabilities that a case in the ith grid cell (in the same order as expand.grid(mcens,ncens)) belongs to each of the regions in spdf. Including this object overrides 'sintens' in the overlapping option above.
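As a rough sketch of the expected layout, the following base R code builds a matrix with one row per output grid cell, ordered as expand.grid(mcens, ncens), and one column per region. The grid size, the two regions "A" and "B" and the 70/30 split in their overlap are all hypothetical values chosen for illustration.

```r
# Output grid: M cells in x, N cells in y (illustrative values)
M <- 4; N <- 3
mcens <- seq(0.5, by = 1, length.out = M)   # x centroids: 0.5, 1.5, 2.5, 3.5
ncens <- seq(0.5, by = 1, length.out = N)   # y centroids: 0.5, 1.5, 2.5
cells <- expand.grid(mcens, ncens)          # row order: x varies fastest

# Two hypothetical overlapping regions, described only by their x-extent
in_A <- cells$Var1 < 3                      # region A covers x in [0, 3)
in_B <- cells$Var1 > 1                      # region B covers x in (1, 4]

# Row i: probability that a case in cell i belongs to region A, region B
pixwts <- cbind(A = as.numeric(in_A), B = as.numeric(in_B))

# Assumed 70/30 split of cases in cells covered by both regions
both <- in_A & in_B
pixwts[both, ] <- rep(c(0.7, 0.3), each = sum(both))

print(dim(pixwts))       # (N * M) rows, one column per region
print(head(pixwts, M))   # the first row of cells (y = 0.5)
```

In this sketch every cell is covered by at least one region, so each row sums to one.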

Value

an object of class lgcpPredictAggregateSpatialPlusParameters

Details

See the vignette "Bayesian_lgcp" for examples of this code in use.

In this case, we OBSERVE case counts in the regions of a SpatialPolygonsDataFrame; the counts are stored as a variable, X. The model for the UNOBSERVED data, X(s), is as follows:

X(s) ~ Poisson[R(s)]

R(s) = C_A lambda(s) exp[Z(s)beta+Y(s)]

Here X(s) is the number of events in the cell of the computational grid containing s, R(s) is the Poisson rate, C_A is the cell area, lambda(s) is a known offset, Z(s) is a vector of measured covariates and Y(s) is the latent Gaussian process on the computational grid. The other parameters in the model are beta, the covariate effects; and eta=[log(sigma),log(phi)], the parameters of the process Y on an appropriately transformed (in this case log) scale.
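To make the notation concrete, the model can be simulated on a small grid in base R. This is a minimal sketch and not the lgcp implementation: the exponential covariance function, the mean shift of -sigma^2/2, all numeric values and the two-region split are assumptions made for illustration.

```r
set.seed(1)

# Computational grid: M x N square cells of width cw (illustrative values)
M <- 8; N <- 8; cw <- 0.5
mcens <- (seq_len(M) - 0.5) * cw            # cell centroids, x direction
ncens <- (seq_len(N) - 0.5) * cw            # cell centroids, y direction
cells <- expand.grid(x = mcens, y = ncens)
C_A <- cw^2                                 # cell area

# eta = [log(sigma), log(phi)]: parameters of Y on the log scale
eta <- c(log(1), log(0.8))
sigma <- exp(eta[1]); phi <- exp(eta[2])

# Latent Gaussian process Y with an (assumed) exponential covariance;
# the shift -sigma^2/2 is an assumed convention giving E[exp(Y)] = 1
D <- as.matrix(dist(cells))
Sigma <- sigma^2 * exp(-D / phi)
Y <- drop(t(chol(Sigma)) %*% rnorm(nrow(cells))) - sigma^2 / 2

# Z(s): intercept plus one covariate; beta: covariate effects
Z <- cbind(1, cells$x)
beta <- c(0.5, 0.3)
lambda <- rep(10, nrow(cells))              # known offset lambda(s)

# R(s) = C_A lambda(s) exp[Z(s) beta + Y(s)] and X(s) ~ Poisson[R(s)]
R <- C_A * lambda * exp(drop(Z %*% beta) + Y)
X_cell <- rpois(nrow(cells), R)             # UNOBSERVED cell counts

# OBSERVED data: counts aggregated to two hypothetical regions
region <- ifelse(cells$x < 2, "west", "east")
X_region <- tapply(X_cell, region, sum)
print(X_region)
```

Only X_region would be available to the analyst; the MCMC algorithm treats the cell counts X_cell as missing data to be sampled alongside beta, eta and Y.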

We recommend the user takes the following steps before running this method:

  1. Compute approximate values of the parameters, eta, of the process Y using the function minimum.contrast. These approximate values are used for two main reasons: (1) to help inform the size of the computational grid, since we will need to use a cell width that enables us to capture the dependence properties of Y and (2) to help inform the proposal kernel for the MCMC algorithm.

  2. Choose an appropriate grid on which to perform inference using the function chooseCellwidth; this will partly be determined by the results of the first stage and partly by the computational resource available to perform inference.

  3. Using the function getpolyol, construct the computational grid and polygon overlays, as required. As this can be an expensive step, we recommend that the user saves this object after it has been constructed and, in later work with the same data, reloads it rather than re-computing it (provided the computational grid has not changed).

  4. Decide which covariates are to play a part in the analysis and use the lgcp function getZmat to interpolate these onto the computational grid. Note that, having saved the results from the previous step, this is a relatively quick operation, and allows the user to quickly construct different design matrices, Z, for different candidate models for the data.

  5. If required, set up the population offset using SpatialAtRisk functions (see the vignette "Bayesian_lgcp"); specify the priors using lgcpPrior; and if desired, the initial values for the MCMC, using the function lgcpInits.

  6. Run the MCMC algorithm and save the output to disk. We recommend dumping information to disk using the dump2dir function in the output.control argument because it offers much greater flexibility in terms of MCMC diagnosis and post-processing.

  7. Perform post-processing analyses, including MCMC diagnostic checks, and produce summaries of the posterior expectations required for presentation (see the vignette "Bayesian_lgcp" for further details). Functions of use in this step include traceplots, autocorr, parautocorr, ltar, parsummary, priorpost, postcov, textsummary, expectation, exceedProbs and lgcp:::expectation.lgcpPredict.
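The advice in step 3 to save the overlay object and reload it later can be sketched with base R's saveRDS and readRDS. The function expensive_overlay below is a hypothetical stand-in for the real overlay computation, and cached_overlay and the file name are likewise illustrative names, not part of lgcp.

```r
# Counter used only to show how often the expensive step actually runs
n_calls <- 0

# Hypothetical stand-in for an expensive grid and overlay computation
expensive_overlay <- function(cellwidth) {
  n_calls <<- n_calls + 1
  cens <- seq(cellwidth / 2, 1, by = cellwidth)
  list(cellwidth = cellwidth, grid = expand.grid(x = cens, y = cens))
}

# Reload the saved object when the grid is unchanged, otherwise recompute
cached_overlay <- function(cellwidth, path) {
  if (file.exists(path)) {
    obj <- readRDS(path)
    if (isTRUE(all.equal(obj$cellwidth, cellwidth))) return(obj)
  }
  obj <- expensive_overlay(cellwidth)
  saveRDS(obj, path)
  obj
}

path <- file.path(tempdir(), "overlay_cache.rds")
unlink(path)                                # start from a clean cache

first  <- cached_overlay(0.5, path)         # computed and saved
second <- cached_overlay(0.5, path)         # reloaded from disk
third  <- cached_overlay(0.25, path)        # cell width changed: recomputed
print(n_calls)
```

Checking the stored cell width before reusing the file mirrors the proviso that the saved object is only valid while the computational grid has not changed.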

References

  1. Benjamin M. Taylor, Tilman M. Davies, Barry S. Rowlingson, Peter J. Diggle. Bayesian Inference and Data Augmentation Schemes for Spatial, Spatiotemporal and Multivariate Log-Gaussian Cox Processes in R. Submitted.

  2. Benjamin M. Taylor, Tilman M. Davies, Barry S. Rowlingson, Peter J. Diggle (2013). lgcp: An R Package for Inference with Spatial and Spatio-Temporal Log-Gaussian Cox Processes. Journal of Statistical Software, 52(4), 1-40. URL http://www.jstatsoft.org/v52/i04/

  3. Brix A, Diggle PJ (2001). Spatiotemporal Prediction for log-Gaussian Cox processes. Journal of the Royal Statistical Society, Series B, 63(4), 823-841.

  4. Diggle P, Rowlingson B, Su T (2005). Point Process Methodology for On-line Spatio-temporal Disease Surveillance. Environmetrics, 16(5), 423-434.

  5. Wood ATA, Chan G (1994). Simulation of Stationary Gaussian Processes in [0,1]^d. Journal of Computational and Graphical Statistics, 3(4), 409-432.

  6. Moller J, Syversveen AR, Waagepetersen RP (1998). Log Gaussian Cox Processes. Scandinavian Journal of Statistics, 25(3), 451-482.

See Also

chooseCellwidth, getpolyol, guessinterp, getZmat, addTemporalCovariates, lgcpPrior, lgcpInits, CovFunction, lgcpPredictSpatialPlusPars, lgcpPredictSpatioTemporalPlusPars, lgcpPredictMultitypeSpatialPlusPars, ltar, autocorr, parautocorr, traceplots, parsummary, textsummary, priorpost, postcov, exceedProbs, betavals, etavals