Learn R Programming

spatstat (version 1.48-0)

ppm.ppp: Fit Point Process Model to Point Pattern Data

Description

Fits a point process model to an observed point pattern.

Usage

"ppm"(Q, trend=~1, interaction=Poisson(), ..., covariates=data, data=NULL, covfunargs = list(), subset, correction="border", rbord=reach(interaction), use.gam=FALSE, method="mpl", forcefit=FALSE, emend=project, project=FALSE, prior.mean = NULL, prior.var = NULL, nd = NULL, eps = NULL, gcontrol=list(), nsim=100, nrmh=1e5, start=NULL, control=list(nrep=nrmh), verb=TRUE, callstring=NULL)
"ppm"(Q, trend=~1, interaction=Poisson(), ..., covariates=data, data=NULL, covfunargs = list(), subset, correction="border", rbord=reach(interaction), use.gam=FALSE, method="mpl", forcefit=FALSE, emend=project, project=FALSE, prior.mean = NULL, prior.var = NULL, nd = NULL, eps = NULL, gcontrol=list(), nsim=100, nrmh=1e5, start=NULL, control=list(nrep=nrmh), verb=TRUE, callstring=NULL)

Arguments

Q
A data point pattern (of class "ppp") to which the model will be fitted, or a quadrature scheme (of class "quad") containing this pattern.
trend
An R formula object specifying the spatial trend to be fitted. The default formula, ~1, indicates the model is stationary and no trend is to be fitted.
interaction
An object of class "interact" describing the point process interaction structure, or NULL indicating that a Poisson process (stationary or nonstationary) should be fitted.
...
Ignored.
data,covariates
The values of any spatial covariates (other than the Cartesian coordinates) required by the model. Either a data frame, or a list whose entries are images, functions, windows, tessellations or single numbers. See Details.
subset
Optional. An expression (typically involving the Cartesian coordinates x and y or the names of entries in data) defining a subset of the spatial domain, to which the model-fitting should be restricted.
covfunargs
A named list containing the values of any additional arguments required by covariate functions.
correction
The name of the edge correction to be used. The default is "border" indicating the border correction. Other possibilities may include "Ripley", "isotropic", "periodic", "translate" and "none", depending on the interaction.
rbord
If correction = "border" this argument specifies the distance by which the window should be eroded for the border correction.
use.gam
Logical flag; if TRUE then computations are performed using gam instead of glm.
method
The method used to fit the model. Options are "mpl" for the method of Maximum PseudoLikelihood, "logi" for the Logistic Likelihood method, "VBlogi" for the Variational Bayes Logistic Likelihood method, and "ho" for the Huang-Ogata approximate maximum likelihood method.
forcefit
Logical flag for internal use. If forcefit=FALSE, some trivial models will be fitted by a shortcut. If forcefit=TRUE, the generic fitting method will always be used.
emend,project
(These are equivalent: project is an older name for emend.) Logical value. Setting emend=TRUE will ensure that the fitted model is always a valid point process by applying emend.ppm.
prior.mean
Optional vector of prior means for canonical parameters (for method="VBlogi"). See Details.
prior.var
Optional prior variance covariance matrix for canonical parameters (for method="VBlogi"). See Details.
nd
Optional. Integer or pair of integers. The dimension of the grid of dummy points (nd * nd or nd[1] * nd[2]) used to evaluate the integral in the pseudolikelihood. Incompatible with eps.
eps
Optional. A positive number, or a vector of two positive numbers, giving the horizontal and vertical spacing, respectively, of the grid of dummy points. Incompatible with nd.
gcontrol
Optional. List of parameters passed to glm.control (or passed to gam.control if use.gam=TRUE) controlling the model-fitting algorithm.
nsim
Number of simulated realisations to generate (for method="ho")
nrmh
Number of Metropolis-Hastings iterations for each simulated realisation (for method="ho")
start,control
Arguments passed to rmh controlling the behaviour of the Metropolis-Hastings algorithm (for method="ho")
verb
Logical flag indicating whether to print progress reports (for method="ho")
callstring
Internal use only.

Value

An object of class "ppm" describing a fitted point process model.See ppm.object for details of the format of this object and methods available for manipulating it.

Interaction parameters

Apart from the Poisson model, every point process model fitted by ppm has parameters that determine the strength and range of ‘interaction’ or dependence between points. These parameters are of two types:
Typically, regular parameters determine the ‘strength’ of the interaction, while irregular parameters determine the ‘range’ of the interaction. For example, the Strauss process has a regular parameter $gamma$ controlling the strength of interpoint inhibition, and an irregular parameter $r$ determining the range of interaction. The ppm command is only designed to estimate regular parameters of the interaction. It requires the values of any irregular parameters of the interaction to be fixed. For example, to fit a Strauss process model to the cells dataset, you could type ppm(cells, ~1, Strauss(r=0.07)). Note that the value of the irregular parameter r must be given. The result of this command will be a fitted model in which the regular parameter $gamma$ has been estimated. To determine the irregular parameters, there are several practical techniques, but no general statistical theory available. One useful technique is maximum profile pseudolikelihood, which is implemented in the command profilepl.

Error and Warning Messages

Some common error messages and warning messages are listed below, with explanations.

Warnings

The implementation of the Huang-Ogata method is experimental; several bugs were fixed in spatstat 1.19-0. See the comments above about the possible inefficiency and bias of the maximum pseudolikelihood estimator. The accuracy of the Berman-Turner approximation to the pseudolikelihood depends on the number of dummy points used in the quadrature scheme. The number of dummy points should at least equal the number of data points. The parameter values of the fitted model do not necessarily determine a valid point process. Some of the point process models are only defined when the parameter values lie in a certain subset. For example the Strauss process only exists when the interaction parameter $gamma$ is less than or equal to $1$, corresponding to a value of ppm()$theta[2] less than or equal to 0. By default (if emend=FALSE) the algorithm maximises the pseudolikelihood without constraining the parameters, and does not apply any checks for sanity after fitting the model. This is because the fitted parameter value could be useful information for data analysis. To constrain the parameters to ensure that the model is a valid point process, set emend=TRUE. See also the functions valid.ppm and emend.ppm. The trend formula should not use any variable names beginning with the prefixes .mpl or Interaction as these names are reserved for internal use. The data frame covariates should have as many rows as there are points in Q. It should not contain variables called x, y or marks as these names are reserved for the Cartesian coordinates and the marks. If the model formula involves one of the functions poly(), bs() or ns() (e.g. applied to spatial coordinates x and y), the fitted coefficients can be misleading. The resulting fit is not to the raw spatial variates (x, x^2, x*y, etc.) but to a transformation of these variates. The transformation is implemented by poly() in order to achieve better numerical stability. However the resulting coefficients are appropriate for use with the transformed variates, not with the raw variates. This affects the interpretation of the constant term in the fitted model, logbeta. Conventionally, $beta$ is the background intensity, i.e. the value taken by the conditional intensity function when all predictors (including spatial or ``trend'' predictors) are set equal to $0$. However the coefficient actually produced is the value that the log conditional intensity takes when all the predictors, including the transformed spatial predictors, are set equal to 0, which is not the same thing. Worse still, the result of predict.ppm can be completely wrong if the trend formula contains one of the functions poly(), bs() or ns(). This is a weakness of the underlying function predict.glm. If you wish to fit a polynomial trend, we offer an alternative to poly(), namely polynom(), which avoids the difficulty induced by transformations. It is completely analogous to poly except that it does not orthonormalise. The resulting coefficient estimates then have their natural interpretation and can be predicted correctly. Numerical stability may be compromised. Values of the maximised pseudolikelihood are not comparable if they have been obtained with different values of rbord.

Details

NOTE: This help page describes the old syntax of the function ppm, described in many older documents. This old syntax is still supported. However, if you are learning about ppm for the first time, we recommend you use the new syntax described in the help file for ppm. This function fits a point process model to an observed point pattern. The model may include spatial trend, interpoint interaction, and dependence on covariates.

References

Baddeley, A., Coeurjolly, J.-F., Rubak, E. and Waagepetersen, R. (2014) Logistic regression for spatial Gibbs point processes. Biometrika 101 (2) 377--392.

Baddeley, A. and Turner, R. Practical maximum pseudolikelihood for spatial point patterns. Australian and New Zealand Journal of Statistics 42 (2000) 283--322. Berman, M. and Turner, T.R. Approximating point process likelihoods with GLIM. Applied Statistics 41 (1992) 31--38. Besag, J. Statistical analysis of non-lattice data. The Statistician 24 (1975) 179-195. Diggle, P.J., Fiksel, T., Grabarnik, P., Ogata, Y., Stoyan, D. and Tanemura, M. On parameter estimation for pairwise interaction processes. International Statistical Review 62 (1994) 99-117.

Huang, F. and Ogata, Y. Improvements of the maximum pseudo-likelihood estimators in various spatial statistical models. Journal of Computational and Graphical Statistics 8 (1999) 510-530. Jensen, J.L. and Moeller, M. Pseudolikelihood for exponential family models of spatial point processes. Annals of Applied Probability 1 (1991) 445--461. Jensen, J.L. and Kuensch, H.R. On asymptotic normality of pseudo likelihood estimates for pairwise interaction processes, Annals of the Institute of Statistical Mathematics 46 (1994) 475-486.

Rajala T. (2014) A note on Bayesian logistic regression for spatial exponential family Gibbs point processes, Preprint on ArXiv.org. http://arxiv.org/abs/1411.0539

See Also

ppm.object for details of how to print, plot and manipulate a fitted model.

ppp and quadscheme for constructing data. Interactions: \GibbsInteractionsList.

See profilepl for advice on fitting nuisance parameters in the interaction, and ippm for irregular parameters in the trend.

See valid.ppm and emend.ppm for ensuring the fitted model is a valid point process.

Examples

Run this code
 # fit the stationary Poisson process
 # to point pattern 'nztrees'

 ppm(nztrees)
 ppm(nztrees ~ 1)

 ## Not run: 
#  Q <- quadscheme(nztrees) 
#  ppm(Q) 
#  # equivalent.
#  ## End(Not run)

 ## Not run: 
#   ppm(nztrees, nd=128)
#  ## End(Not run)
 

fit1 <- ppm(nztrees, ~ x)
 # fit the nonstationary Poisson process 
 # with intensity function lambda(x,y) = exp(a + bx)
 # where x,y are the Cartesian coordinates
 # and a,b are parameters to be estimated

fit1
coef(fit1)
coef(summary(fit1))

## Not run: 
#  ppm(nztrees, ~ polynom(x,2))
# ## End(Not run)


 # fit the nonstationary Poisson process 
 # with intensity function lambda(x,y) = exp(a + bx + cx^2)

 ## Not run: 
#  library(splines)
#  ppm(nztrees, ~ bs(x,df=3))
#  ## End(Not run)
 #       WARNING: do not use predict.ppm() on this result
 # Fits the nonstationary Poisson process 
 # with intensity function lambda(x,y) = exp(B(x))
 # where B is a B-spline with df = 3

## Not run: 
#  ppm(nztrees, ~1, Strauss(r=10), rbord=10)
# ## End(Not run)

 # Fit the stationary Strauss process with interaction range r=10
 # using the border method with margin rbord=10

## Not run: 
#  ppm(nztrees, ~ x, Strauss(13), correction="periodic")
# ## End(Not run)

 # Fit the nonstationary Strauss process with interaction range r=13
 # and exp(first order potential) =  activity = beta(x,y) = exp(a+bx)
 # using the periodic correction.


# Compare Maximum Pseudolikelihood, Huang-Ogata and VB fits:
## Not run: ppm(swedishpines, ~1, Strauss(9))

## Not run: ppm(swedishpines, ~1, Strauss(9), method="ho")


ppm(swedishpines, ~1, Strauss(9), method="VBlogi")

 # COVARIATES
 #
 X <- rpoispp(42)
 weirdfunction <- function(x,y){ 10 * x^2 + 5 * sin(10 * y) }
 #
 # (a) covariate values as function
 ppm(X, ~ y + Z, covariates=list(Z=weirdfunction))
 #
 # (b) covariate values in pixel image
 Zimage <- as.im(weirdfunction, unit.square())
 ppm(X, ~ y + Z, covariates=list(Z=Zimage))
 #
 # (c) covariate values in data frame
 Q <- quadscheme(X)
 xQ <- x.quad(Q)
 yQ <- y.quad(Q)
 Zvalues <- weirdfunction(xQ,yQ)
 ppm(Q, ~ y + Z, covariates=data.frame(Z=Zvalues))
 # Note Q not X

 # COVARIATE FUNCTION WITH EXTRA ARGUMENTS
 #
f <- function(x,y,a){ y - a }
ppm(X, ~x + f, covariates=list(f=f), covfunargs=list(a=1/2))

 # COVARIATE: inside/outside window
 b <- owin(c(0.1, 0.6), c(0.1, 0.9))
 ppm(X, ~w, covariates=list(w=b))

 ## MULTITYPE POINT PROCESSES ### 
 # fit stationary marked Poisson process
 # with different intensity for each species
## Not run: ppm(lansing, ~ marks, Poisson())


 # fit nonstationary marked Poisson process
 # with different log-cubic trend for each species
## Not run: ppm(lansing, ~ marks * polynom(x,y,3), Poisson())


Run the code above in your browser using DataLab