Given a multitype point pattern, this function estimates the spatially-varying probability of each type of point, or the ratios of such probabilities, using kernel smoothing. The default smoothing bandwidth is selected by cross-validation.
# S3 method for ppp
relrisk(X, sigma = NULL, ...,
at = c("pixels", "points"),
weights = NULL, varcov = NULL,
relative=FALSE,
adjust=1, edge=TRUE, diggle=FALSE,
se=FALSE, wtype=c("value", "multiplicity"),
casecontrol=TRUE, control=1, case, fudge=0)
If se=FALSE
(the default), the format is described below.
If se=TRUE
, the result is a list of two entries,
estimate
and SE
, each having the format described below.
If X
consists of only two types of points,
and if casecontrol=TRUE
,
the result is a pixel image (if at="pixels"
)
or a vector (if at="points"
).
The pixel values or vector values
are the probabilities of a case if relative=FALSE
,
or the relative risk of a case (probability of a case divided by the
probability of a control) if relative=TRUE
.
If X
consists of more than two types of points,
or if casecontrol=FALSE
, the result is:
(if at="pixels"
)
a list of pixel images, with one image for each possible type of point.
The result also belongs to the class "solist"
so that it can
be printed and plotted.
(if at="points"
)
a matrix of probabilities, with rows corresponding to
data points \(x_i\), and columns corresponding
to types \(j\).
The pixel values or matrix entries
are the probabilities of each type of point if relative=FALSE
,
or the relative risk of each type (probability of each type divided by the
probability of a control) if relative=TRUE
.
If relative=FALSE
, the resulting values always lie between 0
and 1. If relative=TRUE
, the results are either non-negative
numbers, or the values Inf
or NA
.
A multitype point pattern (object of class "ppp"
which has factor valued marks).
Optional. The numeric value of the smoothing bandwidth
(the standard deviation of isotropic
Gaussian smoothing kernel).
Alternatively sigma
may be a function which can be used
to select a different bandwidth for each type of point. See Details.
Arguments passed to bw.relrisk
to select the
bandwidth, or passed to density.ppp
to control the
pixel resolution.
Character string specifying whether to compute the probability values
at a grid of pixel locations (at="pixels"
) or
only at the points of X
(at="points"
).
Optional. Weights for the data points of X
.
A numeric vector, an expression
, or a pixel image.
Optional. Variance-covariance matrix of anisotopic Gaussian
smoothing kernel. Incompatible with sigma
.
Logical.
If FALSE
(the default) the algorithm
computes the probabilities of each type of point.
If TRUE
, it computes the
relative risk, the ratio of probabilities
of each type relative to the probability of a control.
Optional. Adjustment factor for the bandwidth sigma
.
Logical value indicating whether to apply edge correction.
Logical. If TRUE
, use the Jones-Diggle improved edge correction,
which is more accurate but slower to compute than the default
correction.
Logical value indicating whether to compute standard errors as well.
Character string (partially matched) specifying how the weights should be interpreted for the calculation of standard error. See Details.
Logical. Whether to treat a bivariate point pattern as consisting of cases and controls, and return only the probability or relative risk of a case. Ignored if there are more than 2 types of points. See Details.
Integer, or character string, identifying which mark value corresponds to a control.
Integer, or character string, identifying which mark value
corresponds to a case (rather than a control)
in a bivariate point pattern.
This is an alternative to the argument control
in a bivariate point pattern.
Ignored if there are more than 2 types of points.
Optional. A single numeric value, or a numeric vector with one entry for each type of point. This value will be added to the estimates of point process intensity, before calculation of the relative risk.
If se=TRUE
, the standard error of the estimate will also be
calculated. The calculation assumes a Poisson point process.
If weights
are given, then the calculation of standard error
depends on the interpretation of the weights. This is controlled by
the argument wtype
.
If wtype="value"
(the default),
the weights are interpreted as numerical values observed
at the data locations. Roughly speaking,
standard errors are proportional to the absolute
values of the weights.
If wtype="multiplicity"
the weights are interpreted as
multiplicities so that a weight of 2 is equivalent to having a pair
of duplicated points at the data location. Roughly speaking,
standard errors are proportional
to the square roots of the weights. Negative weights are not
permitted.
The default rule is now wtype="value"
but previous versions
of relrisk.ppp
(in spatstat.explore versions
3.1-0
and earlier) effectively used wtype="multiplicity"
.
Adrian Baddeley Adrian.Baddeley@curtin.edu.au and Rolf Turner rolfturner@posteo.net.
The command relrisk
is generic and can be used to
estimate relative risk in different ways.
This function relrisk.ppp
is the method for point pattern
datasets. It computes nonparametric estimates of relative risk
by kernel smoothing (Bithell, 1990, 1991; Diggle, 2003; Baddeley,
Rubak and Turner, 2015).
If X
is a bivariate point pattern
(a multitype point pattern consisting of two types of points)
then by default,
the points of the first type (the first level of marks(X)
)
are treated as controls or non-events, and points of the second type
are treated as cases or events. Then by default this command computes
the spatially-varying probability of a case,
i.e. the probability \(p(u)\)
that a point at spatial location \(u\)
will be a case. If relative=TRUE
, it computes the
spatially-varying relative risk of a case relative to a
control, \(r(u) = p(u)/(1- p(u))\).
If X
is a multitype point pattern with \(m > 2\) types,
or if X
is a bivariate point pattern
and casecontrol=FALSE
,
then by default this command computes, for each type \(j\),
a nonparametric estimate of
the spatially-varying probability of an event of type \(j\).
This is the probability \(p_j(u)\)
that a point at spatial location \(u\)
will belong to type \(j\).
If relative=TRUE
, the command computes the
relative risk of an event of type \(j\)
relative to a control,
\(r_j(u) = p_j(u)/p_k(u)\),
where events of type \(k\) are treated as controls.
The argument control
determines which type \(k\)
is treated as a control.
If at = "pixels"
the calculation is performed for
every spatial location \(u\) on a fine pixel grid, and the result
is a pixel image representing the function \(p(u)\)
or a list of pixel images representing the functions
\(p_j(u)\) or \(r_j(u)\)
for \(j = 1,\ldots,m\).
An infinite value of relative risk (arising because the
probability of a control is zero) will be returned as NA
.
If at = "points"
the calculation is performed
only at the data points \(x_i\). By default
the result is a vector of values
\(p(x_i)\) giving the estimated probability of a case
at each data point, or a matrix of values
\(p_j(x_i)\) giving the estimated probability of
each possible type \(j\) at each data point.
If relative=TRUE
then the relative risks
\(r(x_i)\) or \(r_j(x_i)\) are
returned.
An infinite value of relative risk (arising because the
probability of a control is zero) will be returned as Inf
.
Estimation is performed by a simple Nadaraja-Watson type kernel smoother (Bithell, 1990, 1991; Diggle, 2003; Baddeley, Rubak and Turner, 2015, section 14.4). The smoothing bandwidth can be specified in any of the following ways:
sigma
is a single numeric value, giving the standard
deviation of the isotropic Gaussian kernel.
sigma
is a numeric vector of length 2, giving the
standard deviations in the \(x\) and \(y\) directions of
a Gaussian kernel.
varcov
is a 2 by 2 matrix giving the
variance-covariance matrix of the Gaussian kernel.
sigma
is a function
which selects
the bandwidth.
Bandwidth selection will be applied
separately to each type of point.
An example of such a function is bw.diggle
.
sigma
and varcov
are both missing or null. Then a common
smoothing bandwidth sigma
will be selected by cross-validation using bw.relrisk
.
An infinite smoothing bandwidth, sigma=Inf
, is permitted
and yields a constant estimate of relative risk.
If se=TRUE
then standard errors will also be computed,
based on asymptotic theory, assuming a Poisson process.
The optional argument weights
may provide numerical weights
for the points of X
. It should be a numeric vector of length
equal to npoints(X)
.
The argument weights
can also be an expression
.
It will be evaluated in the data frame as.data.frame(X)
to obtain a vector of weights. The expression may involve
the symbols x
and y
representing the Cartesian
coordinates, and the symbol marks
representing the mark values.
The argument weights
can also be a pixel image
(object of class "im"
). numerical weights for the data points
will be extracted from this image (by looking up the pixel values
at the locations of the data points in X
).
Baddeley, A., Rubak, E. and Turner, R. (2015) Spatial Point Patterns: Methodology and Applications with R. Chapman and Hall/CRC Press.
Bithell, J.F. (1990) An application of density estimation to geographical epidemiology. Statistics in Medicine 9, 691--701.
Bithell, J.F. (1991) Estimation of relative risk functions. Statistics in Medicine 10, 1745--1751.
Diggle, P.J. (2003) Statistical analysis of spatial point patterns, Second edition. Arnold.
Diggle, P.J., Zheng, P. and Durr, P. (2005) Non-parametric estimation of spatial segregation in a multivariate point process: bovine tuberculosis in Cornwall, UK. Applied Statistics 54, 645--658.
There is another method relrisk.ppm
for point process
models which computes parametric
estimates of relative risk, using the fitted model.
See also
bw.relrisk
,
density.ppp
,
Smooth.ppp
,
eval.im
p.oak <- relrisk(urkiola, 20)
if(interactive()) {
plot(p.oak, main="proportion of oak")
plot(eval.im(p.oak > 0.3), main="More than 30 percent oak")
plot(split(lansing), main="Lansing Woods")
p.lan <- relrisk(lansing, 0.05, se=TRUE)
plot(p.lan$estimate, main="Lansing Woods species probability")
plot(p.lan$SE, main="Lansing Woods standard error")
wh <- im.apply(p.lan$estimate, which.max)
types <- levels(marks(lansing))
wh <- eval.im(types[wh])
plot(wh, main="Most common species")
}
Run the code above in your browser using DataLab