The envelope command performs simulations and computes envelopes of a summary statistic based on the simulations. The result is an object that can be plotted to display the envelopes. The envelopes can be used to assess the goodness-of-fit of a point process model to point pattern data.
For the most basic use, if you have a point pattern X and you want to test Complete Spatial Randomness (CSR), type plot(envelope(X, Kest, nsim=39)) to see the \(K\) function for X plotted together with the envelopes of the \(K\) function for 39 simulations of CSR.
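
The basic call can be tried as follows. This is a minimal sketch that assumes the spatstat package is installed and uses its built-in cells dataset as a stand-in for X; any unmarked point pattern of class "ppp" could be substituted.

```r
library(spatstat)

# Stand-in data: 'cells' is a point pattern dataset shipped with spatstat
X <- cells

# Pointwise envelopes of the K function from 39 simulations of CSR
E <- envelope(X, Kest, nsim = 39)

# Plot the observed K function together with the envelopes
plot(E)
```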
The envelope function is generic, with methods for the classes "ppp", "ppm" and "kppm" described here. There are also methods for the classes "pp3", "lpp" and "lppm" which are described separately under envelope.pp3 and envelope.lpp.
Envelopes can also be computed from other envelopes, using envelope.envelope.
To create simulation envelopes, the command envelope(Y, ...) first generates nsim random point patterns in one of the following ways.
- If Y is a point pattern (an object of class "ppp") and simulate=NULL, then we generate nsim simulations of Complete Spatial Randomness (i.e. nsim simulated point patterns, each being a realisation of the uniform Poisson point process) with the same intensity as the pattern Y. (If Y is a multitype point pattern, then the simulated patterns are also given independent random marks; the probability distribution of the random marks is determined by the relative frequencies of marks in Y.)
- If Y is a fitted point process model (an object of class "ppm" or "kppm") and simulate=NULL, then this routine generates nsim simulated realisations of that model.
- If simulate is supplied, then it determines how the simulated point patterns are generated. It may be one of the following:
  - an expression in the R language, typically containing a call to a random generator. This expression will be evaluated nsim times to yield nsim point patterns. For example, if simulate=expression(runifpoint(100)) then each simulated pattern consists of exactly 100 independent uniform random points.
  - a function in the R language, typically containing a call to a random generator. This function will be applied repeatedly to the original data pattern Y to yield nsim point patterns. For example, if simulate=rlabel then each simulated pattern is generated by evaluating rlabel(Y) and consists of a randomly-relabelled version of Y.
  - a list of point patterns. The entries in this list will be taken as the simulated patterns.
  - an object of class "envelope". This should have been produced by calling envelope with the argument savepatterns=TRUE. The simulated point patterns that were saved in this object will be extracted and used as the simulated patterns for the new envelope computation. This makes it possible to plot envelopes for two different summary functions based on exactly the same set of simulated point patterns.
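
The four forms of the simulate argument listed above can be sketched as follows. The datasets cells (42 points in the unit square) and amacrine (a multitype pattern) ship with spatstat and are used here only as illustrative assumptions; nsim=19 is an arbitrary choice to keep the run short.

```r
library(spatstat)

X <- cells        # unmarked pattern, 42 points in the unit square
M <- amacrine     # multitype pattern with two types
nsim <- 19

# (1) An expression, evaluated nsim times:
#     each simulated pattern has exactly 42 uniform random points
E1 <- envelope(X, Kest, nsim = nsim,
               simulate = expression(runifpoint(42)))

# (2) A function, applied repeatedly to the data pattern:
#     each simulated pattern is a random relabelling of M
E2 <- envelope(M, Kcross, nsim = nsim, simulate = rlabel)

# (3) A list of point patterns generated in advance
sims <- lapply(seq_len(nsim), function(i) runifpoint(42))
E3 <- envelope(X, Kest, nsim = nsim, simulate = sims)

# (4) An envelope object that saved its simulated patterns:
#     the K and G envelopes below share the same simulations
EK <- envelope(X, Kest, nsim = nsim, savepatterns = TRUE)
EG <- envelope(X, Gest, nsim = nsim, simulate = EK)
```

Form (4) is the one to reach for when two summary functions should be compared on an identical set of simulated patterns.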
The summary statistic fun is applied to each of these simulated patterns. Typically fun is one of the functions Kest, Gest, Fest, Jest, pcf, Kcross, Kdot, Gcross, Gdot, Jcross, Jdot, Kmulti, Gmulti, Jmulti or Kinhom. It may also be a character string containing the name of one of these functions.
The statistic fun can also be a user-supplied function; if so, then it must have arguments X and r like those in the functions listed above, and it must return an object of class "fv".
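
A user-supplied summary function might look like the sketch below. The name Lborder is hypothetical; it simply wraps Lest with a fixed edge correction so that it has the required arguments X and r and returns an "fv" object.

```r
library(spatstat)

# Hypothetical user-defined summary function:
# the L function computed with the border edge correction only
Lborder <- function(X, r = NULL, ...) {
  Lest(X, r = r, correction = "border")
}

# The wrapper returns an object of class "fv", as envelope requires
class(Lborder(cells))

# Use it like any of the built-in summary functions
E <- envelope(cells, Lborder, nsim = 19)
plot(E)
```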
Upper and lower critical envelopes are computed in one of the following ways:
- pointwise:
by default, envelopes are calculated pointwise (i.e. for each value of the distance argument \(r\)), by sorting the nsim simulated values, and taking the m-th lowest and m-th highest values, where m = nrank. For example if nrank=1, the upper and lower envelopes are the pointwise maximum and minimum of the simulated values.
The pointwise envelopes are not “confidence bands” for the true value of the function! Rather, they specify the critical points for a Monte Carlo test (Ripley, 1981). The test is constructed by choosing a fixed value of \(r\), and rejecting the null hypothesis if the observed function value lies outside the envelope at this value of \(r\). This test has exact significance level alpha = 2 * nrank/(1 + nsim).
- simultaneous:
if global=TRUE, then the envelopes are determined as follows. First we calculate the theoretical mean value of the summary statistic (if we are testing CSR, the theoretical value is supplied by fun; otherwise we perform a separate set of nsim2 simulations, compute the average of all these simulated values, and take this average as an estimate of the theoretical mean value). Then, for each simulation, we compare the simulated curve to the theoretical curve, and compute the maximum absolute difference between them (over the interval of \(r\) values specified by ginterval). This gives a deviation value \(d_i\) for each of the nsim simulations. Finally we take the m-th largest of the deviation values, where m=nrank, and call this dcrit. Then the simultaneous envelopes are of the form lo = expected - dcrit and hi = expected + dcrit where expected is either the theoretical mean value theo (if we are testing CSR) or the estimated theoretical value mmean (if we are testing another model). The simultaneous critical envelopes have constant width 2 * dcrit.
The simultaneous critical envelopes allow us to perform a different Monte Carlo test (Ripley, 1981). The test rejects the null hypothesis if the graph of the observed function lies outside the envelope at any value of \(r\). This test has exact significance level alpha = nrank/(1 + nsim). This test can also be performed using mad.test.
- based on sample moments:
if VARIANCE=TRUE, the algorithm calculates the (pointwise) sample mean and sample variance of the simulated functions. Then the envelopes are computed as mean plus or minus nSD standard deviations. These envelopes do not have an exact significance interpretation. They are a naive approximation to the critical points of the Neyman-Pearson test assuming the summary statistic is approximately Normally distributed.
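
The three kinds of envelope described above can be compared in a short sketch. The cells dataset and the particular values of nsim are illustrative assumptions; the significance levels are evaluated directly from the formulas quoted in the list.

```r
library(spatstat)

nrank <- 1

# Pointwise envelopes: alpha = 2 * nrank / (1 + nsim)
nsim_p <- 39
alpha_pointwise <- 2 * nrank / (1 + nsim_p)
Ep <- envelope(cells, Kest, nsim = nsim_p, nrank = nrank)

# Simultaneous envelopes: alpha = nrank / (1 + nsim)
nsim_g <- 19
alpha_global <- nrank / (1 + nsim_g)
Eg <- envelope(cells, Kest, nsim = nsim_g, nrank = nrank, global = TRUE)

# The simultaneous envelope is described as having constant width 2 * dcrit,
# so the range of widths should be essentially a single value
width <- Eg$hi - Eg$lo
range(width, na.rm = TRUE)

# Envelopes based on sample moments: mean +/- nSD standard deviations
Ev <- envelope(cells, Kest, nsim = 19, VARIANCE = TRUE, nSD = 2)

c(pointwise = alpha_pointwise, global = alpha_global)
```

With these choices both tests have the same nominal level, which is why 39 and 19 simulations are common choices for pointwise and simultaneous envelopes respectively.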
The return value is an object of class "fv" containing the summary function for the data point pattern, the upper and lower simulation envelopes, and the theoretical expected value (exact or estimated) of the summary function for the model being tested. It can be plotted using plot.envelope.
If VARIANCE=TRUE then the return value also includes the sample mean, sample variance and other quantities.
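
The contents of the returned object can be inspected directly, as in the following sketch (again using the cells dataset as an assumed example). Because CSR is being tested here, the theoretical value is expected to appear in a column named theo.

```r
library(spatstat)

E <- envelope(cells, Kest, nsim = 19)

# Classes of the result (it inherits from "fv")
class(E)

# Column names: distance r, observed function, theoretical value, envelopes
names(E)

# First few rows, viewed as an ordinary data frame
head(as.data.frame(E))
```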
Arguments can be passed to the function fun through the ... argument. This means that you simply specify these arguments in the call to envelope, and they will be passed to fun. In particular, the argument correction determines the edge correction to be used to calculate the summary statistic. See the section on Edge Corrections, and the Examples.
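
For instance, an edge correction can be selected by naming it in the call to envelope. This sketch assumes the cells dataset and the "border" correction of Kest purely for illustration.

```r
library(spatstat)

# 'correction' is not an argument of envelope itself;
# it is passed through ... to Kest
E <- envelope(cells, Kest, nsim = 19, correction = "border")
plot(E)
```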
Arguments can also be passed to the function fun through the list funargs. This mechanism is typically used if an argument of fun has the same name as an argument of envelope. The list funargs should contain entries of the form name=value, where each name is the name of an argument of fun.
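
The funargs mechanism is sketched below. The correction argument is used only for illustration (it does not actually clash with an argument of envelope), and the observed function is compared with the one obtained by passing the same argument directly in the call.

```r
library(spatstat)

# Pass the argument to Kest directly in the call ...
Ea <- envelope(cells, Kest, nsim = 19, correction = "border")

# ... or inside the list 'funargs'
Eb <- envelope(cells, Kest, nsim = 19,
               funargs = list(correction = "border"))

# The observed summary function does not depend on the simulations,
# so it should agree between the two calls
all.equal(Ea$obs, Eb$obs)
```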
There is also a rarely used option in which different function arguments are used when computing the summary function for the data Y and for the simulated patterns. If funYargs is given, it will be used when the summary function for the data Y is computed, while funargs will be used when computing the summary function for the simulated patterns. This option is only needed in rare cases: the basic principle is that the data and the simulated patterns must be treated equally, so funargs and funYargs should usually be identical.
If Y is a fitted cluster point process model (object of class "kppm"), and simulate=NULL, then the model is simulated directly using simulate.kppm.
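
A sketch of envelopes for a fitted cluster model follows. The redwood dataset and the choice of a Thomas cluster process are assumptions made for illustration only.

```r
library(spatstat)

# Fit a stationary Thomas cluster process to the redwood data
fit <- kppm(redwood ~ 1, "Thomas")

# With simulate=NULL, the simulated patterns are realisations of the fitted model
E <- envelope(fit, Kest, nsim = 19)
plot(E)
```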
If Y is a fitted Gibbs point process model (object of class "ppm"), and simulate=NULL, then the model is simulated by running the Metropolis-Hastings algorithm rmh. Complete control over this algorithm is provided by the arguments start and control which are passed to rmh.
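
For a Gibbs model the call might look like the sketch below. The Strauss model with interaction radius 0.05 and the value nrep=20000 are arbitrary illustrative assumptions; rmhcontrol is used here to build the control argument that is handed on to rmh.

```r
library(spatstat)

# Fit a stationary Strauss process to the cells data
fit <- ppm(cells ~ 1, Strauss(r = 0.05))

# Simulate the fitted model by Metropolis-Hastings,
# with a shortened run length for each simulated pattern
E <- envelope(fit, Kest, nsim = 19,
              control = rmhcontrol(nrep = 20000))
plot(E)
```

A shorter run length speeds up the computation but may leave the simulated patterns further from equilibrium, so the value should be chosen with the model in mind.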
For simultaneous critical envelopes (global=TRUE) the following options are also useful:
- ginterval determines the interval of \(r\) values over which the deviation between curves is calculated. It should be a numeric vector of length 2. There is a sensible default (namely, the recommended plotting interval for fun(X), or the range of r values if r is explicitly specified).
- transform specifies a transformation of the summary function fun that will be carried out before the deviations are computed. Such transforms are useful if global=TRUE or VARIANCE=TRUE. The transform must be an expression object using the symbol . to represent the function value (and possibly other symbols recognised by with.fv). For example, the conventional way to normalise the \(K\) function (Ripley, 1981) is to transform it to the \(L\) function \(L(r) = \sqrt{K(r)/\pi}\), and this is implemented by setting transform=expression(sqrt(./pi)).
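
Both options can be combined in one call, as in this sketch. The interval c(0, 0.2) is an arbitrary illustrative choice that lies inside the default range of r for the cells data.

```r
library(spatstat)

# Simultaneous envelopes for the L function, obtained by transforming K,
# with deviations measured only over 0 <= r <= 0.2
E <- envelope(cells, Kest, nsim = 19, global = TRUE,
              transform = expression(sqrt(./pi)),
              ginterval = c(0, 0.2))
plot(E)
```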
It is also possible to extract the summary functions for each of the individual simulated point patterns, by setting savefuns=TRUE. Then the return value also has an attribute "simfuns" containing all the summary functions for the individual simulated patterns. It is an "fv" object containing functions named sim1, sim2, ... representing the nsim summary functions.
It is also possible to save the simulated point patterns themselves, by setting savepatterns=TRUE. Then the return value also has an attribute "simpatterns" which is a list of length nsim containing all the simulated point patterns.
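
The saved functions and patterns can be retrieved with attr, as in the following sketch (cells and nsim=19 are illustrative assumptions).

```r
library(spatstat)

nsim <- 19
E <- envelope(cells, Kest, nsim = nsim,
              savefuns = TRUE, savepatterns = TRUE)

# Summary functions of the individual simulations (an "fv" object)
simfuns <- attr(E, "simfuns")
names(simfuns)

# The simulated point patterns themselves (a list of length nsim)
simpats <- attr(E, "simpatterns")
length(simpats)

# Example: plot the first simulated pattern
plot(simpats[[1]], main = "first simulated pattern")
```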
See plot.envelope and plot.fv for information about how to plot the envelopes.
Different envelopes can be recomputed from the same data using envelope.envelope. Envelopes can be combined using pool.envelope.
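
Recomputing and pooling might be done as in the sketch below. It assumes that the original envelopes were computed with savefuns=TRUE (and savepatterns=TRUE where a different summary function is wanted), and that pool dispatches to pool.envelope for envelope objects.

```r
library(spatstat)

# Envelope with saved simulated functions and patterns
E1 <- envelope(cells, Kest, nsim = 19,
               savefuns = TRUE, savepatterns = TRUE)

# Recompute simultaneous envelopes from the saved functions
Eglobal <- envelope(E1, global = TRUE)

# Compute envelopes of a different summary function
# from the saved simulated patterns
EG <- envelope(E1, Gest)

# A second, independent set of simulations for the same data
E2 <- envelope(cells, Kest, nsim = 20, savefuns = TRUE)

# Combine the two sets of simulations into one envelope
Epool <- pool(E1, E2)
plot(Epool)
```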