dclf.test: Diggle-Cressie-Loosmore-Ford and Maximum Absolute Deviation Tests

Description

Perform the Diggle (1986) / Cressie (1991) / Loosmore and Ford (2006) test or the Maximum Absolute Deviation test for a spatial point pattern.

Usage

dclf.test(X, ..., alternative=c("two.sided", "less", "greater"),
                  rinterval = NULL, leaveout=1,
                  scale=NULL, clamp=FALSE, interpolate=FALSE)
mad.test(X, ...,  alternative=c("two.sided", "less", "greater"),
                  rinterval = NULL, leaveout=1,
                  scale=NULL, clamp=FALSE, interpolate=FALSE)

Value

An object of class "htest". Printing this object gives a report on the result of the test. The \(p\)-value is contained in the component p.value.

Arguments

X: Data for the test. Either a point pattern (object of class "ppp", "lpp" or other class), a fitted point process model (object of class "ppm", "kppm" or other class), a simulation envelope (object of class "envelope") or a previous result of dclf.test or mad.test.
...: Arguments passed to envelope. Useful arguments include fun to determine the summary function, nsim to specify the number of Monte Carlo simulations, verbose=FALSE to turn off the messages, savefuns or savepatterns to save the simulation results, and use.theory described under Details.
alternative: The alternative hypothesis. A character string. The default is a two-sided alternative. See Details.
rinterval: Interval of values of the summary function argument r over which the maximum absolute deviation, or the integral, will be computed for the test. A numeric vector of length 2.
leaveout: Optional integer 0, 1 or 2 indicating how to calculate the deviation between the observed summary function and the nominal reference value, when the reference value must be estimated by simulation. See Details.
scale: Optional. A function in the R language which determines the relative scale of deviations, as a function of distance \(r\). Summary function values for distance r will be divided by scale(r) before the test statistic is computed.
clamp: Logical value indicating how to compute deviations in a one-sided test. Deviations of the observed summary function from the theoretical summary function are initially evaluated as signed real numbers, with large positive values indicating consistency with the alternative hypothesis. If clamp=FALSE (the default), these values are not changed. If clamp=TRUE, any negative values are replaced by zero.
interpolate: Logical value specifying whether to calculate the \(p\)-value by interpolation. If interpolate=FALSE (the default), a standard Monte Carlo test is performed, yielding a \(p\)-value of the form \((k+1)/(n+1)\) where \(n\) is the number of simulations and \(k\) is the number of simulated values which are more extreme than the observed value. If interpolate=TRUE, the \(p\)-value is calculated by applying kernel density estimation to the simulated values, and computing the tail probability for this estimated distribution.
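
As a rough illustration of the standard (non-interpolated) Monte Carlo \(p\)-value described for interpolate=FALSE, the following base R sketch computes \((k+1)/(n+1)\) from a vector of simulated test statistics. The helper mc_pvalue is hypothetical and is not part of spatstat. It counts only strictly larger simulated values, so it does not reproduce the random tie-breaking or the interpolate=TRUE option.

```r
# Hypothetical helper (not a spatstat function): standard Monte Carlo p-value.
# t_obs: observed test statistic; t_sim: statistics from n simulations.
mc_pvalue <- function(t_obs, t_sim) {
  n <- length(t_sim)
  k <- sum(t_sim > t_obs)   # simulated values strictly more extreme than observed
  (k + 1) / (n + 1)
}

# Toy values chosen for illustration only: 19 simulated statistics,
# two of which (18 and 19) exceed the observed value 17.5.
t_sim <- 1:19
p <- mc_pvalue(17.5, t_sim)
p   # (2 + 1) / (19 + 1) = 0.15
```

With n simulations the smallest attainable value is 1/(n+1), so nsim=19 cannot give a \(p\)-value below 0.05.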

Handling Ties

If the observed value of the test statistic is equal to one or more of the simulated values (called a tied value), then the tied values will be assigned a random ordering, and a message will be printed.

Author

Adrian Baddeley Adrian.Baddeley@curtin.edu.au, Andrew Hardegen and Suman Rakshit.

Details

These functions perform hypothesis tests for goodness-of-fit of a point pattern dataset to a point process model, based on Monte Carlo simulation from the model.

dclf.test performs the test advocated by Loosmore and Ford (2006) which is also described in Diggle (1986), Cressie (1991, page 667, equation (8.5.42)) and Diggle (2003, page 14). See Baddeley et al (2014) for detailed discussion.

mad.test performs the ‘global’ or ‘Maximum Absolute Deviation’ test described by Ripley (1977, 1981). See Baddeley et al (2014).

The type of test depends on the type of argument X.

If X is some kind of point pattern, then a test of Complete Spatial Randomness (CSR) will be performed. That is, the null hypothesis is that the point pattern is completely random.
If X is a fitted point process model, then a test of goodness-of-fit for the fitted model will be performed. The model object contains the data point pattern to which it was originally fitted. The null hypothesis is that the data point pattern is a realisation of the model.
If X is an envelope object generated by envelope, then it should have been generated with savefuns=TRUE or savepatterns=TRUE so that it contains simulation results. These simulations will be treated as realisations from the null hypothesis.
Alternatively X could be a previously-performed test of the same kind (i.e. the result of calling dclf.test or mad.test). The simulations used to perform the original test will be re-used to perform the new test (provided these simulations were saved in the original test, by setting savefuns=TRUE or savepatterns=TRUE).
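
The four kinds of input listed above might be used as in the following sketch. It assumes the spatstat package is installed and uses its cells dataset. The homogeneous Poisson model ppm(cells ~ 1) and the small value nsim=19 are chosen only to keep the illustration quick.

```r
library(spatstat)

# 1. Point pattern: test of CSR
t1 <- dclf.test(cells, Lest, nsim=19, verbose=FALSE)

# 2. Fitted model: goodness-of-fit test for the fitted model
fit <- ppm(cells ~ 1)
t2 <- dclf.test(fit, Lest, nsim=19, verbose=FALSE)

# 3. Envelope object that contains its simulated summary functions
E <- envelope(cells, Lest, nsim=19, savefuns=TRUE, verbose=FALSE)
t3 <- mad.test(E)

# 4. Previous test result: simulations must have been saved the first time
t4 <- dclf.test(cells, Lest, nsim=19, savefuns=TRUE, verbose=FALSE)
t5 <- dclf.test(t4, rinterval=c(0, 0.1))   # re-uses the saved simulations
```

Re-using saved simulations avoids regenerating them when only the test settings, such as rinterval, are changed.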

The argument alternative specifies the alternative hypothesis, that is, the direction of deviation that will be considered statistically significant. If alternative="two.sided" (the default), both positive and negative deviations (between the observed summary function and the theoretical function) are significant. If alternative="less", then only negative deviations (where the observed summary function is lower than the theoretical function) are considered. If alternative="greater", then only positive deviations (where the observed summary function is higher than the theoretical function) are considered.

In all cases, the algorithm will first call envelope to generate or extract the simulated summary functions. The number of simulations to be generated or extracted is determined by the argument nsim, and defaults to 99. The summary function to be computed is determined by the argument fun (or the first unnamed argument in the list ...) and defaults to Kest. The exception is when X is an envelope object generated with savefuns=TRUE, in which case the saved summary functions are used.

The choice of summary function fun affects the power of the test. It is normally recommended to apply a variance-stabilising transformation (Ripley, 1981). If you are using the \(K\) function, the normal practice is to replace this by the \(L\) function (Besag, 1977) computed by Lest. If you are using the \(F\) or \(G\) functions, the recommended practice is to apply Fisher's variance-stabilising transformation \(\sin^{-1}\sqrt x\) using the argument transform. See the Examples.

The argument rinterval specifies the interval of distance values \(r\) which will contribute to the test statistic (either maximising over this range of values for mad.test, or integrating over this range of values for dclf.test). This affects the power of the test. General advice and experiments in Baddeley et al (2014) suggest that the maximum \(r\) value should be slightly larger than the maximum possible range of interaction between points. The dclf.test is quite sensitive to this choice, while the mad.test is relatively insensitive.
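
To make the role of rinterval concrete, the following base R sketch computes a maximum absolute deviation and an integrated squared deviation over a restricted range of \(r\). The helpers mad_stat and dclf_stat are hypothetical and are not spatstat functions. The trapezoidal rule used for the integral is an assumption made for illustration, and spatstat may discretise the integral differently.

```r
# Hypothetical helpers (not spatstat functions).
# r: distances; h_obs: observed summary function; h_ref: reference function.

mad_stat <- function(r, h_obs, h_ref, rinterval) {
  keep <- r >= rinterval[1] & r <= rinterval[2]
  max(abs(h_obs[keep] - h_ref[keep]))          # maximise over the interval
}

dclf_stat <- function(r, h_obs, h_ref, rinterval) {
  keep <- r >= rinterval[1] & r <= rinterval[2]
  rr <- r[keep]
  d2 <- (h_obs[keep] - h_ref[keep])^2          # squared deviation
  # trapezoidal approximation to the integral of d2 over the interval
  sum(diff(rr) * (head(d2, -1) + tail(d2, -1)) / 2)
}

# Toy curves: the observed function sits 0.02 above the reference everywhere
r     <- c(0, 0.05, 0.10, 0.15, 0.20, 0.25)
h_ref <- r
h_obs <- r + 0.02

tm <- mad_stat(r, h_obs, h_ref, rinterval=c(0, 0.2))   # 0.02
td <- dclf_stat(r, h_obs, h_ref, rinterval=c(0, 0.2))  # 0.02^2 * 0.2 = 8e-05
```

In this toy case, widening the interval leaves the maximum unchanged but increases the integral, which is consistent with the remark that dclf.test is the more sensitive of the two to the choice of rinterval.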

It is also possible to specify a pointwise test (i.e. taking a single, fixed value of distance \(r\)) by specifying rinterval = c(r,r).
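
A pointwise test might be requested as in the following sketch, which assumes the spatstat package is installed. The distance 0.05 is an arbitrary value chosen for the cells dataset, not a recommendation.

```r
library(spatstat)

# Pointwise test at the single distance r = 0.05 (arbitrary choice)
pw <- mad.test(cells, Lest, rinterval=c(0.05, 0.05),
               nsim=19, verbose=FALSE)
pw$p.value
```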

The argument use.theory passed to envelope determines whether to compare the summary function for the data to its theoretical value for CSR (use.theory=TRUE) or to the sample mean of simulations from CSR (use.theory=FALSE). The test statistic \(T\) is defined in equations (10.21) and (10.22) respectively on page 394 of Baddeley, Rubak and Turner (2015).

The argument leaveout specifies how to calculate the discrepancy between the summary function for the data and the nominal reference value, when the reference value must be estimated by simulation. The values leaveout=0 and leaveout=1 are both algebraically equivalent (Baddeley et al, 2014, Appendix) to computing the difference observed - reference where the reference is the mean of simulated values. The value leaveout=2 gives the leave-two-out discrepancy proposed by Dao and Genton (2014).
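
The arguments use.theory and leaveout might be combined as in the following sketch, which assumes the spatstat package is installed. The p-values vary between runs because each call generates fresh simulations.

```r
library(spatstat)

# Compare the data to the theoretical L function for CSR
a <- dclf.test(cells, Lest, use.theory=TRUE, nsim=19, verbose=FALSE)

# Compare the data to a reference estimated from the simulations
b <- dclf.test(cells, Lest, use.theory=FALSE, leaveout=1,
               nsim=19, verbose=FALSE)

# Same, with the leave-two-out discrepancy
d <- dclf.test(cells, Lest, use.theory=FALSE, leaveout=2,
               nsim=19, verbose=FALSE)

c(theory=a$p.value, leave1=b$p.value, leave2=d$p.value)
```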

References

Baddeley, A., Diggle, P.J., Hardegen, A., Lawrence, T., Milne, R.K. and Nair, G. (2014) On tests of spatial pattern based on simulation envelopes. Ecological Monographs 84(3) 477--489.

Baddeley, A., Rubak, E. and Turner, R. (2015) Spatial Point Patterns: Methodology and Applications with R. Chapman and Hall/CRC Press.

Besag, J. (1977) Discussion of Dr Ripley's paper. Journal of the Royal Statistical Society, Series B, 39, 193--195.

Cressie, N.A.C. (1991) Statistics for spatial data. John Wiley and Sons.

Dao, N.A. and Genton, M. (2014) A Monte Carlo adjusted goodness-of-fit test for parametric models describing spatial point patterns. Journal of Computational and Graphical Statistics 23, 497--517.

Diggle, P.J. (1986) Displaced amacrine cells in the retina of a rabbit: analysis of a bivariate spatial point pattern. Journal of Neuroscience Methods 18, 115--125.

Diggle, P.J. (2003) Statistical analysis of spatial point patterns, Second edition. Arnold.

Loosmore, N.B. and Ford, E.D. (2006) Statistical inference using the G or K point pattern spatial statistics. Ecology 87, 1925--1931.

Ripley, B.D. (1977) Modelling spatial patterns (with discussion). Journal of the Royal Statistical Society, Series B, 39, 172--212.

Ripley, B.D. (1981) Spatial statistics. John Wiley and Sons.

Examples

  library(spatstat)  # provides dclf.test, mad.test and the cells dataset
  dclf.test(cells, Lest, nsim=39)
  m <- mad.test(cells, Lest, verbose=FALSE, rinterval=c(0, 0.1), nsim=19)
  m
  # extract the p-value
  m$p.value
  # variance stabilised G function
  dclf.test(cells, Gest, transform=expression(asin(sqrt(.))),
                   verbose=FALSE, nsim=19)

  ## one-sided test
  ml <- mad.test(cells, Lest, verbose=FALSE, nsim=19, alternative="less")

  ## scaled
  mad.test(cells, Kest, verbose=FALSE, nsim=19,
           rinterval=c(0.05, 0.2),
           scale=function(r) { r })