mhat: Estimation of the m function

Description

Estimates the m function

Usage

mhat(X, r = NULL, ReferenceType, NeighborType = ReferenceType, CaseControl = FALSE, Original = TRUE, Approximate = ifelse(X$n < 10000, 0, 1), Adjust = 1, MaxRange = "ThirdW", CheckArguments = TRUE)

Arguments

X

A weighted, marked planar point pattern (wmppp.object) or a Dtable object.

r

A vector of distances. If NULL, a default value is set: 512 equally spaced values are used, from the smallest distance between points to the range defined by MaxRange.

ReferenceType

One of the point types.

NeighborType

One of the point types. By default, the same as reference type.

CaseControl

Logical; if TRUE, the case-control version of M is computed. ReferenceType points are cases, NeighborType points are controls.

Original

Logical; if TRUE (by default), the original bandwidth selection by Duranton and Overman (2005) following Silverman (1986: eq 3.31) is used. If FALSE, the bandwidth is calculated following Sheather and Jones (1991). See bw.SJ for more details.
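The two bandwidth selectors named above are both available in base R, so their difference can be illustrated without dbmss. The sketch below is only an illustration on simulated points: it assumes the bandwidth is selected on the vector of pairwise distances, which may differ from what mhat does internally.

```r
# Illustration only: simulated points, not a dbmss data set.
set.seed(123)
xy <- cbind(runif(150), runif(150))
d <- as.vector(dist(xy))          # pairwise distances between points

# Silverman's rule of thumb (1986: eq 3.31), as implemented by stats::bw.nrd0
h_silverman <- bw.nrd0(d)

# Sheather and Jones (1991) selector, as implemented by stats::bw.SJ
h_sj <- bw.SJ(d)

# Compare the two bandwidths
c(Silverman = h_silverman, SheatherJones = h_sj)
```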

Approximate

If not 0 (1 is a good choice), exact distances between pairs of points are rounded to 1024 times Approximate single values equally spaced between 0 and the largest distance. This technique (Scholl and Brenner, 2015) saves a lot of memory when addressing large point sets. By default, Approximate is 1 if X contains 10000 points or more, and 0 otherwise. Increasing Approximate improves precision at the cost of proportional memory use. Ignored if X is a Dtable object.
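The rounding idea can be sketched in base R as follows. This is a minimal illustration on simulated points, assuming that each distance is replaced by the midpoint of its bin; the exact choice of representative values in dbmss may differ.

```r
# Illustration only: bin pairwise distances into 1024 * Approximate values.
set.seed(1)
n <- 200
xy <- cbind(runif(n), runif(n))
d <- as.vector(dist(xy))                 # exact distances, n * (n - 1) / 2 values

Approximate <- 1
n_bins <- 1024 * Approximate
breaks <- seq(0, max(d), length.out = n_bins + 1)
mids <- (breaks[-1] + breaks[-length(breaks)]) / 2   # assumed representative values

# Index of the bin containing each distance (always between 1 and n_bins)
bin <- findInterval(d, breaks, all.inside = TRUE)

# Only the counts per bin need to be stored, not every distance
counts <- tabulate(bin, nbins = n_bins)

# Rounding error is bounded by half a bin width
d_rounded <- mids[bin]
max_error <- max(abs(d - d_rounded))
```

Storing counts instead of individual distances is what keeps memory use proportional to the number of bins rather than to the number of point pairs.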

Adjust

Force the automatically selected bandwidth (following Original) to be multiplied by Adjust. Setting it to values lower than one (1/2 for example) will sharpen the estimation.
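The effect of a bandwidth multiplier can be seen with base R's density(), whose adjust argument plays an analogous role. This is a sketch on simulated distances, not the internal code of mhat.

```r
# Illustration only: density() multiplies the selected bandwidth by adjust.
set.seed(123)
d <- as.vector(dist(cbind(runif(100), runif(100))))

dens_default <- density(d, bw = "nrd0", adjust = 1)    # automatic bandwidth
dens_sharp   <- density(d, bw = "nrd0", adjust = 0.5)  # half the bandwidth

# Ratio of the bandwidths actually used by the two estimates
bw_ratio <- dens_sharp$bw / dens_default$bw
```

A smaller bandwidth follows local variations of the data more closely, at the price of a noisier curve.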

MaxRange

The maximum value of r to consider, ignored if r is not NULL. Default is "ThirdW", one third of the diameter of the window. Other choices are "HalfW", "QuarterW" and "DO2005". "HalfW" and "QuarterW" are for half and a quarter of the diameter of the window. "DO2005" is for the median distance observed between points, following Duranton and Overman (2005). "ThirdW" should be close to "DO2005" but has the advantage of being independent of the point types chosen as ReferenceType and NeighborType, which simplifies comparisons between different types. "DO2005" is approximated by "ThirdW" if Approximate is not 0. If X is a Dtable object, the diameter of the window is taken as the maximum distance between points.
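The candidate ranges can be compared on a simple case. The sketch below assumes a unit-square window, whose diameter is its diagonal, and simulated uniform points; it also builds a default-like grid of 512 equally spaced distances up to "ThirdW".

```r
# Illustration only: unit-square window and simulated points (assumptions).
set.seed(7)
n <- 500
xy <- cbind(runif(n), runif(n))
d <- as.vector(dist(xy))

diameter <- sqrt(1^2 + 1^2)   # diagonal of the unit square

max_range <- c(
  HalfW    = diameter / 2,
  ThirdW   = diameter / 3,
  QuarterW = diameter / 4,
  DO2005   = median(d)        # median distance between points
)

# 512 equally spaced distances from the smallest distance to the chosen range
r_default <- seq(min(d), max_range[["ThirdW"]], length.out = 512)
```

Here the median distance depends on which points are retained, whereas the window-based values do not.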

CheckArguments

Logical; if TRUE, the function arguments are verified. Should be set to FALSE to save time in simulations for example, when the arguments have been checked elsewhere.

Value

An object of class fv, see fv.object, which can be plotted directly using plot.fv.

Details

m is a weighted, relative, density-based measure of a point pattern structure (Lang et al., 2014). Its value at any distance is the ratio of neighbors of the NeighborType to all points around ReferenceType points, normalized by its value over the window. The number of neighbors at each distance is estimated by a Gaussian kernel whose bandwidth is chosen optimally according to Silverman (1986: eq 3.31). It can be sharpened or smoothed by multiplying it by Adjust. The bandwidth of Sheather and Jones (1991) would be better, and is often sharper than that of Silverman, but it is very slow to calculate for large point patterns and it sometimes fails. If X is not a Dtable object, the maximum value of r is obtained from the geometry of the window to save (a lot of) calculation time, rather than by calculating the median distance between points as suggested by Duranton and Overman (2005).
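The ratio described above can be sketched in a few lines of base R. This is a simplified illustration, not the dbmss implementation: the point types "A" and "B", the unit weights, the helper m_sketch and the choice of bandwidth are all assumptions, and the sketch only covers the case where ReferenceType and NeighborType differ.

```r
# Illustration only: simulated points with two hypothetical types.
set.seed(42)
n <- 300
xy <- cbind(runif(n), runif(n))
type <- sample(c("A", "B"), n, replace = TRUE)
w <- rep(1, n)                       # unit weights (assumption)

D <- as.matrix(dist(xy))
h <- bw.nrd0(D[upper.tri(D)])        # Silverman-style bandwidth (assumption)

# Hypothetical helper: value of the sketch at distance r
m_sketch <- function(r, ref = "A", nei = "B") {
  K <- dnorm(D, mean = r, sd = h)    # Gaussian kernel weight of each pair at r
  diag(K) <- 0                       # a point is not its own neighbor
  is_ref <- type == ref
  is_nei <- type == nei
  num <- K[is_ref, is_nei, drop = FALSE] %*% w[is_nei]  # neighbors of interest
  den <- K[is_ref, , drop = FALSE] %*% w                # neighbors of any type
  local_ratio <- mean(num / den)
  # Same ratio over the whole window, excluding the reference point itself
  global_ratio <- mean(sum(w[is_nei]) / (sum(w) - w[is_ref]))
  local_ratio / global_ratio
}

r <- seq(0.05, 0.45, by = 0.1)
m_values <- vapply(r, m_sketch, numeric(1))
```

Because the types are assigned at random here, the values are expected to stay near 1; values above 1 would indicate relative concentration of neighbors around reference points, values below 1 relative dispersion.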

References

Duranton, G. and Overman, H. G. (2005). Testing for Localisation Using Micro-Geographic Data. Review of Economic Studies 72(4): 1077-1106.

Lang, G., Marcon, E. and Puech, F. (2014). Distance-Based Measures of Spatial Concentration: Introducing a Relative Density Function. HAL 01082178, 1-18.

Scholl, T. and Brenner, T. (2015). Optimizing distance-based methods for large data sets. Journal of Geographical Systems 17(4): 333-351.

Sheather, S. J. and Jones, M. C. (1991). A reliable data-based bandwidth selection method for kernel density estimation. Journal of the Royal Statistical Society series B, 53, 683-690.

Silverman, B. W. (1986). Density estimation for statistics and data analysis. Chapman and Hall, London.

Examples

data(paracou16)
plot(paracou16)

# Calculate m (r is left empty so that its default value is used)
plot(mhat(paracou16, , "V. Americana", "Q. Rosea"))