pppdist(X, Y, type = "spa", cutoff = 1, q = 1, matching = TRUE,
ccode = TRUE, precision = NULL, approximation = 10,
show.rprimal = FALSE, timelag = 0)
X, Y: Two point patterns (objects of class "ppp").
type: One of "spa" (default), "ace" or "mat", indicating whether the algorithm should find the optimal matching based on "subpattern assignment", "assignment only if cardinalities equal" or "mass transfer".
q: The order of the average applied to the interpoint distances. Can be Inf, in which case the maximum of the interpoint distances is taken.
ccode: If FALSE, R code is used, which allows for higher precision but is much slower.
precision: The q-th powers of interpoint distances will be rounded to the nearest multiple of 10^(-precision). There is a sensible default which depends on ccode.
approximation: If q = Inf, compute the distance based on the optimal matching for the corresponding distance of order approximation. Can be Inf, but this makes computations extremely slow.
The value returned is an object of class pppmatching that contains detailed information about the parameters used and the resulting distance. See pppmatching.object for details.
If matching = FALSE, only the numerical value of the distance is returned.
Computes the distance between the point patterns X and Y based on finding the matching between them which minimizes the average of the distances between matched points (if q=1), the maximum distance between matched points (if q=Inf), and in general the q-th order average (i.e. the 1/q-th power of the mean of the q-th powers) of the distances between matched points.
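The q-th order average can be sketched in a few lines of base R. The helper name qth_order_average is made up for this illustration and is not part of any package; the sketch assumes the average is the mean of the q-th powers raised to the power 1/q, which is consistent with the q = 1 case being the plain average.

```r
# q-th order average of a vector of distances between matched points.
# Hypothetical helper for illustration only.
qth_order_average <- function(d, q = 1) {
  if (is.infinite(q)) return(max(d))  # q = Inf: maximum distance
  mean(d^q)^(1 / q)                   # finite q: (mean of q-th powers)^(1/q)
}

d <- c(0.1, 0.2, 0.3)          # made-up distances between matched points
qth_order_average(d, q = 1)    # arithmetic mean
qth_order_average(d, q = 2)    # root mean square
qth_order_average(d, q = Inf)  # maximum
```

Larger values of q put more weight on the largest distances, with q = Inf as the limiting case.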
Distances between matched points are Euclidean distances cut off at the value of cutoff.
The parameter type controls the behaviour of the algorithm if the cardinalities of the point patterns are different. For the type "spa" (subpattern assignment) the subpattern of the point pattern with the larger cardinality $n$ that is closest to the point pattern with the smaller cardinality $m$ is determined; then the q-th order average is taken over $n$ values: the $m$ distances of matched points and $n-m$ "penalty distances" of value cutoff for the unmatched points. For the type "ace" (assignment only if cardinalities equal) the matching is empty and the distance returned is equal to cutoff if the cardinalities differ. For the type "mat" (mass transfer) each point pattern is assumed to have total mass $m$ (= the smaller cardinality) distributed evenly among its points; the algorithm then finds the "mass transfer plan" that minimizes the q-th order weighted average of the distances, where the weights are given by the transferred mass divided by $m$. The result is a fractional matching (each match of two points has a weight in $(0,1]$) with the minimized quantity as the associated distance.
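To make the "spa" and "ace" rules concrete, here is a brute-force sketch in base R for very small patterns. The helpers spa_distance and ace_distance are hypothetical names introduced only for this illustration; they enumerate every possible assignment and are in no way the algorithm used by the package. The sketch assumes finite q and takes the q-th order average as the mean of q-th powers raised to 1/q.

```r
# Brute-force "spa" distance for tiny patterns given as two-column matrices.
# Hypothetical helper: enumerates all assignments, so only for a few points.
spa_distance <- function(x, y, cutoff = 1, q = 1) {
  stopifnot(is.finite(q))
  if (nrow(x) < nrow(y)) { tmp <- x; x <- y; y <- tmp }  # x is the larger one
  n <- nrow(x); m <- nrow(y)
  # Euclidean distances from each point of y (rows) to each point of x
  # (columns), cut off at `cutoff`
  d <- as.matrix(dist(rbind(y, x)))[seq_len(m), m + seq_len(n), drop = FALSE]
  d <- pmin(d, cutoff)
  # every way of assigning the m points of y to distinct points of x
  cand <- as.matrix(expand.grid(rep(list(seq_len(n)), m)))
  cand <- cand[apply(cand, 1, function(a) !anyDuplicated(a)), , drop = FALSE]
  cost <- apply(cand, 1, function(a) {
    matched <- d[cbind(seq_len(m), a)]
    # m matched distances plus n - m penalty distances of value `cutoff`
    mean(c(matched, rep(cutoff, n - m))^q)^(1 / q)
  })
  min(cost)
}

# "ace": the distance is `cutoff` whenever the cardinalities differ
ace_distance <- function(x, y, cutoff = 1, q = 1) {
  if (nrow(x) != nrow(y)) return(cutoff)
  spa_distance(x, y, cutoff, q)
}

x <- rbind(c(0.1, 0.1), c(0.5, 0.5), c(0.9, 0.9))  # n = 3 points
y <- rbind(c(0.1, 0.2), c(0.9, 0.8))               # m = 2 points
spa_distance(x, y)  # two matched distances of 0.1 and one penalty of 1
ace_distance(x, y)  # cardinalities differ, so the cutoff is returned
```

In this made-up example the best subpattern consists of the first and third points of x, so the "spa" value is the average of 0.1, 0.1 and the penalty 1.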
The computations for all three types rely heavily on a specialized primal-dual algorithm (described in Luenberger (2003), Section 5.9) for Hitchcock's problem of optimal transport of a product from a number of suppliers to a number of (e.g. vending) locations. The C implementation used by default can handle patterns with a few hundred points, but should not be used with thousands of points. By setting show.rprimal = TRUE, some insight into the workings of the algorithm can be gained.
For moderate and large values of q there can be numerical issues, because the q-th powers of distances are taken and some positive values enter the optimization algorithm as zeroes when they are too small in comparison with the larger values. In this case the number of zeroes introduced is given in a warning message, and it is then possible that the matching obtained is not optimal and the associated distance is only a strict upper bound of the true distance. As a general guideline (which can be very wrong in special situations), a small number of zeroes (up to about 50 percent of the smaller point pattern cardinality $m$) usually still results in the right matching, and even with a considerably higher number the result is usually still a highly accurate upper bound for the distance. These numerical problems can be reduced by enforcing the (much slower) R code via the argument ccode = FALSE.
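The effect can be reproduced in base R with made-up distances. This sketch assumes the rounding rule stated for precision (q-th powers rounded to the nearest multiple of 10^(-precision)) and uses precision = 9 purely as an example value; the internal scaling in the actual implementation may differ.

```r
# Made-up cut-off distances between candidate pairs of points
d <- c(0.02, 0.1, 0.3, 0.9)
precision <- 9  # example value, assumed for this illustration

# Count positive q-th powers that become exactly zero after rounding
# to the nearest multiple of 10^(-precision)
zeroes_introduced <- function(d, q, precision) {
  powered <- d^q
  rounded <- round(powered, digits = precision)
  sum(rounded == 0 & powered > 0)
}

zeroes_introduced(d, q = 1, precision)   # no information lost
zeroes_introduced(d, q = 10, precision)  # the two smallest distances vanish
```

Once several distinct small distances have all collapsed to zero, the optimizer can no longer tell them apart, which is why the resulting matching may be suboptimal.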
For q = Inf there is no fast algorithm available, which is why approximation is normally used: for finding the optimal matching, q is set to the value of approximation. The resulting distance is still given as the maximum rather than the q-th order average in the corresponding distance computation. If approximation = Inf, approximation is suppressed and a very inefficient exhaustive search for the best matching is performed.
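The idea behind the approximation can be sketched on a 2 x 2 toy problem in base R. The distance matrix and all helper names below are invented for this illustration: the matching is chosen by minimizing a finite-order average (here order 10), and the reported value is the maximum distance over that matching.

```r
# Made-up cut-off distances between points of X (rows) and Y (columns)
D <- matrix(c(0.1, 0.5,
              0.5, 0.8), nrow = 2, byrow = TRUE)

# The two possible perfect matchings: for row i, the matched column
matchings <- list(identity = c(1, 2), swap = c(2, 1))

matched_dists <- function(p) D[cbind(1:2, p)]
order_q_cost  <- function(p, q) mean(matched_dists(p)^q)^(1 / q)

# Name of the matching that minimizes the q-th order average
best_for <- function(q) {
  costs <- sapply(matchings, order_q_cost, q = q)
  names(which.min(costs))
}

best_for(1)   # matching preferred for q = 1
best_for(10)  # matching preferred for order 10, standing in for q = Inf

# Value reported for q = Inf: the maximum over the order-10 matching
max(matched_dists(matchings[[best_for(10)]]))
```

In this toy case the order-10 matching coincides with the true minimax matching, so the reported maximum is exact; in general it is only an approximation.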
The value of precision should normally not be supplied by the user. If ccode = TRUE, this value is preset to the highest exponent of 10 that the C code still can handle (usually $9$). If ccode = FALSE, the value is preset according to q (usually $15$ if q is small), which can sometimes be changed to obtain less severe warning messages.
Luenberger, D.G. (2003) Linear and nonlinear programming. Second edition. Kluwer.
Schuhmacher, D. and Xia, A. (2008) A new metric between distributions of point processes. Advances in Applied Probability 40, 651–672.
Schuhmacher, D., Vo, B.-T. and Vo, B.-N. (2008) A consistent metric for performance evaluation of multi-object filters. IEEE Transactions on Signal Processing 56, 3447–3457.
pppmatching.object
matchingdist
# pppdist() and runifpoint() come from the spatstat family of packages
library(spatstat)
# equal cardinalities
X <- runifpoint(100)
Y <- runifpoint(100)
m <- pppdist(X, Y)
m
plot(m)
# differing cardinalities
X <- runifpoint(14)
Y <- runifpoint(10)
m1 <- pppdist(X, Y, type="spa")
m2 <- pppdist(X, Y, type="ace")
m3 <- pppdist(X, Y, type="mat")
summary(m1)
summary(m2)
summary(m3)
m1$matrix
m2$matrix
m3$matrix
# q = Inf
X <- runifpoint(10)
Y <- runifpoint(10)
mx1 <- pppdist(X, Y, q=Inf)$matrix
mx2 <- pppdist(X, Y, q=Inf, ccode=FALSE, approximation=50)$matrix
mx3 <- pppdist(X, Y, q=Inf, approximation=Inf)$matrix
all(mx1 == mx2) && all(mx2 == mx3)
# TRUE if approximations are good