stepacross tries to replace dissimilarities with shortest paths stepping across intermediate sites while regarding dissimilarities above a threshold as missing data (NA). With path = "shortest" this is the flexible shortest path (Williamson 1978, Bradfield & Kenkel 1987), and with path = "extended" an approximation known as extended dissimilarities (De'ath 1999). The use of stepacross should improve the ordination of data with high beta diversity, when there are many sites with no species in common.
stepacross(dis, path = "shortest", toolong = 1, trace = TRUE, ...)
For argument path, "shortest" finds the shortest paths, and "extended" their approximation known as extended dissimilarities. Dissimilarities that are toolong or longer are regarded as NA. The function uses a fuzz factor, so that dissimilarities close to the limit will be made NA, too.
With path = "shortest",
function stepacross replaces dissimilarities that are toolong or longer with NA, and tries to find shortest paths between all sites using the remaining dissimilarities. Several dissimilarity indices are semi-metric, which means that they do not obey the triangle inequality $d_{ij} \le d_{ik} + d_{kj}$, and the shortest path algorithm can replace these dissimilarities as well, even when they are shorter than toolong.
De'ath (1999) suggested a simplified method known as extended dissimilarities, which are calculated with path = "extended".
In this method, dissimilarities that are toolong or longer are first made NA, and then the function tries to replace these NA dissimilarities with a path through single stepping stone points. If not all NA could be replaced in one pass, the function will make new passes with updated dissimilarities until all NA are replaced with extended dissimilarities. This means that in the second and further passes, the remaining NA dissimilarities are allowed to have more than one stepping stone site, but previously replaced dissimilarities are not updated. Further, the function does not consider dissimilarities shorter than toolong, although some of these could be replaced with a shorter path in semi-metric indices, and used as a part of other paths. In optimal cases, the extended dissimilarities are equal to shortest paths, but they may be longer.
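The passes described above can be sketched in base R. This is only an illustration of the idea, not the code used in vegan: the helpers extend_once and extend_all are hypothetical names, the fuzz factor is left out, and the input is assumed to be a symmetric matrix with a zero diagonal.

```r
# Hypothetical helper (not vegan code): one pass that replaces NA entries
# with the shortest path through a single stepping stone site.
extend_once <- function(d) {
  out <- d
  n <- nrow(d)
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      if (is.na(d[i, j])) {
        # Lengths of paths i -> k -> j; NA when either leg is missing
        via <- d[i, ] + d[, j]
        if (any(!is.na(via))) {
          out[i, j] <- out[j, i] <- min(via, na.rm = TRUE)
        }
      }
    }
  }
  out
}

# Hypothetical helper: repeat passes until no NA remains or nothing changes.
extend_all <- function(d, toolong = 1) {
  if (toolong > 0) d[!is.na(d) & d >= toolong] <- NA  # no fuzz factor here
  repeat {
    n_na <- sum(is.na(d))
    if (n_na == 0) break
    d <- extend_once(d)
    if (sum(is.na(d)) == n_na) break  # disconnected data: give up
  }
  d
}

# Four sites along a gradient; only neighbours share species.
d <- matrix(c(0,   0.5, 1,   1,
              0.5, 0,   0.6, 1,
              1,   0.6, 0,   0.7,
              1,   1,   0.7, 0), nrow = 4, byrow = TRUE)
ext <- extend_all(d, toolong = 1)
ext
```

In this toy matrix the dissimilarity between sites 1 and 4 cannot be replaced in the first pass, because no single stepping stone connects them. It is filled in the second pass using dissimilarities that were themselves replaced in the first pass.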
As an alternative to defining too long dissimilarities with parameter toolong, the input dissimilarities can contain NAs. If toolong is zero or negative, the function does not make any dissimilarities into NA. If there are no NAs in the input and toolong = 0, path = "shortest" will find shorter paths for semi-metric indices, and path = "extended" will do nothing. Function no.shared can be used to set dissimilarities to NA.
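The behaviour with toolong = 0 and no NAs in the input can be checked with a short example. It assumes that the vegan package is installed and uses the dune data with the default Bray-Curtis index, which is semi-metric. The object names sp and ex are only for illustration.

```r
library(vegan)  # assumes the vegan package is installed
data(dune)
dis <- vegdist(dune)  # Bray-Curtis, a semi-metric index

# toolong = 0: no dissimilarity is turned into NA
sp <- stepacross(dis, path = "shortest", toolong = 0, trace = FALSE)
ex <- stepacross(dis, path = "extended", toolong = 0, trace = FALSE)

# Shortest paths can only shorten the input dissimilarities
all(as.vector(sp) <= as.vector(dis) + 1e-8)
# Extended dissimilarities should be the same as the input
all.equal(as.vector(ex), as.vector(dis))
```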
If the data are disconnected or there is no path between all points, the result will contain NAs and a warning is issued. Several methods cannot handle NA dissimilarities, and this warning should be taken seriously. Function distconnected can be used to find connected groups and remove rare outlier observations or groups of observations.
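One possible way to follow this advice is sketched below, assuming that the vegan package is installed. The names groups, keep and dis_sub are only for illustration, and keeping the largest connected group is one choice among several. With the dune data all sites are expected to fall in a single group, so nothing is removed here.

```r
library(vegan)  # assumes the vegan package is installed
data(dune)
dis <- vegdist(dune)

# Identify connected groups with the same threshold as in stepacross
groups <- distconnected(dis, toolong = 1, trace = FALSE)
table(groups)

# Keep only the largest connected group before stepping across
largest <- as.integer(names(which.max(table(groups))))
keep <- groups == largest
dis_sub <- as.dist(as.matrix(dis)[keep, keep])
edis <- stepacross(dis_sub, path = "shortest", toolong = 1, trace = FALSE)
```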
Alternative path = "shortest" uses Dijkstra's method for finding flexible shortest paths, implemented as priority-first search for dense graphs (Sedgewick 1990). Alternative path = "extended" follows De'ath (1999), but the implementation is simpler than in his code.
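The effect of shortest paths on a semi-metric index can be illustrated in base R. The sketch below uses the Floyd-Warshall algorithm instead of Dijkstra's method because it is shorter to write; it is not the implementation in vegan, and shortest_paths is a hypothetical helper. It assumes a symmetric matrix with a zero diagonal, where NA marks a missing dissimilarity.

```r
# Hypothetical helper (not vegan code): all-pairs shortest paths with the
# Floyd-Warshall algorithm; NA entries are treated as missing links.
shortest_paths <- function(d) {
  d[is.na(d)] <- Inf
  for (k in seq_len(nrow(d))) {
    # Path lengths i -> k -> j for all pairs (i, j)
    through_k <- outer(d[, k], d[k, ], "+")
    d <- pmin(d, through_k)
  }
  d[is.infinite(d)] <- NA  # pairs with no connecting path stay NA
  d
}

# Three sites where d[1, 3] breaks the triangle inequality
d <- matrix(c(0,   0.2, 0.8,
              0.2, 0,   0.3,
              0.8, 0.3, 0), nrow = 3, byrow = TRUE)
d[1, 3] > d[1, 2] + d[2, 3]  # TRUE: the index is not metric here
sp <- shortest_paths(d)
sp[1, 3]  # replaced with the path through site 2
```

The dissimilarity between sites 1 and 3 is below any usual toolong, but it is still replaced because the path through site 2 is shorter.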
Sedgewick, R. (1990). Algorithms in C. Addison Wesley.
Williamson, M.H. (1978). The ordination of incidence data. J. Ecol. 66, 911-920.
Function distconnected can find connected groups in disconnected data, and function no.shared can be used to set dissimilarities as NA. See swan for an alternative approach. Function stepacross is an essential component in isomap and cophenetic.spantree.
# There are no data sets with high beta diversity in vegan, but this
# should give an idea.
library(vegan)
data(dune)
dis <- vegdist(dune)
edis <- stepacross(dis)
plot(edis, dis, xlab = "Shortest path", ylab = "Original")
## Manhattan distances have no fixed upper limit.
dis <- vegdist(dune, "manhattan")
is.na(dis) <- no.shared(dune)
dis <- stepacross(dis, toolong = 0)