vegan (version 1.6-0)

distconnected: Connectedness and Minimum Spanning Tree for Dissimilarities

Description

Function distconnected finds groups that are connected disregarding dissimilarities that are at or above a threshold or NA. The function can be used to find groups that can be ordinated together or transformed by stepacross. Function no.shared returns a logical dissimilarity object, where TRUE means that sites have no species in common. This is a minimal structure for distconnected or can be used to set missing values to dissimilarities corresponding to no shared species. Function spantree finds a minimum spanning tree connecting all points, but disregarding dissimilarities that are at or above the threshold or NA.

Usage

distconnected(dis, toolong = 1, trace = TRUE)
no.shared(x)
spantree(dis, toolong = 1)

Arguments

dis
Dissimilarity data inheriting from class dist, or an object, such as a matrix, that can be converted to a dissimilarity matrix. Functions vegdist and
toolong
Shortest dissimilarity regarded as NA. The function uses a fuzz factor, so that dissimilarities close to the limit will be made NA, too.
trace
Summarize results of distconnected.
x
Community data.

Value

Function distconnected returns a vector for observations using integers to identify connected groups. If the data are connected, values will be all 1. Function no.shared returns an object of class dist. Function spantree returns a list with two vectors, each of length $n-1$. The number of links in a tree is one less than the number of observations, and the first item is omitted. The items are

  • kid: The child node of the parent, starting from parent number two. If there is no link from the parent, the value will be NA and the tree is disconnected at that node.
  • dist: Corresponding distance. If kid = NA, then dist = 0.

Details

Data sets are disconnected if they have sample plots or groups of sample plots which share no species with other sites or groups of sites. Such data sets cannot be sensibly ordinated by any unconstrained method, because these subsets cannot be related to each other. For instance, correspondence analysis will polarize these subsets with eigenvalue 1. Neither can such dissimilarities be transformed with stepacross, because there is no path between all points, and the result will contain NAs.

Function distconnected will find such subsets in dissimilarity matrices. The function will return a grouping vector that can be used for subsetting the data. If the data are connected, the result vector will be all $1$s. The connectedness between two points can be defined either by a threshold toolong or by using input dissimilarities with NAs. If toolong is zero or negative, no threshold will be used.
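The grouping vector described above can be illustrated with a small base R sketch. This is not vegan's implementation: the helper name connected_groups and the toy matrix are hypothetical, the search is a plain stack-based depth-first search, and the fuzz factor applied to toolong is left out.

```r
# Hypothetical helper (not part of vegan): label connected groups in a
# dissimilarity object, ignoring links that are NA or at/above 'toolong'.
connected_groups <- function(dis, toolong = 1) {
  m <- as.matrix(dis)
  n <- nrow(m)
  # A usable link is a non-missing dissimilarity below the threshold.
  adj <- !is.na(m)
  if (toolong > 0) adj <- adj & (m < toolong)  # zero or negative: no threshold
  diag(adj) <- FALSE

  group <- integer(n)   # 0 means "not visited yet"
  current <- 0L
  for (start in seq_len(n)) {
    if (group[start] != 0L) next
    current <- current + 1L
    stack <- start
    while (length(stack) > 0) {
      node <- stack[length(stack)]       # pop the last node
      stack <- stack[-length(stack)]
      if (group[node] != 0L) next
      group[node] <- current
      # Push all unvisited neighbours of this node.
      stack <- c(stack, which(adj[node, ] & group == 0L))
    }
  }
  group
}

# Toy data: points 1-2-3 are linked, points 4-5 are linked, and every
# dissimilarity between the two sets is at the threshold (1).
d <- matrix(1, 5, 5)
diag(d) <- 0
d[1, 2] <- d[2, 1] <- 0.2
d[2, 3] <- d[3, 2] <- 0.3
d[4, 5] <- d[5, 4] <- 0.1

groups <- connected_groups(as.dist(d), toolong = 1)
groups_no_limit <- connected_groups(as.dist(d), toolong = 0)
```

In this sketch, groups labels the two subsets with different integers, and a vector such as groups == 1 could then be used to subset the rows of the corresponding community data. With toolong = 0 no threshold is applied, so all points fall into a single group.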

Function no.shared returns a dist structure having value TRUE when two sites have nothing in common, and value FALSE when they have at least one shared species. This is a minimal structure that can be analysed with distconnected. The function can be used to select dissimilarities with no shared species in indices which do not have a fixed upper limit. Function spantree finds a minimum spanning tree for dissimilarities (there may be several minimum spanning trees, but the function finds only one). Dissimilarities at or above the threshold toolong and NAs are disregarded, and the spanning tree is found through other dissimilarities. If the data are disconnected, the function will return a disconnected tree (or a forest), and the corresponding link is NA. The results of spantree can be overlaid onto an ordination diagram using function ordispantree.
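The logic behind no.shared can be sketched in base R. The helper below is hypothetical and is not vegan's code: it returns a plain logical vector in the same pair order as a dist object rather than an object of class dist, and the tiny community matrix is made up for illustration.

```r
# Hypothetical helper (not part of vegan): TRUE for each pair of sites
# (rows) that have no species (columns) in common.
no_shared_sketch <- function(x) {
  pa <- 1 * (as.matrix(x) > 0)   # presence/absence coded as 1/0
  shared <- tcrossprod(pa)       # sites x sites: number of shared species
  # Lower triangle in column-major order, the order used by 'dist' objects.
  shared[lower.tri(shared)] == 0
}

# Made-up community data: sites a and b share the second species,
# site c shares nothing with either of them.
comm <- rbind(
  a = c(1, 2, 0, 0),
  b = c(0, 1, 3, 0),
  c = c(0, 0, 0, 5)
)

ns <- no_shared_sketch(comm)     # pairs in order: a-b, a-c, b-c

# Manhattan distances have no fixed upper limit, so mark the pairs
# without shared species as missing instead of using a threshold.
dis <- dist(comm, method = "manhattan")
is.na(dis) <- ns
```

Dissimilarities marked as NA in this way are among those that distconnected and spantree disregard, according to the description above.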

Function distconnected uses depth-first search (Sedgewick 1990). Function spantree uses Prim's method implemented as priority-first search for dense graphs (Sedgewick 1990).
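A minimal base R sketch of Prim's method on a dense dissimilarity matrix is given below. It is only an illustration under stated assumptions, not the code used by spantree: the helper name spantree_sketch is hypothetical, no fuzz factor is applied to toolong, and the returned kid and dist vectors merely follow the layout described in the Value section.

```r
# Hypothetical helper (not part of vegan): Prim's method on a dense
# dissimilarity matrix, disregarding NA and values at/above 'toolong'.
spantree_sketch <- function(dis, toolong = 1) {
  m <- as.matrix(dis)
  n <- nrow(m)
  if (toolong > 0) m[!is.na(m) & m >= toolong] <- NA
  m[is.na(m)] <- Inf              # unusable links can never be chosen

  in_tree <- logical(n)
  best <- rep(Inf, n)             # shortest known link to each node
  link <- rep(NA_integer_, n)     # node at the other end of that link
  best[1] <- 0                    # start growing the tree from node 1

  for (step in seq_len(n)) {
    # Add the node outside the tree that is closest to it. If no node is
    # reachable, the first remaining node starts a new tree (a forest).
    cand <- which(!in_tree)
    node <- cand[which.min(best[cand])]
    in_tree[node] <- TRUE
    # Update nodes that are now closer to the tree through 'node'.
    closer <- (!in_tree) & (m[node, ] < best)
    best[closer] <- m[node, closer]
    link[closer] <- node
  }

  kid <- link[-1]                 # the first item is omitted
  list(kid = kid, dist = ifelse(is.na(kid), 0, best[-1]))
}

# Toy data: points 1-2-3 and points 4-5 form two separate sets.
d <- matrix(1, 5, 5)
diag(d) <- 0
d[1, 2] <- d[2, 1] <- 0.2
d[2, 3] <- d[3, 2] <- 0.3
d[4, 5] <- d[5, 4] <- 0.1

tree <- spantree_sketch(as.dist(d), toolong = 1)
```

For the toy data, point 4 cannot be reached from the first three points, so its entry in tree$kid is NA and the matching entry in tree$dist is 0, which corresponds to the disconnected tree described above.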

References

Sedgewick, R. (1990). Algorithms in C. Addison Wesley.

See Also

vegdist or dist for getting dissimilarities, stepacross for a case where you may need distconnected, ordispantree for displaying results of spantree, and hclust or agnes for single linkage clustering.

Examples

## There are no disconnected data in vegan, and the following uses an
## extremely low threshold limit for connectedness. This is for
## illustration only, and not a recommended practice.
library(vegan)
data(dune)
dis <- vegdist(dune)
ord <- cmdscale(dis) ## metric MDS
gr <- distconnected(dis, toolong=0.4)
tr <- spantree(dis, toolong=0.4)
ordiplot(ord, type="n")
ordispantree(ord, tr, col="red", lwd=2)
points(ord, cex=1.3, pch=21, col=1, bg = gr)
## Set Manhattan dissimilarities between sites with no shared species to NA
dis <- vegdist(dune, "manhattan")
is.na(dis) <- no.shared(dune)