mst: Minimum Spanning Tree

Description

The function mst finds the minimum spanning tree of a set of observations from a matrix of pairwise distances.

The plot method draws the minimum spanning tree, showing the links between the observations, which are identified by their numbers.

Usage

mst(X)
# S3 method for mst
plot(x, graph = "circle", x1 = NULL, x2 = NULL, ...)

Value

an object of class "mst" which is a square numeric matrix of size equal to the number of observations with either 1 if a link between the corresponding observations was found, or 0 otherwise. The names of the rows and columns of the distance matrix, if available, are given as rownames and colnames to the returned object.
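
The structure of the returned object can be checked directly. The following is a minimal sketch, assuming the ape package is installed; the data, the seed, and the labels obs1 to obs10 are invented for illustration only.

```r
library(ape)  # provides mst()

set.seed(1)                            # arbitrary seed, for reproducibility
X <- matrix(runif(50), 10, 5)          # 10 made-up observations, 5 variables
rownames(X) <- paste0("obs", 1:10)     # hypothetical labels
M <- mst(dist(X))

A <- unclass(M)                        # drop the "mst" class to get a plain matrix
n <- nrow(A)

dim(A)                                 # square, one row and column per observation
all(A %in% c(0, 1))                    # entries are 0 or 1 only
all(A == t(A))                         # each link is recorded in both directions
sum(A) / 2 == n - 1                    # a spanning tree on n observations has n - 1 links
rownames(A)                            # labels carried over from the distance matrix

# List the links once each, as pairs of row/column indices
which(A == 1 & upper.tri(A), arr.ind = TRUE)
```

The last line turns the 0/1 matrix into an edge list, which is often a more convenient form for further processing.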

Arguments

X: either a matrix that can be interpreted as a distance matrix, or an object of class "dist".
x: an object of class "mst" (e.g. returned by mst()).
graph: a character string indicating the type of layout used to plot the minimum spanning tree; two choices are possible: "circle", where the observations are regularly spaced on a circle, and "nsca", where the first two axes of a non-symmetric correspondence analysis are used to place the observations (see Details below). If both arguments x1 and x2 are given, the argument graph is ignored.
x1: a numeric vector giving the coordinates of the observations on the x-axis. Both x1 and x2 must be specified to be used.
x2: a numeric vector giving the coordinates of the observations on the y-axis. Both x1 and x2 must be specified to be used.
...: further arguments to be passed to plot().

Author

Yvonnick Noel noel@univ-lille3.fr, Julien Claude julien.claude@umontpellier.fr and Emmanuel Paradis

Details

The plot method provides two layouts for the minimum spanning tree, both of which try to space the observations as widely as possible so that the links show clearly. The option graph = "circle" simply places the observations at regular intervals on a circle, whereas graph = "nsca" uses a non-symmetric correspondence analysis in which each observation is placed at the centroid of its neighbours.
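
To make the idea of a circular layout concrete, regularly spaced coordinates can be built by hand and passed through x1 and x2. This is only an illustrative sketch, assuming the ape package is installed: the starting angle and orientation chosen here are arbitrary and need not match what graph = "circle" does internally.

```r
library(ape)  # provides mst() and its plot method

set.seed(1)                                   # arbitrary seed
M <- mst(dist(matrix(runif(50), 10, 5)))      # made-up data: 10 observations
n <- nrow(M)

# n regularly spaced angles on [0, 2*pi), dropping the duplicated end point
theta <- seq(0, 2 * pi, length.out = n + 1)[-(n + 1)]
x1 <- cos(theta)                              # x-axis coordinates on the unit circle
x2 <- sin(theta)                              # y-axis coordinates on the unit circle

plot(M, x1 = x1, x2 = x2)                     # graph is ignored when x1 and x2 are given
```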

Alternatively, the user may use any system of coordinates for the observations, for instance a principal components analysis (PCA) if the distances were computed from an original matrix of continuous variables.
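
When only the distances are available, coordinates can also be derived from the distance matrix itself. The sketch below uses classical multidimensional scaling (cmdscale from the stats package) as one possible choice; it is not something the plot method does on its own, and the data are invented for illustration.

```r
library(ape)  # provides mst() and its plot method

set.seed(1)                               # arbitrary seed
d <- dist(matrix(runif(200), 20, 10))     # made-up data: 20 observations
M <- mst(d)

xy <- cmdscale(d, k = 2)                  # classical MDS: one row of 2 coordinates per observation
plot(M, x1 = xy[, 1], x2 = xy[, 2])       # draw the tree in the MDS plane
```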

Examples

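library(ape)                              # provides mst() and its plot method
require(stats)
X <- matrix(runif(200), 20, 10)           # 20 observations, 10 variables
d <- dist(X)                              # pairwise Euclidean distances
PC <- prcomp(X)                           # PCA, used below as a coordinate system
M <- mst(d)
opar <- par(mfcol = c(2, 2))              # 2 x 2 panel layout; keep the old settings
plot(M)                                   # default layout: graph = "circle"
plot(M, graph = "nsca")
plot(M, x1 = PC$x[, 1], x2 = PC$x[, 2])   # first two principal components
par(opar)                                 # restore the graphics settings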
