"dendrogram"
provides general functions for handling
tree-like structures. It is intended as a replacement for similar
functions in hierarchical clustering and classification/regression
trees, such that all of these can use the same engine for plotting or
cutting trees.

as.dendrogram(object, …)
# S3 method for hclust
as.dendrogram(object, hang = -1, check = TRUE, …)

# S3 method for dendrogram
as.hclust(x, …)

# S3 method for dendrogram
plot(x, type = c("rectangle", "triangle"),
     center = FALSE,
     edge.root = is.leaf(x) || !is.null(attr(x,"edgetext")),
     nodePar = NULL, edgePar = list(),
     leaflab = c("perpendicular", "textlike", "none"),
     dLeaf = NULL, xlab = "", ylab = "", xaxt = "n", yaxt = "s",
     horiz = FALSE, frame.plot = FALSE, xlim, ylim, …)

# S3 method for dendrogram
cut(x, h, …)

# S3 method for dendrogram
merge(x, y, …, height,
      adjust = c("auto", "add.max", "none"))

# S3 method for dendrogram
nobs(object, …)

# S3 method for dendrogram
print(x, digits, …)

# S3 method for dendrogram
rev(x)

# S3 method for dendrogram
str(object, max.level = NA, digits.d = 3,
    give.attr = FALSE, wid = getOption("width"),
    nest.lev = 0, indent.str = "",
    last.str = getOption("str.dendrogram.last"), stem = "--",
    …)

is.leaf(object)

object: any R object that can be made into one of class "dendrogram".

x, y: object(s) of class "dendrogram".

hang: numeric scalar indicating how the height of leaves should be
computed from the heights of their parents; see plot.hclust.

check: logical indicating if object should be checked for validity.
This check is not necessary when object is known to be valid, such as
when it is the direct result of hclust(). The default is check=TRUE,
e.g. for protecting against memory explosion with invalid inputs.

type: type of plot, "rectangle" (the default) or "triangle".

center: logical; if TRUE, nodes are plotted centered with respect to
the leaves in the branch. Otherwise (default), plot them in the middle
of all direct child nodes.

edge.root: logical; if true, draw an edge to the root node.

nodePar: a list of plotting parameters to use for the nodes (see
points) or NULL by default, which does not draw symbols at the nodes.
The list may contain components named pch, cex, col, xpd, and/or bg,
each of which can have length two for specifying separate attributes
for inner nodes and leaves. Note that the default of pch is 1:2, so you
may want to use pch = NA if you specify nodePar.

edgePar: a list of plotting parameters to use for the edge segments and
labels (if there's an edgetext). The list may contain components named
col, lty and lwd (for the segments), p.col, p.lwd, and p.lty (for the
polygon around the text) and t.col for the text color. As with nodePar,
each can have length two for
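differentiating leaves and inner nodes.

leaflab: a string specifying how leaves are labeled. The default
"perpendicular" writes text vertically, "textlike" writes text
horizontally (in a rectangle), and "none" suppresses leaf labels.

dLeaf: a number specifying the distance in user coordinates between the
tip of a leaf and its label. If NULL as per default, 3/4 of a letter
width or height is used.

horiz: logical indicating if the dendrogram should be drawn
horizontally or not.

frame.plot: logical indicating if a box around the plot should be
drawn, see plot.default.

h: height at which the tree is cut.

height: height at which the two dendrograms should be merged. If not
specified (or NULL), the default is ten percent larger than the (larger
of the) two component heights.

adjust: a string determining if the leaf values should be adjusted. The
default, "auto", checks if the (first) two dendrograms both start at 1;
if they do, "add.max" is chosen, which adds the maximum of the previous
dendrogram leaf values to each leaf of the “next” dendrogram.
Specifying adjust to another value skips the check and hence is a tad
more efficient.

xlim, ylim: optional x- and y-limits of the plot, passed to
plot.default. The defaults for these show the full dendrogram.

…, xlab, ylab, xaxt, yaxt: graphical parameters, or arguments for other
methods.

digits: integer specifying the precision for printing, see
print.default.

max.level, digits.d, give.attr, wid, nest.lev, indent.str: arguments to
str, see str.default(). Note that give.attr = FALSE still shows height
and members attributes for each node.

last.str, stem: strings used for str() specifying how the last branch
(at each level) should start and the stem to use for each dendrogram
branch. In some environments, using last.str = "'" will provide much
nicer looking output than the historical default last.str = "`".

Some operations on dendrograms such as merge() make use of recursion.
For deep trees it may be necessary to increase options("expressions"):
if you do, you are likely to need to set the C stack size
(Cstack_info()[["size"]]) larger than the default where possible.

The dendrogram is directly represented as a nested list where each
component corresponds to a branch of the tree. Hence, the first branch
of tree z is z[[1]], the second branch of the corresponding subtree is
z[[1]][[2]], or shorter z[[c(1,2)]], etc. Each node of the tree carries
some information needed for efficient plotting or cutting as
attributes, of which only members, height and leaf for leaves are
compulsory: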
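members: total number of leaves in the branch.

height: numeric non-negative height at which the node is plotted.

midpoint: numeric horizontal distance of the node from the left border
(the leftmost leaf) of the branch (unit 1 between all leaves). This is
used for plot(*, center = FALSE).

label: character; the label of the node.

x.member: for cut()$upper, the number of former members; more generally
a substitute for the members component used for ‘horizontal’ (when
horiz = FALSE, else ‘vertical’) alignment.

edgetext: character; the label for the edge leading to the node.

nodePar: a named list (of length-1 components) specifying node-specific
attributes for points plotting, see the nodePar argument above.

edgePar: a named list (of length-1 components) specifying attributes
for segments plotting of the edge leading to the node, and drawing of
the edgetext if available, see the edgePar argument above.

leaf: logical, if TRUE, the node is a leaf of the tree.

cut.dendrogram() returns a list with components $upper and $lower. The
first is a truncated version of the original tree, also of class
dendrogram; the latter is a list with the branches obtained from
cutting the tree, each a dendrogram.

There are [[, print, and str methods for "dendrogram" objects where the
first one (extraction) ensures that selecting sub-branches keeps the
class, i.e., returns a dendrogram even if only a leaf.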
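The return value of cut() and the class-preserving behaviour of [[ can be checked directly. The following is a minimal sketch, assuming a small dendrogram built from the first ten rows of USArrests with average linkage, and an arbitrary cut height of half the root height; these choices are illustrative only.

```r
hc   <- hclust(dist(USArrests[1:10, ]), method = "average")
dend <- as.dendrogram(hc)

# Cut at half the height of the root (an arbitrary illustrative value).
parts <- cut(dend, h = attr(dend, "height") / 2)

# The truncated tree keeps the class ...
inherits(parts$upper, "dendrogram")

# ... and so does every branch below the cut.
all(vapply(parts$lower,
           function(b) inherits(b, "dendrogram"),
           logical(1)))

# Together, the lower branches hold all leaves of the original tree.
sum(sapply(parts$lower, nobs)) == nobs(dend)

# `[[` returns a dendrogram, even when the selected branch is a leaf.
inherits(dend[[1]], "dendrogram")
```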
On the other hand, [ (single bracket) extraction returns the underlying
list structure.

Objects of class "hclust" can be converted to class "dendrogram" using
method as.dendrogram(), and since R 2.13.0, there is also an
as.hclust() method as an inverse.

rev.dendrogram simply returns the dendrogram x with reversed nodes, see
also reorder.dendrogram.

The merge(x, y, ...) method merges two or more dendrograms into a new
one which has x and y (and optional further arguments) as branches.
Note that before R 3.1.2, adjust = "none" was used implicitly, which is
invalid when, e.g., the dendrograms are from as.dendrogram(hclust(..)).

nobs(object) returns the total number of leaves (the members attribute,
see above).

is.leaf(object) returns logical indicating if object is a leaf (the
most simple dendrogram).

plotNode() and plotNodeLimit() are helper functions.

dendrapply for applying a function to each node.
order.dendrogram and reorder.dendrogram; further, the labels method.

require(graphics); require(utils)
hc <- hclust(dist(USArrests), "ave")
(dend1 <- as.dendrogram(hc)) # "print()" method
str(dend1) # "str()" method
str(dend1, max = 2, last.str = "'") # only the first two sub-levels
oo <- options(str.dendrogram.last = "\\") # yet another possibility
str(dend1, max = 2) # only the first two sub-levels
options(oo) # .. resetting them
op <- par(mfrow = c(2,2), mar = c(5,2,1,4))
plot(dend1)
## "triangle" type and show inner nodes:
plot(dend1, nodePar = list(pch = c(1,NA), cex = 0.8, lab.cex = 0.8),
     type = "t", center = TRUE)
plot(dend1, edgePar = list(col = 1:2, lty = 2:3),
     dLeaf = 1, edge.root = TRUE)
plot(dend1, nodePar = list(pch = 2:1, cex = .4*2:1, col = 2:3),
     horiz = TRUE)
## simple test for as.hclust() as the inverse of as.dendrogram():
stopifnot(identical(as.hclust(dend1)[1:4], hc[1:4]))
dend2 <- cut(dend1, h = 70)
plot(dend2$upper)
## leaves are wrong horizontally:
plot(dend2$upper, nodePar = list(pch = c(1,7), col = 2:1))
## dend2$lower is *NOT* a dendrogram, but a list of .. :
plot(dend2$lower[[3]], nodePar = list(col = 4), horiz = TRUE, type = "tr")
## "inner" and "leaf" edges in different type & color :
plot(dend2$lower[[2]], nodePar = list(col = 1), # non empty list
     edgePar = list(lty = 1:2, col = 2:1), edge.root = TRUE)
par(op)
d3 <- dend2$lower[[2]][[2]][[1]]
stopifnot(identical(d3, dend2$lower[[2]][[c(2,1)]]))
str(d3, last.str = "'")
## to peek at the inner structure "if you must", use '[..]' indexing :
str(d3[2][[1]]) ## or the full
str(d3[])
## merge() to join dendrograms:
(d13 <- merge(dend2$lower[[1]], dend2$lower[[3]]))
## merge() all parts back (using default 'height' instead of original one):
den.1 <- Reduce(merge, dend2$lower)
## or merge() all four parts at same height --> 4 branches (!)
d. <- merge(dend2$lower[[1]], dend2$lower[[2]], dend2$lower[[3]],
            dend2$lower[[4]])
## (with a warning) or the same using do.call :
stopifnot(identical(d., do.call(merge, dend2$lower)))
plot(d., main = "merge(d1, d2, d3, d4) |-> dendrogram with a 4-split")
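The role of the height argument of merge() can be sketched separately. This is a minimal illustration, assuming two branches obtained by cutting a small dendrogram (first ten rows of USArrests, average linkage, cut at half the root height); the object names a, b, m_default and m_fixed are made up for this example.

```r
hc    <- hclust(dist(USArrests[1:10, ]), method = "average")
dend  <- as.dendrogram(hc)
parts <- cut(dend, h = attr(dend, "height") / 2)$lower
a <- parts[[1]]
b <- parts[[2]]

# Default: ten percent above the larger of the two component heights.
m_default <- merge(a, b)
attr(m_default, "height")
1.1 * max(attr(a, "height"), attr(b, "height"))

# Explicit height, here the root height of the original tree.
m_fixed <- merge(a, b, height = attr(dend, "height"))
attr(m_fixed, "height")

# The merged tree has the leaves of both branches.
nobs(m_default) == nobs(a) + nobs(b)
```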
## "Zoom" in to the first dendrogram :
plot(dend1, xlim = c(1,20), ylim = c(1,50))
nP <- list(col = 3:2, cex = c(2.0, 0.75), pch = 21:22,
           bg = c("light blue", "pink"),
           lab.cex = 0.75, lab.col = "tomato")
plot(d3, nodePar= nP, edgePar = list(col = "gray", lwd = 2), horiz = TRUE)
## now add some "edgetext" :
addE <- function(n) {
    if(!is.leaf(n)) {
        attr(n, "edgePar") <- list(p.col = "plum")
        attr(n, "edgetext") <- paste(attr(n,"members"),"members")
    }
    n
}
d3e <- dendrapply(d3, addE)
plot(d3e, nodePar = nP)
plot(d3e, nodePar = nP, leaflab = "textlike")
## BUG: edge labeling *and* leaflab = "textlike" both fail with horiz = TRUE:
## plot(d3e, nodePar = nP, leaflab = "textlike", horiz = TRUE)
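As a closing illustration of the nested-list representation and the node attributes, here is a sketch that walks a dendrogram by hand. The helper leaf_labels() is hypothetical and written only for this example; in practice the labels method or dendrapply does the same job. The data (first six rows of USArrests) and the linkage method are likewise arbitrary choices.

```r
hc <- hclust(dist(USArrests[1:6, ]), method = "average")
z  <- as.dendrogram(hc)

# Compulsory attributes of the root node.
attr(z, "members")   # number of leaves below the root
attr(z, "height")    # height at which the root is plotted

# Hypothetical helper: collect leaf labels from left to right by
# descending through the branches with `[[`.
leaf_labels <- function(node) {
  if (is.leaf(node)) {
    return(attr(node, "label"))
  }
  unlist(lapply(seq_along(node), function(i) leaf_labels(node[[i]])))
}

leaf_labels(z)   # the leaf labels, as also returned by labels(z)
labels(rev(z))   # leaf labels of the dendrogram with reversed nodes
```

Recursing with [[ rather than [ keeps each sub-branch a dendrogram, so is.leaf() and attr() can be applied at every level in the same way.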