
GMD (version 0.3.3)

heatmap.3: Enhanced Heatmap Representation with Dendrogram and Partition

Description

Enhanced heatmap representation with dendrograms and partitions, determined by the elbow criterion or by a desired number of clusters. It provides: 1) dendrograms added to the left side and to the top, according to cluster analysis; 2) partitions shown as highlighted rectangles, according to the "elbow" rule or a desired number of clusters.
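
As a rough illustration of the "elbow" rule mentioned above, the following base-R sketch computes the explained variance for increasing numbers of clusters and picks the first k that passes two thresholds. It does not call GMD. The selection rule and the names `ev.thres`, `inc.thres` and `k.elbow` are assumptions made for illustration (the threshold values mirror those used in the Examples), so GMD's own elbow functions may behave differently.

```r
# Base-R sketch of an elbow-style choice of k (not GMD's implementation).
x  <- scale(as.matrix(mtcars))                 # standardized observations
hc <- hclust(dist(x), method = "ward.D")

# Sum of squared deviations from the column means of a matrix
within.ss <- function(m) sum(scale(m, center = TRUE, scale = FALSE)^2)
tss <- within.ss(x)                            # total sum of squares

# Explained variance for k = 1..10: 1 - (within-cluster SS / total SS)
ks <- 1:10
ev <- sapply(ks, function(k) {
  cl  <- cutree(hc, k = k)
  wss <- sum(sapply(split(seq_len(nrow(x)), cl),
                    function(i) within.ss(x[i, , drop = FALSE])))
  1 - wss / tss
})

# Assumed rule: first k that explains enough variance and for which
# adding one more cluster would gain less than inc.thres.
ev.thres  <- 0.90
inc.thres <- 0.05
inc <- c(diff(ev), 0)                          # gain from k to k + 1
ok  <- which(ev >= ev.thres & inc < inc.thres)
k.elbow <- if (length(ok) > 0) ks[ok[1]] else NA_integer_
```

If no k in the tested range satisfies both thresholds, `k.elbow` is NA, which signals that the range of k or the thresholds should be revisited.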

Usage

heatmap.3(x, diss = inherits(x, "dist"), Rowv = TRUE, Colv = TRUE, dendrogram = c("both", "row", "column", "none"), dist.row, dist.col, dist.FUN = gdist, dist.FUN.MoreArgs = list(method = "euclidean"), hclust.row, hclust.col, hclust.FUN = hclust, hclust.FUN.MoreArgs = list(method = "ward"), scale = c("none", "row", "column"), na.rm = TRUE, cluster.by.row = FALSE, cluster.by.col = FALSE, kr = NA, kc = NA, row.clusters = NA, col.clusters = NA, revR = FALSE, revC = FALSE, add.expr, breaks, x.center, color.FUN = gplots::bluered, sepList = list(NULL, NULL), sep.color = c("gray45", "gray45"), sep.lty = 1, sep.lwd = 2, cellnote, cex.note = 1, notecol = "cyan", na.color = par("bg"), trace = c("none", "column", "row", "both"), tracecol = "cyan", hline, vline, linecol = tracecol, labRow = TRUE, labCol = TRUE, srtRow = NULL, srtCol = NULL, sideRow = 4, sideCol = 1, margin.for.labRow, margin.for.labCol, ColIndividualColors, RowIndividualColors, cexRow, cexCol, labRow.by.group = FALSE, labCol.by.group = FALSE, key = TRUE, key.title = "Color Key", key.xlab = "Value", key.ylab = "Count", keysize = 1.5, mapsize = 9, mapratio = 4/3, sidesize = 3, cex.key.main = 0.75, cex.key.xlab = 0.75, cex.key.ylab = 0.75, density.info = c("histogram", "density", "none"), denscol = tracecol, densadj = 0.25, main = "Heatmap", sub = "", xlab = "", ylab = "", cex.main = 2, cex.sub = 1.5, font.main = 2, font.sub = 3, adj.main = 0.5, mgp.main = c(1.5, 0.5, 0), mar.main = 3, mar.sub = 3, if.plot = TRUE, plot.row.partition = FALSE, plot.col.partition = FALSE, cex.partition = 1.25, color.partition.box = "gray45", color.partition.border = "#FFFFFF", plot.row.individuals = FALSE, plot.col.individuals = FALSE, plot.row.clusters = FALSE, plot.col.clusters = FALSE, plot.row.clustering = FALSE, plot.col.clustering = FALSE, plot.row.individuals.list = FALSE, plot.col.individuals.list = FALSE, plot.row.clusters.list = FALSE, plot.col.clusters.list = FALSE, plot.row.clustering.list = FALSE, 
plot.col.clustering.list = FALSE, row.data = FALSE, col.data = FALSE, if.plot.info = FALSE, text.box, cex.text = 1, ...)

Arguments

x
data matrix or data frame, or dissimilarity matrix or `dist' object, as determined by the value of the 'diss' argument.
diss
logical, whether x is a dissimilarity matrix. If TRUE (the default when x is a `dist' object), x is assumed to be a dissimilarity matrix; if FALSE, x is treated as a matrix of observations by variables.
Rowv
one of the following: TRUE, a `dend' object, a vector or NULL/FALSE; determines if and how the row dendrogram should be reordered.
Colv
one of the following: "Rowv", TRUE, a `dend' object, a vector or NULL/FALSE; determines if and how the column dendrogram should be reordered.
dendrogram
character string indicating whether to draw 'none', 'row', 'column' or 'both' dendrograms. Defaults to 'both'.
dist.row
a dist object for row observations.
dist.col
a dist object for column observations.
dist.FUN
function used to compute the distance (dissimilarity) between both rows and columns. Defaults to gdist.
dist.FUN.MoreArgs
a list of other arguments to be passed to dist.FUN. Defaults to list(method="euclidean").
hclust.row
a hclust object (as produced by hclust) for row observations.
hclust.col
a hclust object (as produced by hclust) for column observations.
hclust.FUN
function used to compute the hierarchical clustering when "Rowv" or "Colv" are not dendrograms. Defaults to hclust.
hclust.FUN.MoreArgs
a list of other arguments to be passed to hclust. Defaults to list(method="ward")
scale
character indicating if the values should be centered and scaled in either the row direction or the column direction, or none. The default is "none".
na.rm
logical, whether NA values will be removed when scaling.
cluster.by.row
logical, whether to cluster row observations and reorder the input accordingly.
cluster.by.col
logical, whether to cluster column observations and reorder the input accordingly.
kr
numeric, number of clusters in rows; ignored when row.clusters is specified. Defaults to NA.
kc
numeric, number of clusters in columns; ignored when col.clusters is specified. Defaults to NA.
row.clusters
a numerical vector, indicating the cluster labels of row observations.
col.clusters
a numerical vector, indicating the cluster labels of column observations.
revR
logical indicating if the row order should be 'rev'ersed for plotting.
revC
logical indicating if the column order should be 'rev'ersed for plotting, such that e.g., for the symmetric case, the symmetry axis is as usual.
add.expr
expression that will be evaluated after the call to image. Can be used to add components to the plot.
breaks
numeric, either a numeric vector indicating the splitting points for binning x into colors, or an integer number of break points to be used, in which case the break points will be spaced equally across range(x). Defaults to 16 when not specified.
x.center
numeric, the value of x on which to center the colors.
color.FUN
function, or function name as a character string, that generates the colors of the heatmap.
sepList
a list of length 2 giving the row and column lines of separation.
sep.color
color for lines of separation.
sep.lty
line type for lines of separation.
sep.lwd
line width for lines of separation.
cellnote
(optional) matrix of character strings which will be placed within each color cell, e.g. cell labels or p-value symbols.
cex.note
relative font size of cellnote.
notecol
color of cellnote.
na.color
Color to use for missing value (NA). Defaults to the plot background color.
trace
character string indicating whether a solid "trace" line should be drawn across "row"s or down "column"s, "both" or "none". The distance of the line from the center of each color-cell is proportional to the size of the measurement. Defaults to "none".
tracecol
character string giving the color for the "trace" line. Defaults to "cyan".
hline
Vector of values within cells where a horizontal dotted line should be drawn; only drawn if 'trace' is 'row' or 'both'. Defaults to the median of the breaks.
vline
Vector of values within cells where a vertical dotted line should be drawn; only drawn if 'trace' is 'column' or 'both'. Defaults to the median of the breaks.
linecol
the color of hline and vline. Defaults to the value of 'tracecol'.
labRow
character vectors with row labels to use; defaults to rownames(x).
labCol
character vectors with column labels to use; defaults to colnames(x).
srtRow
numerical, specifying (in degrees) how row labels should be rotated. See help("par", package="graphics").
srtCol
numerical, specifying (in degrees) how col labels should be rotated. See help("par", package="graphics").
sideRow
2 or 4, the side on which row labels are displayed.
sideCol
1 or 3, the side on which column labels are displayed.
margin.for.labRow
a numerical value giving the margin for plotting labRow.
margin.for.labCol
a numerical value giving the margin for plotting labCol.
ColIndividualColors
(optional) character vector of length ncol(x) containing the color names for a horizontal side bar that may be used to annotate the columns of x.
RowIndividualColors
(optional) character vector of length nrow(x) containing the color names for a vertical side bar that may be used to annotate the rows of x.
cexRow
positive number, used as 'cex.axis' for the row axis labeling. The default currently only uses the number of rows.
cexCol
positive number, used as 'cex.axis' for the column axis labeling. The default currently only uses the number of columns.
labRow.by.group
logical, whether to group unique labels for rows.
labCol.by.group
logical, whether to group unique labels for columns.
key
logical indicating whether a color-key should be shown.
key.title
character, title of the color-key ["Color Key"]
key.xlab
character, xlab of the color-key ["Value"]
key.ylab
character, ylab of the color-key ["Count"]
keysize
numeric value indicating the relative size of the key
mapsize
numeric value indicating the relative size of the heatmap.
mapratio
the width-to-height ratio of the heatmap.
sidesize
numeric value indicating the relative size of the sidebars.
cex.key.main
a numerical value giving the amount by which main-title of color-key should be magnified relative to the default.
cex.key.xlab
a numerical value giving the amount by which xlab of color-key should be magnified relative to the default.
cex.key.ylab
a numerical value giving the amount by which ylab of color-key should be magnified relative to the default.
density.info
character string indicating whether to superimpose a 'histogram', a 'density' plot, or no plot ('none') on the color-key.
denscol
character string giving the color for the density display specified by 'density.info', defaults to the same value as 'tracecol'.
densadj
Numeric scaling value for tuning the kernel width when a density plot is drawn on the color key. (See the 'adjust' parameter for the 'density' function for details.) Defaults to 0.25.
main
an overall title for the plot. See help("title", package="graphics").
sub
a subtitle for the plot, describing the distance and/or alignment gap (the "shift").
xlab
a title for the x axis. See help("title", package="graphics").
ylab
a title for the y axis. See help("title", package="graphics").
cex.main
a numerical value giving the amount by which main-title should be magnified relative to the default.
cex.sub
a numerical value giving the amount by which sub-title should be magnified relative to the default.
font.main
An integer which specifies which font to use for main-title.
font.sub
An integer which specifies which font to use for sub-title.
adj.main
The value of 'adj' determines the way in which main-title strings are justified.
mgp.main
the margin line (in 'mex' units) for the main-title.
mar.main
a numerical vector of the form c(bottom, left, top, right) which gives the number of lines of margin to be specified on the four sides of the main-title.
mar.sub
a numerical vector of the form c(bottom, left, top, right) which gives the number of lines of margin to be specified on the four sides of the sub-title.
if.plot
logical, whether to plot. Reordered matrix is returned without graphical output if FALSE.
plot.row.partition
logical, whether to plot row partition.
plot.col.partition
logical, whether to plot column partition.
cex.partition
a numerical value giving the amount by which partition should be magnified relative to the default.
color.partition.box
color for the partition box.
color.partition.border
color for the partition border.
plot.row.individuals
logical, whether to make a plot of row observations.
plot.col.individuals
logical, whether to make a plot of column observations.
plot.row.clusters
logical, whether to make a summary plot of row clusters.
plot.col.clusters
logical, whether to make a summary plot of column clusters.
plot.row.clustering
logical, whether to make a summary plot of overall row clustering.
plot.col.clustering
logical, whether to make a summary plot of overall column clustering.
plot.row.individuals.list
a list of expressions that is used to plot.row.individuals
plot.col.individuals.list
a list of expressions that is used to plot.col.individuals
plot.row.clusters.list
a list of expressions that is used to plot.row.clusters
plot.col.clusters.list
a list of expressions that is used to plot.col.clusters
plot.row.clustering.list
a list of expressions that is used to plot.row.clustering
plot.col.clustering.list
a list of expressions that is used to plot.col.clustering
row.data
(optional) data used to plot.row.individuals, plot.row.clusters or plot.row.clustering
col.data
(optional) data used to plot.col.individuals, plot.col.clusters or plot.col.clustering
if.plot.info
logical, whether to plot text.box.
text.box
character plotted when if.plot.info is TRUE.
cex.text
a numerical value giving the amount by which text.box should be magnified relative to the default.
...
arguments to be passed to method heatmap.3. See help("image", package="graphics").
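
The interplay of scale, breaks, x.center and color.FUN described above can be approximated in base R. The sketch below is only an illustration of the ideas and does not call heatmap.3; treating x.center as "make the break points symmetric around that value" is an assumption, and `n.breaks`, `breaks.centered`, `cols` and `cell.colors` are names introduced here.

```r
x <- as.matrix(mtcars)

# Analogue of scale = "column": center each column and divide by its sd.
x.scaled <- scale(x)

# Analogue of a single-number 'breaks': equally spaced points over range(x).
n.breaks <- 16
breaks   <- seq(min(x.scaled, na.rm = TRUE), max(x.scaled, na.rm = TRUE),
                length.out = n.breaks)

# Assumed analogue of x.center = 0: break points symmetric around 0.
lim             <- max(abs(x.scaled), na.rm = TRUE)
breaks.centered <- seq(-lim, lim, length.out = n.breaks)

# n break points delimit n - 1 color bins.
cols <- colorRampPalette(c("blue", "white", "red"))(n.breaks - 1)

# Bin index of every cell, then the color assigned to it.
bins <- cut(x.scaled, breaks = breaks.centered,
            include.lowest = TRUE, labels = FALSE)
cell.colors <- matrix(cols[bins], nrow = nrow(x), dimnames = dimnames(x))
```

Passing a numeric vector as breaks instead of a single number corresponds to supplying `breaks.centered` (or any other set of splitting points) directly.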

Value

A matrix reordered according to the row and/or column dendrogram(s), together with the indices that were used for reordering.
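
To make the idea of a reordered matrix plus reordering indices concrete, here is a minimal base-R sketch. It is not the code heatmap.3 runs, and the exact names of the returned components are not documented here; `rowInd` follows the name used in the Examples, while `colInd` and `x.reordered` are assumptions.

```r
x <- as.matrix(mtcars)

# Cluster rows and columns separately.
hc.row <- hclust(dist(x),    method = "ward.D")
hc.col <- hclust(dist(t(x)), method = "ward.D")

# Leaf order of each tree gives the permutation used for reordering.
rowInd <- hc.row$order
colInd <- hc.col$order

# The reordered matrix: rows and columns follow the dendrogram leaves.
x.reordered <- x[rowInd, colInd]
```

Keeping the indices alongside the matrix makes it possible to reorder companion data (for example row annotations) in the same way.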

Details

Enhanced heatmap representation with partition and (optional) summary statistics. This is an enhanced version of the `heatmap.2' function in the package gplots. The enhancements include: 1) Improved performance with optional input of precomputed dist and hclust objects. 2) Highlighting of specific cells using rectangles, for instance the cells of clusters of interest. (Examples should be included in the future.) 3) Add-on plots in addition to the heatmap, such as cluster-wise summary plots and overall clustering summary plots, to the right of or under the heatmap.
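
The first enhancement relies on objects that can be computed once and reused. The sketch below builds such objects with base R only; it assumes stats::dist in place of gdist and uses method = "ward.D", the name under which hclust has provided the former "ward" method since R 3.1.0. The variable names `d.row`, `hc.row` and `row.clusters` are chosen here to echo the corresponding arguments.

```r
x <- as.matrix(mtcars)

# Distances between row observations, computed once.
d.row <- dist(x, method = "euclidean")

# Hierarchical clustering built from the precomputed distances.
hc.row <- hclust(d.row, method = "ward.D")

# Cluster labels for a desired number of clusters (here 3).
row.clusters <- cutree(hc.row, k = 3)
table(row.clusters)
```

Objects of these kinds are what the dist.row, hclust.row and row.clusters arguments are documented to accept, so repeated calls with different graphical settings need not recompute them.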

Examples

## ------------------------------------------------------------------------
## Example1: mtcars
## ------------------------------------------------------------------------
## load library
require("GMD")

## load data
data(mtcars)

## heatmap on raw data
x  <- as.matrix(mtcars)

dev.new(width=10,height=8)
heatmap.3(x)                               # default, with reordering and dendrogram
## Not run: 
# heatmap.3(x, Rowv=FALSE, Colv=FALSE)       # no reordering and no dendrogram
# heatmap.3(x, dendrogram="none")            # reordering without dendrogram
# heatmap.3(x, dendrogram="row")        # row dendrogram with row (and col) reordering
# heatmap.3(x, dendrogram="row", Colv=FALSE) # row dendrogram with only row reordering
# heatmap.3(x, dendrogram="col")             # col dendrogram
# heatmap.3(x, dendrogram="col", Rowv=FALSE) # col dendrogram with only col reordering
# heatmapOut <-
#   heatmap.3(x, scale="column")             # scaled by column
# names(heatmapOut)                          # view the list that is returned
# heatmap.3(x, scale="column", x.center=0)   # colors centered around 0
# heatmap.3(x, scale="column",trace="column")  # turn "trace" on
# ## End(Not run)

## coloring cars (row observations) by brand
brands <- sapply(rownames(x), function(e) strsplit(e," ")[[1]][1])
names(brands) <- NULL                      # drop the names inherited from rownames(x)
brands.index <- as.numeric(as.factor(brands))
RowIndividualColors <- rainbow(max(brands.index))[brands.index]
heatmap.3(x, scale="column", RowIndividualColors=RowIndividualColors)

## coloring attributes (column features) randomly (just for a test :)
heatmap.3(x, scale="column", ColIndividualColors=rainbow(ncol(x)))

## add a single plot for all row individuals
dev.new(width=12,height=8)
expr1 <- list(quote(plot(row.data[rowInd,"hp"],1:nrow(row.data),
                         xlab="hp",ylab="",yaxt="n",main="Gross horsepower")),
              quote(axis(2,1:nrow(row.data),rownames(row.data)[rowInd],las=2)))
heatmap.3(x, scale="column", plot.row.individuals=TRUE, row.data=x,
          plot.row.individuals.list=list(expr1))


## ------------------------------------------------------------------------
## Example2: ruspini
## ------------------------------------------------------------------------
## load library
require("GMD")
require(cluster)

## load data
data(ruspini)

## heatmap on a `dist' object
x <- gdist(ruspini)
main <- "Heatmap of Ruspini data"
dev.new(width=10,height=10)
heatmap.3(x, main=main, mapratio=1) # with a title and a map in square!
## Not run: 
# heatmap.3(x, main=main, revC=TRUE)  # reverse column for a symmetric look
# heatmap.3(x, main=main, kr=2, kc=2) # partition by predefined number of clusters
# ## End(Not run)
## show partition by elbow
css.multi.obj <- css.hclust(x,hclust(x))
elbow.obj <- elbow.batch(css.multi.obj,ev.thres=0.90,inc.thres=0.05)
heatmap.3(x, main=main, revC=TRUE, kr=elbow.obj$k, kc=elbow.obj$k)

## Not run: 
# ## show elbow info as subtitle
# heatmap.3(x, main=main, sub=sub("\n"," ",attr(elbow.obj,"description")),
# cex.sub=1.25,revC=TRUE,kr=elbow.obj$k, kc=elbow.obj$k)
# ## End(Not run)
