Display taxa abundances as a heatmap.
taxa_heatmap(
  biom,
  rank = -1,
  taxa = 6,
  tracks = NULL,
  grid = "bilbao",
  other = FALSE,
  unc = "singly",
  lineage = FALSE,
  label = TRUE,
  label_size = NULL,
  rescale = "none",
  trees = TRUE,
  clust = "complete",
  dist = "euclidean",
  asp = 1,
  tree_height = 10,
  track_height = 10,
  legend = "right",
  title = TRUE,
  xlab.angle = "auto",
  ...
)
A ggplot2 plot. The computed data points and ggplot
command are available as $data and $code,
respectively.
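As a minimal sketch of the return value described above, the call below stores the plot and then inspects its $data and $code fields. It assumes the rbiom package and its bundled hmp50 dataset are available (the block is skipped otherwise). The exact structure of $data is not specified here, so it is only previewed with head().

```r
# Assumes rbiom is installed; the block is skipped otherwise.
if (requireNamespace("rbiom", quietly = TRUE)) {
  library(rbiom)

  hmp10 <- rarefy(hmp50, n = 10)
  p     <- taxa_heatmap(hmp10, rank = "Phylum")

  print(head(p$data))  # the computed data points behind the plot
  print(p$code)        # the ggplot command that draws it
}
```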
biom - An rbiom object, such as from as_rbiom().
Any value accepted by as_rbiom() can also be given here.
rank - What rank(s) of taxa to display. E.g. "Phylum",
"Genus", ".otu", etc. An integer vector can also be
given, where 1 is the highest rank, 2 is the second
highest, -1 is the lowest rank, -2 is the second
lowest, and 0 is the OTU "rank". Run biom$ranks to
see all options for a given rbiom object. Default: -1.
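The integer indexing convention can be illustrated in base R. This is only a sketch of the convention as described, not rbiom's implementation: the ranks vector and the resolve_rank() helper are hypothetical, and a real object's ranks should be read from biom$ranks.

```r
# Hypothetical rank names, ordered from highest to lowest.
ranks <- c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus")

# Hypothetical helper: map one integer index to a rank name.
resolve_rank <- function(i, ranks) {
  if (i == 0) return(".otu")     # 0 is the OTU "rank"
  if (i > 0)  return(ranks[i])   # count down from the highest rank
  ranks[length(ranks) + 1 + i]   # count back from the lowest rank
}

sapply(c(1, 2, -1, -2, 0), resolve_rank, ranks = ranks)
# "Kingdom" "Phylum" "Genus" "Family" ".otu"
```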
taxa - Which taxa to display. An integer value will show the top n
most abundant taxa. A value 0 <= n < 1 will show any taxa with that
mean abundance or greater (e.g. 0.1 implies >= 10%). A
character vector of taxa names will show only those named taxa.
Default: 6.
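The three selection modes can be sketched in base R on a made-up vector of mean relative abundances. The select_taxa() helper and the abundance values are hypothetical and only mirror the rules stated above. They are not taken from rbiom's source.

```r
# Hypothetical mean relative abundances per taxon.
mean_abund <- c(Firmicutes = 0.45, Bacteroidetes = 0.30, Proteobacteria = 0.15,
                Actinobacteria = 0.07, Fusobacteria = 0.03)

# Hypothetical helper mirroring the documented rules for `taxa`.
select_taxa <- function(mean_abund, taxa) {
  if (is.character(taxa))                     # named taxa only
    return(intersect(taxa, names(mean_abund)))
  if (taxa >= 1)                              # top n most abundant
    return(names(head(sort(mean_abund, decreasing = TRUE), taxa)))
  names(mean_abund)[mean_abund >= taxa]       # mean abundance threshold
}

select_taxa(mean_abund, 2)     # "Firmicutes" "Bacteroidetes"
select_taxa(mean_abund, 0.1)   # "Firmicutes" "Bacteroidetes" "Proteobacteria"
select_taxa(mean_abund, c("Fusobacteria", "Firmicutes"))
```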
tracks - A character vector of metadata fields to display as tracks
at the top of the plot. Or, a list as expected by the tracks
argument of plot_heatmap(). Default: NULL
grid - Color palette name, or a list as expected by plot_heatmap().
Default: "bilbao"
other - Sum all non-itemized taxa into an "Other" taxon. When
FALSE, only returns taxa matched by the taxa
argument. Specifying TRUE adds "Other" to the returned set.
A string can also be given to imply TRUE, but with that
value as the name to use instead of "Other".
Default: FALSE
unc - How to handle unclassified, uncultured, and similarly ambiguous taxa names. Options are:
"singly" - Replaces them with the OTU name.
"grouped" - Replaces them with a higher rank's name.
"drop" - Excludes them from the result.
"asis" - To not check/modify any taxa names.
Abbreviations are allowed. Default: "singly"
lineage - Include all ranks in the name of the taxa. For instance,
setting to TRUE will produce
Bacteria; Actinobacteria; Coriobacteriia; Coriobacteriales.
Otherwise the taxa name will simply be Coriobacteriales. Set
this to TRUE when unc = "asis" and you have taxa
names (such as Incertae_Sedis) that map to multiple higher
level ranks. Default: FALSE
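The other, unc, and lineage arguments are often combined. The sketch below uses only the options documented above. It assumes rbiom and its hmp50 dataset are available (it is skipped otherwise), and the row name "Remaining" is an arbitrary illustrative choice.

```r
# Assumes rbiom is installed; the block is skipped otherwise.
if (requireNamespace("rbiom", quietly = TRUE)) {
  library(rbiom)
  hmp10 <- rarefy(hmp50, n = 10)

  # Top 8 genera, with all other taxa summed into a row named "Remaining".
  # Ambiguous names are replaced with a higher rank's name.
  p_grouped <- taxa_heatmap(hmp10, rank = "Genus", taxa = 8,
                            other = "Remaining", unc = "grouped")

  # Leave taxa names unmodified, but show full lineages so that
  # identically named taxa from different parents stay distinguishable.
  p_asis <- taxa_heatmap(hmp10, rank = "Genus", taxa = 8,
                         unc = "asis", lineage = TRUE)
}
```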
label - Label the matrix rows and columns. You can supply a list
or logical vector of length two to control row labels and column
labels separately, for example
label = c(rows = TRUE, cols = FALSE), or simply
label = c(TRUE, FALSE). Other valid options are "rows",
"cols", "both", "bottom", "right",
and "none".
Default: TRUE
label_size - The font size to use for the row and column labels. You
can supply a numeric vector of length two to control row label sizes
and column label sizes separately, for example
c(rows = 20, cols = 8), or simply c(20, 8).
Default: NULL, which computes:
pmax(8, pmin(20, 100 / dim(mtx)))
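To see what the default formula yields, it can be evaluated in base R. Here mtx is assumed to be the matrix being plotted (rows by columns), and the default_label_size() wrapper and the matrix sizes are hypothetical. The formula clamps each label size to the range 8 to 20.

```r
# Hypothetical wrapper around the documented default formula.
default_label_size <- function(mtx) pmax(8, pmin(20, 100 / dim(mtx)))

# 6 rows x 10 columns: row labels ~16.67, column labels 10.
default_label_size(matrix(0, nrow = 6, ncol = 10))

# 3 rows x 50 columns: both values hit the clamps, giving 20 and 8.
default_label_size(matrix(0, nrow = 3, ncol = 50))
```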
rescale - Rescale rows or columns to all have a common min/max.
Options: "none", "rows", or "cols".
Default: "none"
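The effect of rescaling can be sketched in base R. This assumes that "common min/max" means mapping each row (or column) onto the range 0 to 1. The target range and implementation that rbiom actually uses are not stated here, and the matrix and rescale01() helper are hypothetical.

```r
# Hypothetical abundance matrix: taxa in rows, samples in columns.
mtx <- matrix(c(1, 5, 9, 10, 20, 40), nrow = 2, byrow = TRUE,
              dimnames = list(c("TaxonA", "TaxonB"), c("S1", "S2", "S3")))

# Min-max rescaling onto [0, 1]. A constant vector would divide by zero.
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))

by_rows <- t(apply(mtx, 1, rescale01))  # each taxon spans 0..1 across samples
by_cols <- apply(mtx, 2, rescale01)     # each sample spans 0..1 across taxa
```

Rescaling by rows makes low-abundance taxa as visible as dominant ones, at the cost of hiding differences in absolute abundance between taxa.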
trees - Draw a dendrogram for rows (left) and columns (top). You can
supply a list or logical vector of length two to control the row tree
and column tree separately, for example
trees = c(rows = TRUE, cols = FALSE),
or simply trees = c(TRUE, FALSE).
Other valid options are "rows", "cols", "both",
"left", "top", and "none".
Default: TRUE
clust - Clustering algorithm for reordering the rows and columns by
similarity. You can supply a list or character vector of length two to
control the row and column clustering separately, for example
clust = c(rows = "complete", cols = NA), or simply
clust = c("complete", NA). Options are:
FALSE or NA - Disable reordering.
hclust class object - E.g. from stats::hclust().
"ward.D",
"ward.D2", "single", "complete",
"average", "mcquitty", "median", or
"centroid".
Default: "complete"
dist - Distance algorithm to use when reordering the rows and columns
by similarity. You can supply a list or character vector of length
two to control the row and column clustering separately, for example
dist = c(rows = "euclidean", cols = "maximum"), or simply
dist = c("euclidean", "maximum"). Options are:
dist class object - E.g. from stats::dist() or bdiv_distmat().
"euclidean",
"maximum", "manhattan", "canberra",
"binary", or "minkowski".
Default: "euclidean"
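The clust and dist options correspond to the method arguments of stats::hclust() and stats::dist(). The base R sketch below shows how such a reordering can be computed on a made-up matrix, with separate distance methods for rows and columns. It illustrates the general technique and is not necessarily how rbiom applies it internally.

```r
set.seed(1)

# Hypothetical abundance matrix: 4 taxa (rows) x 6 samples (columns).
mtx <- matrix(runif(24), nrow = 4,
              dimnames = list(paste0("Taxon", 1:4), paste0("S", 1:6)))

# Rows: euclidean distance, complete linkage (the documented defaults).
row_hc <- stats::hclust(stats::dist(mtx, method = "euclidean"),
                        method = "complete")

# Columns: cluster the transposed matrix, here with "maximum" distance.
col_hc <- stats::hclust(stats::dist(t(mtx), method = "maximum"),
                        method = "complete")

# Reorder so that similar rows and similar columns sit next to each other.
reordered <- mtx[row_hc$order, col_hc$order]
```

Objects such as row_hc are of class hclust, which is the kind of object the clust option accepts in place of a method name.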
asp - Aspect ratio (height/width) for the entire grid.
Default: 1 (square)
tree_height, track_height - The height of the dendrogram or annotation
tracks as a percentage of the overall grid size. Use a numeric vector
of length two to assign c(top, left) independently.
Default: 10 (10% of the grid's height)
legend - Where to place the legend. Options are: "right" or
"bottom". Default: "right"
title - Plot title. Set to TRUE for a default title, NULL for
no title, or any character string. Default: TRUE
xlab.angle - Angle of the labels at the bottom of the plot.
Options are "auto", '0', '30', and '90'.
Default: "auto".
... - Additional arguments to pass on to ggplot2::theme().
Metadata can be displayed as colored tracks above the heatmap. Common use cases are provided below, with more thorough documentation available at https://cmmr.github.io/rbiom.
## Categorical ----------------------------
tracks = "Body Site"
tracks = list('Body Site' = "bright")
tracks = list('Body Site' = c('Stool' = "blue", 'Saliva' = "green"))
## Numeric --------------------------------
tracks = "Age"
tracks = list('Age' = "reds")
## Multiple Tracks ------------------------
tracks = c("Body Site", "Age")
tracks = list('Body Site' = "bright", 'Age' = "reds")
tracks = list(
  'Body Site' = c('Stool' = "blue", 'Saliva' = "green"),
  'Age'       = list('colors' = "reds") )
The following entries in the track definitions are understood:
colors - A pre-defined palette name or custom set of colors to map to.
range - The c(min,max) to use for scale values.
label - Label for this track. Defaults to the name of this list element.
side - Options are "top" (default) or "left".
na.color - The color to use for NA values.
bins - Bin a gradient into this many bins/steps.
guide - A list of arguments for guide_colorbar() or guide_legend().
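Several of these entries can be combined in one track definition. The sketch below builds such a list using only the entry names listed above. The specific range, bin count, and NA color are illustrative assumptions rather than recommended values, and the plotting step assumes rbiom and its hmp50 dataset are available (it is skipped otherwise).

```r
# Track definitions built from the documented entries.
# The values chosen here are illustrative only.
tracks <- list(
  'Body Site' = list(colors = "muted", label = "Source"),
  'Age'       = list(colors = "reds", range = c(20, 40), bins = 4,
                     na.color = "grey80")
)

# Assumes rbiom is installed; the plotting step is skipped otherwise.
if (requireNamespace("rbiom", quietly = TRUE)) {
  library(rbiom)
  hmp10 <- rarefy(hmp50, n = 10)
  p     <- taxa_heatmap(hmp10, rank = "Phylum", tracks = tracks)
}
```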
All built-in color palettes are colorblind-friendly.
Categorical palette names: "okabe", "carto", "r4",
"polychrome", "tol", "bright", "light",
"muted", "vibrant", "tableau", "classic",
"alphabet", "tableau20", "kelly", and "fishy".
Numeric palette names: "reds", "oranges", "greens",
"purples", "grays", "acton", "bamako",
"batlow", "bilbao", "buda", "davos",
"devon", "grayC", "hawaii", "imola",
"lajolla", "lapaz", "nuuk", "oslo",
"tokyo", "turku", "bam", "berlin",
"broc", "cork", "lisbon", "roma",
"tofino", "vanimo", and "vik".
Other taxa_abundance:
sample_sums(),
taxa_boxplot(),
taxa_clusters(),
taxa_corrplot(),
taxa_stacked(),
taxa_stats(),
taxa_sums(),
taxa_table()
Other visualization:
adiv_boxplot(),
adiv_corrplot(),
bdiv_boxplot(),
bdiv_corrplot(),
bdiv_heatmap(),
bdiv_ord_plot(),
plot_heatmap(),
rare_corrplot(),
rare_multiplot(),
rare_stacked(),
stats_boxplot(),
stats_corrplot(),
taxa_boxplot(),
taxa_corrplot(),
taxa_stacked()
library(rbiom)
# Keep and rarefy the 10 most deeply sequenced samples.
hmp10 <- rarefy(hmp50, n = 10)
taxa_heatmap(hmp10, rank = "Phylum", tracks = "Body Site")
taxa_heatmap(hmp10, rank = "Genus", tracks = c("sex", "bo"))
taxa_heatmap(hmp10, rank = "Phylum", tracks = list(
  'Sex'       = list(colors = c(m = "#0000FF", f = "violetred")),
  'Body Site' = list(colors = "muted", label = "Source") ))