pheatmap.multilevel: Clustered heatmap

Description

Draws clustered heatmaps, using code adapted from the pheatmap package, for results of the sPLS-DA multilevel functions. Only one-factor and two-factor designs are supported.

Usage

## S3 method for class 'splsda1fact':
pheatmap.multilevel(
result, 
cluster = NULL, 
color = colorRampPalette(rev(c("#D73027", "#FC8D59", "#FEE090", 
"#FFFFBF", "#E0F3F8", "#91BFDB", "#4575B4")))(100), 
col_sample = NULL, col_stimulation = NULL, 
label_annotation = NULL, 
breaks = NA, border_color = "grey60", 
cellwidth = NA, 
cellheight = NA, 
scale = "none", cluster_rows = TRUE, cluster_cols = TRUE, 
clustering_distance_rows = "euclidean", 
clustering_distance_cols = "euclidean", 
clustering_method = "complete", treeheight_row = ifelse(cluster_rows, 50, 0), 
treeheight_col = ifelse(cluster_cols, 50, 0), 
legend = TRUE, annotation = NA, annotation_colors = NA, annotation_legend = TRUE, 
show_rownames = TRUE, show_colnames = TRUE, fontsize = 10, 
fontsize_row = fontsize, fontsize_col = fontsize, filename = NA, 
width = NA, height = NA, order_sample = NULL, tab.prob.gene = NULL, ...)
## S3 method for class 'splsda2fact':
pheatmap.multilevel(
result, cluster = NULL, 
color = colorRampPalette(rev(c("#D73027", "#FC8D59", "#FEE090", 
"#FFFFBF", "#E0F3F8", "#91BFDB", "#4575B4")))(100), 
col_sample = NULL, col_stimulation = NULL, col_time = NULL, 
label_color_stimulation = NULL, label_color_time = NULL, 
label_annotation = NULL, breaks = NA, border_color = "grey60", 
cellwidth = NA, cellheight = NA, scale = "none", cluster_rows = TRUE, 
cluster_cols = TRUE, clustering_distance_rows = "euclidean", 
clustering_distance_cols = "euclidean", clustering_method = "complete", 
treeheight_row = ifelse(cluster_rows, 50, 0), 
treeheight_col = ifelse(cluster_cols, 50, 0),
legend = TRUE, annotation = NA, annotation_colors = NA, 
annotation_legend = TRUE, show_rownames = TRUE, show_colnames = TRUE, 
fontsize = 10, fontsize_row = fontsize, fontsize_col = fontsize, 
filename = NA, width = NA, height = NA, order_sample = NULL, 
tab.prob.gene = NULL, ...)

Arguments

result

a result from function multilevel with method = 'splsda' (can only apply to a splsda analysis type).

cluster

a vector indicating a specific ordering (clustering) of the variables. Set to NULL by default.

col_sample

vector of colors indicating the color of each individual (the length of that vector should be equal to the number of unique individuals)

col_stimulation

vector of colors indicating the color of each condition (the length of that vector should be equal to the number of conditions), i.e. the condition indicated in the 2nd column of the design matrix in the multilevel function.

col_time

for a two-factor analysis, vector of colors indicating the color of the factors of the second level (the length of that vector should be equal to the number of conditions), i.e. the condition indicated in the 3rd column of the design matrix in the multilevel function.
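
The length requirements for col_sample, col_stimulation and col_time can be checked against the design before plotting. The following is a minimal sketch in base R, assuming a toy design data frame (hypothetical, not the vac18 data) with sample, stimul and time columns:

```r
# Toy design (hypothetical): 3 subjects x 2 stimulations x 2 time points
design <- expand.grid(sample = 1:3,
                      stimul = c("LIPO5", "NS"),
                      time   = c("t1", "t2"))

col.sample <- c("lightgreen", "red", "lightblue")  # one color per subject
col.stimu  <- c("darkblue", "green4")              # one color per stimulation
col.time   <- c("pink", "lightblue1")              # one color per time point

# Each color vector should have one entry per unique level of its column
stopifnot(length(col.sample) == length(unique(design$sample)),
          length(col.stimu)  == length(unique(design$stimul)),
          length(col.time)   == length(unique(design$time)))
```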

label_color_stimulation

character vector indicating the label of the first factor, see details

label_color_time

character vector indicating the label of the second factor, see details

label_annotation

set to NULL by default.

order_sample

vector indicating the reordering of the samples, set to NULL by default.

color

vector of colors used in heatmap.

breaks

a sequence of numbers that covers the range of values in mat and is one element longer than the color vector. Used for mapping values to colors. Useful if certain values need to be mapped to certain colors. If the value is NA then the breaks are calculated automatically.
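
As an illustration of the "one element longer" rule, the sketch below builds a break vector for the default palette shown in Usage. The limits of -2 and 2 are an arbitrary assumption chosen for the example, not a recommendation from the package.

```r
# Same default palette as in the Usage section: 100 colors
pal <- colorRampPalette(rev(c("#D73027", "#FC8D59", "#FEE090",
                              "#FFFFBF", "#E0F3F8", "#91BFDB", "#4575B4")))(100)

# Evenly spaced breaks over [-2, 2] (arbitrary limits for this sketch);
# breaks must have exactly one more element than the color vector
brk <- seq(-2, 2, length.out = length(pal) + 1)

stopifnot(length(brk) == length(pal) + 1)
```

The two vectors would then be supplied together as color = pal and breaks = brk.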

border_color

color of cell borders on heatmap, use NA if no border should be drawn.

cellwidth

individual cell width in points. If left as NA, then the values depend on the size of plotting window.

cellheight

individual cell height in points. If left as NA, then the values depend on the size of plotting window.

scale

character indicating if the values should be centered and scaled in either the row direction or the column direction, or none. Corresponding values are "row", "column" and "none"
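
To make the effect of scale = "row" concrete, the following base R sketch centers and scales each row of a small random matrix. It assumes the same convention as the upstream pheatmap function (subtract the row mean, divide by the row standard deviation). The matrix and its dimnames are hypothetical.

```r
set.seed(1)
mat <- matrix(rnorm(20, mean = 5, sd = 2), nrow = 4,
              dimnames = list(paste0("gene", 1:4), paste0("s", 1:5)))

# Row scaling: center and scale each row across its columns.
# scale() works column-wise, hence the double transpose.
mat.row <- t(scale(t(mat)))

# After scaling, every row has mean 0 and standard deviation 1
stopifnot(all(abs(rowMeans(mat.row)) < 1e-8),
          all(abs(apply(mat.row, 1, sd) - 1) < 1e-8))
```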

cluster_rows

boolean value determining if rows should be clustered.

cluster_cols

boolean value determining if columns should be clustered.

clustering_distance_rows

distance measure used in clustering rows. Possible values are "correlation" for Pearson correlation and all the distances supported by dist, such as "euclidean", etc. If the value is none of the above it is assumed that a distance matrix is provided.

clustering_distance_cols

distance measure used in clustering columns. Possible values the same as for clustering_distance_rows.

clustering_method

clustering method used. Accepts the same values as hclust.
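
The distance and method arguments correspond to the usual dist and hclust workflow in base R. The sketch below shows that workflow on a hypothetical random matrix. The form of the "correlation" distance (one minus the Pearson correlation) follows the upstream pheatmap function and is assumed to apply here as well.

```r
set.seed(1)
mat <- matrix(rnorm(30), nrow = 6,
              dimnames = list(paste0("gene", 1:6), NULL))

# "euclidean": any distance supported by dist()
d.euc <- dist(mat, method = "euclidean")

# "correlation": one minus the Pearson correlation between rows
d.cor <- as.dist(1 - cor(t(mat)))

# clustering_method takes the same values as the hclust() method argument
hc <- hclust(d.euc, method = "complete")
hc$order  # row ordering implied by the dendrogram
```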

treeheight_row

the height of a tree for rows, if these are clustered. Default value 50 points.

treeheight_col

the height of a tree for columns, if these are clustered. Default value 50 points.

legend

boolean value that determines if legend should be drawn or not.

annotation

data frame that specifies the annotations shown on top of the columns. Each row defines the features for a specific column. The columns in the data and rows in the annotation are matched using corresponding row and column names. Note that color schemes take into account whether a variable is continuous or discrete.

annotation_colors

list for specifying annotation track colors manually. It is possible to define the colors for only some of the features. Check examples for details.
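
For reference, the structure expected for annotation and annotation_colors in the upstream pheatmap function can be sketched as below. The sample names and the Stimulus variable are hypothetical. This page does not state how user-supplied annotations interact with col_sample, col_stimulation and col_time, so treat this only as an illustration of the data structures.

```r
# Hypothetical column names of the plotted matrix
samples <- paste0("s", 1:4)

# One row per heatmap column, matched to the data by row name
annotation <- data.frame(Stimulus = factor(c("LIPO5", "NS", "LIPO5", "NS")),
                         row.names = samples)

# Named list with one entry per annotation variable; for a discrete
# variable, the color vector is named by the factor levels
annotation_colors <- list(Stimulus = c(LIPO5 = "darkblue", NS = "red3"))

stopifnot(all(levels(annotation$Stimulus) %in%
              names(annotation_colors$Stimulus)))
```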

annotation_legend

boolean value showing if the legend for annotation tracks should be drawn.

show_rownames

boolean specifying if row names are shown.

show_colnames

boolean specifying if column names are shown.

fontsize

base fontsize for the plot

fontsize_row

fontsize for rownames (Default: fontsize)

fontsize_col

fontsize for colnames (Default: fontsize)

filename

file path where to save the picture. The file type is decided by the extension in the path. The following formats are currently supported: png, pdf, tiff, bmp, jpeg. Even if the plot does not fit into the plotting window, the file size is calculated so that the plot would fit there, unless specified otherwise.

width

manual option for determining the output file width in inches.

height

manual option for determining the output file height in inches.

tab.prob.gene

A character vector indicating the name of all genes.

...

graphical parameters for the text used in plot. Parameters passed to grid.text, see gpar.

Details

This function has been adapted from the pheatmap function of the pheatmap package. See help(pheatmap) for more details about the arguments of the function.

In the multilevel function, the factors indicated in the vector or the data frame cond must match the arguments label_color_stimulation and label_color_time, see example below.

References

On multilevel analysis: Liquet, B., Le Cao, K.-A., Hocini, H. and Thiebaut, R. (2012) A novel approach for biomarker selection and the integration of repeated measures experiments from two platforms. BMC Bioinformatics 13:325.

Westerhuis, J. A., van Velzen, E. J., Hoefsloot, H. C., and Smilde, A. K. (2010). Multivariate paired data analysis: multilevel PLSDA versus OPLSDA. Metabolomics, 6(1), 119-128.

Examples

## First example: one-factor analysis with sPLS-DA
# -------------------
data(vac18)
X <- vac18$genes
Y <- vac18$stimulation

design <- data.frame(sample = vac18$sample, 
                     stimul = vac18$stimulation)
vac18.splsda.multilevel <- multilevel(X, ncomp = 3, design = design,
                                         method = "splsda", keepX = c(30, 137, 123))


# set up colors for pheatmap
col.samp <- c("lightgreen", "red", "lightblue", "darkorange",
              "purple", "maroon", "blue", "chocolate", "turquoise",
              "tomato1", "pink2", "aquamarine")
col.stimu <- c("darkblue", "purple", "green4","red3")
col.stimu <- col.stimu[as.numeric(Y)]
col.stimu <- unique(col.stimu)

pheatmap.multilevel(vac18.splsda.multilevel, 
                    # colors:
                    col_sample = col.samp, 
                    col_stimulation = col.stimu,
                    #labels:
                    label_annotation = c("Subject", "Stimulus"),
                    # scaling:
                    scale = 'row',
                    # distances and clustering
                    clustering_distance_rows = "euclidean", 
                    clustering_distance_cols = "euclidean", 
                    clustering_method = "complete",
                    #  show col/row names and font
                    show_colnames = FALSE,
                    show_rownames = FALSE, 
                    fontsize = 8, 
                    fontsize_row = 3,
                    fontsize_col = 2,
                    border_color = NA,  # NA: no cell borders drawn
                    width = 10)

## Second example: two-factor analysis with sPLS-DA
# --------------------
data(vac18.simulated) # on the simulated data this time

X <- vac18.simulated$genes
design <- data.frame(sample = vac18.simulated$sample,
                     stimul = vac18.simulated$stimulation,
                     time = vac18.simulated$time)

vac18.splsda2.multilevel <- multilevel(X, ncomp = 2, design = design,
                            keepX = c(200, 200), method = 'splsda')

# set up colors for each level of pheatmap 
col.sample <- c("lightgreen", "red","lightblue","darkorange","purple","maroon") # 6 samples
col.time <- c("pink","lightblue1") # two time points
col.stimu <- c('green', 'black', 'red', 'blue') # 4 stimulations
# set up labels for the 2 levels in design matrix
label.stimu <- unique(design[, 2])
label.time <- unique(design$time)

pheatmap.multilevel(vac18.splsda2.multilevel,
                                # colors:
                                col_sample=col.sample, 
                                col_stimulation=col.stimu, 
                                col_time=col.time,
                                #labels for each level
                                label_color_stimulation=label.stimu,
                                label_color_time=label.time, 
                                #clustering method
                                clustering_method="ward",
                                #show col/row names and font size
                                show_colnames = FALSE,
                                show_rownames = TRUE,
                                fontsize_row=2)