limma (version 3.28.14)

plotMDS: Multidimensional scaling plot of distances between gene expression profiles

Description

Plot samples on a two-dimensional scatterplot so that distances on the plot approximate the typical log2 fold changes between the samples.

Usage

## Default S3 method:
plotMDS(x, top = 500, labels = NULL, pch = NULL, cex = 1, dim.plot = c(1,2), ndim = max(dim.plot), gene.selection = "pairwise", xlab = NULL, ylab = NULL, ...)
## S3 method for class 'MDS':
plotMDS(x, labels = NULL, pch = NULL, cex = 1, dim.plot = NULL, xlab = NULL, ylab = NULL, ...)

Arguments

x
any data object which can be coerced to a matrix, such as ExpressionSet or EList.
top
number of top genes used to calculate pairwise distances.
labels
character vector of sample names or labels. Defaults to colnames(x).
pch
plotting symbol or symbols. See points for possible values. Ignored if labels is non-NULL.
cex
numeric vector of plot symbol expansions.
dim.plot
integer vector of length two specifying which principal components should be plotted.
ndim
number of dimensions in which data is to be represented.
gene.selection
character, "pairwise" to choose the top genes separately for each pairwise comparison between the samples or "common" to select the same genes for all comparisons.
xlab
title for the x-axis.
ylab
title for the y-axis.
...
any other arguments are passed to plot, and also to text (if pch is NULL).

Value

A plot is created on the current graphics device. An object of class "MDS" is invisibly returned. This is a list containing the following components:
distance.matrix
numeric matrix of pairwise distances between columns of x
cmdscale.out
output from the function cmdscale given the distance matrix
dim.plot
dimensions plotted
x
x-coordinates of plotted points
y
y-coordinates of plotted points
gene.selection
gene selection method

Details

This function is a variation on the usual multidimensional scaling (or principal coordinate) plot, in that a distance measure particularly appropriate for the microarray context is used. The distance between each pair of samples (columns) is the root-mean-square deviation (Euclidean distance) for the top genes, where the number of genes is set by the top argument. Distances on the plot can be interpreted as leading log2-fold-change, meaning the typical (root-mean-square) log2-fold-change between the samples for the genes that distinguish those samples.
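
The distance described above can be sketched in base R. This is only an illustration of the idea under the "pairwise" setting, not limma's own code: the helper name leading_lfc_dist is hypothetical, and the simulated data mirror the Examples section. For each pair of samples it takes the top largest squared log2 differences and returns their root mean square.

```r
# Hypothetical helper (not part of limma): leading log2-fold-change
# distance, choosing the top genes separately for each pair of samples.
leading_lfc_dist <- function(x, top = 500) {
  x <- as.matrix(x)
  top <- min(top, nrow(x))
  n <- ncol(x)
  d <- matrix(0, n, n, dimnames = list(colnames(x), colnames(x)))
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      # squared log2 differences between samples i and j, one per gene
      sq <- (x[, i] - x[, j])^2
      # keep the 'top' largest and take their root mean square
      top_sq <- sort(sq, decreasing = TRUE)[seq_len(top)]
      d[i, j] <- d[j, i] <- sqrt(mean(top_sq))
    }
  }
  d
}

# Simulated data: 1000 genes, 6 samples, first 50 genes shifted in samples 4-6
set.seed(1)
x <- matrix(rnorm(1000 * 6, sd = 0.3), 1000, 6)
x[1:50, 4:6] <- x[1:50, 4:6] + 2

d <- leading_lfc_dist(x, top = 50)
# Classical multidimensional scaling of the distance matrix
coords <- cmdscale(as.dist(d), k = 2)
```

In this sketch, samples from different groups should end up farther apart than samples from the same group, since the shifted genes dominate the between-group comparisons.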

If gene.selection is "common", then the top genes are those with the largest standard deviations between samples. If gene.selection is "pairwise", then a different set of top genes is selected for each pair of samples. The pairwise feature selection may be appropriate for microarray data when different molecular pathways are relevant for distinguishing different pairs of samples.
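
For comparison, the "common" setting can be sketched as follows. Again this is an illustrative assumption rather than limma's implementation: common_dist is a hypothetical helper that ranks genes by their standard deviation across samples, keeps the same top genes for every comparison, and rescales the Euclidean distance to a root mean square.

```r
# Hypothetical helper (not part of limma): the same set of top genes,
# chosen by standard deviation across samples, is used for all pairs.
common_dist <- function(x, top = 500) {
  x <- as.matrix(x)
  top <- min(top, nrow(x))
  s <- apply(x, 1, sd)                        # per-gene standard deviation
  keep <- order(s, decreasing = TRUE)[seq_len(top)]
  # dist() returns sqrt(sum of squares); dividing by sqrt(top) gives the RMS
  as.matrix(dist(t(x[keep, , drop = FALSE]))) / sqrt(top)
}

set.seed(1)
x <- matrix(rnorm(1000 * 6, sd = 0.3), 1000, 6)
x[1:50, 4:6] <- x[1:50, 4:6] + 2

d_common <- common_dist(x, top = 50)
```

Because every pair of samples shares one gene set here, the resulting distances are ordinary Euclidean distances on a fixed subspace, whereas the pairwise setting may use a different gene set for each comparison.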

See text for possible values for col and cex.

References

Ritchie, ME, Phipson, B, Wu, D, Hu, Y, Law, CW, Shi, W, and Smyth, GK (2015). limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Research 43, e47. http://nar.oxfordjournals.org/content/43/7/e47

See Also

cmdscale

An overview of diagnostic functions available in LIMMA is given in 09.Diagnostics.

Examples

# Simulate gene expression data for 1000 probes and 6 microarrays.
# Samples are in two groups
# First 50 probes are differentially expressed in second group
sd <- 0.3*sqrt(4/rchisq(1000,df=4))
x <- matrix(rnorm(1000*6,sd=sd),1000,6)
rownames(x) <- paste("Gene",1:1000)
x[1:50,4:6] <- x[1:50,4:6] + 2
# Without labels, the sample indices are plotted.
mds <- plotMDS(x,  col=c(rep("black",3), rep("red",3)) )
# Alternatively, labels can be provided; here they are group indicators.
plotMDS(mds,  col=c(rep("black",3), rep("red",3)), labels= c(rep("Grp1",3), rep("Grp2",3)))
