Seurat (version 1.2.1)

mean.var.plot: Identify variable genes

Description

Identifies genes that are outliers on a 'mean variability plot'. First, applies one function to calculate the average expression (fxn.x) and another to calculate the dispersion (fxn.y) of each gene. Next, divides genes into num.bin (default 20) bins based on their average expression, and calculates z-scores for dispersion within each bin. The purpose is to identify variable genes while controlling for the strong relationship between variability and average expression.
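
The binning and within-bin z-scoring described above can be sketched in base R. This is an illustration only, not the Seurat source: the simulated values stand in for the outputs of fxn.x and fxn.y, and the use of equal-width bins (via cut) is an assumption.

```r
# Illustrative sketch only; simulated values, not real expression data.
set.seed(1)
n.genes <- 200
avg.expr <- runif(n.genes, min = 0, max = 10)      # stand-in for fxn.x output
dispersion <- rnorm(n.genes, mean = avg.expr / 5)  # stand-in for fxn.y output

num.bin <- 20

# Assign each gene to one of num.bin equal-width bins of average expression
# (equal-width binning is an assumption made for this sketch).
bin <- cut(avg.expr, breaks = num.bin)

# z-score the dispersion within each bin: subtract the bin mean and divide
# by the bin standard deviation. Bins holding a single gene yield NA.
disp.z <- ave(dispersion, bin, FUN = function(d) (d - mean(d)) / sd(d))

head(data.frame(avg.expr, bin, dispersion, disp.z))
```

Because each z-score is computed relative to genes of similar average expression, a gene is flagged only if it is unusually dispersed for its expression level.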

Usage

mean.var.plot(object, fxn.x = expMean, fxn.y = logVarDivMean, do.plot = TRUE, set.var.genes = TRUE, do.text = TRUE, x.low.cutoff = 4, x.high.cutoff = 8, y.cutoff = 2, y.high.cutoff = 12, cex.use = 0.5, cex.text.use = 0.5, do.spike = FALSE, pch.use = 16, col.use = "black", spike.col.use = "red", plot.both = FALSE, do.contour = TRUE, contour.lwd = 3, contour.col = "white", contour.lty = 2, num.bin = 20)

Arguments

object
Seurat object
fxn.x
Function to compute x-axis value (average expression). Default is to take the mean of the detected (i.e. non-zero) values
fxn.y
Function to compute y-axis value (dispersion). The default, logVarDivMean, reports log(variance/mean); see Details
do.plot
Plot the average/dispersion relationship
set.var.genes
Set object@var.genes to the identified variable genes (default is TRUE)
do.text
Add text names of variable genes to plot (default is TRUE)
x.low.cutoff
Bottom cutoff on x-axis for identifying variable genes
x.high.cutoff
Top cutoff on x-axis for identifying variable genes
y.cutoff
Bottom cutoff on y-axis for identifying variable genes
y.high.cutoff
Top cutoff on y-axis for identifying variable genes
cex.use
Point size
cex.text.use
Text size
do.spike
FALSE by default. If TRUE, draw all genes whose names match ^ERCC (i.e. start with ERCC) in a different color
pch.use
Pch value for points
col.use
Color to use
spike.col.use
If do.spike is TRUE, color to use for spike-in genes
plot.both
Plot both the scaled and non-scaled graphs.
do.contour
Draw contour lines calculated based on all genes
contour.lwd
Contour line width
contour.col
Contour line color
contour.lty
Contour line type
num.bin
Total number of bins to use in the scaled analysis (default is 20)
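
How the four cutoff arguments combine can be sketched as a simple filter. The data frame, its column names, and the gene names below are hypothetical, and the use of strict inequalities is an assumption rather than a statement about the Seurat implementation.

```r
# Hypothetical per-gene summaries (made-up values for illustration).
gene.stats <- data.frame(
  gene     = c("geneA", "geneB", "geneC", "geneD", "geneE"),
  avg.expr = c(2.0, 5.0, 6.5, 7.0, 9.0),   # x-axis value
  disp.z   = c(3.0, 2.5, 1.0, 13.0, 4.0),  # scaled dispersion (y-axis value)
  stringsAsFactors = FALSE
)

# Cutoffs set to the defaults listed under Usage.
x.low.cutoff  <- 4
x.high.cutoff <- 8
y.cutoff      <- 2
y.high.cutoff <- 12

# A gene is kept only if it lies inside the box defined by the four cutoffs
# (strict inequalities are an assumption of this sketch).
pass <- with(gene.stats,
             avg.expr > x.low.cutoff & avg.expr < x.high.cutoff &
               disp.z > y.cutoff & disp.z < y.high.cutoff)

var.genes <- gene.stats$gene[pass]
print(var.genes)
```

In this toy table only geneB passes: geneA and geneE fall outside the x-axis window, geneC is below y.cutoff, and geneD exceeds y.high.cutoff.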

Value

Returns a Seurat object, with the variable genes placed in object@var.genes. The results of the full analysis are stored in object@mean.var.

Details

Exact parameter settings vary empirically from dataset to dataset and should be chosen based on visual inspection of the plot. Setting the y.cutoff parameter to 2 identifies genes whose dispersion is more than two standard deviations above the average dispersion within their bin. The default x-axis function is the mean expression level, and the default y-axis function is log(variance/mean). Means and variances are calculated in non-log space, but the results are reported in log space; see the relevant functions for exact details.
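
The idea of calculating in non-log space and reporting in log space can be sketched as follows. The helper names are hypothetical, and the sketch assumes the input holds log(1 + x)-transformed expression values for a single gene, which the text above does not state; consult expMean and logVarDivMean for the actual definitions.

```r
# Hypothetical helpers, not the package's own functions.
# Assumption: x holds log(1 + value) expression for one gene across cells.

# Mean computed on the original scale, then reported on the log(1 + .) scale.
exp.mean.sketch <- function(x) {
  log(mean(expm1(x)) + 1)
}

# Variance-to-mean ratio computed on the original scale, then logged.
log.var.div.mean.sketch <- function(x) {
  original <- expm1(x)
  log(var(original) / mean(original))
}

x <- log1p(c(0, 2, 4, 10))   # toy values on the log(1 + .) scale

exp.mean.sketch(x)           # equals log(5), since mean(c(0, 2, 4, 10)) is 4
log.var.div.mean.sketch(x)
```

Averaging after undoing the log transform gives the arithmetic mean of the original values, which differs from the mean of the logged values whenever the values are not all equal.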