Identifies genes that are outliers on a 'mean variability plot'. First, calculates the average expression (mean.function) and dispersion (dispersion.function) of each gene. Next, divides genes into num.bin (default 20) bins based on their average expression and calculates z-scores for dispersion within each bin. This identifies variable genes while controlling for the strong relationship between variability and average expression.
FindVariableGenes(object, mean.function = ExpMean,
dispersion.function = LogVMR, do.plot = TRUE, set.var.genes = TRUE,
x.low.cutoff = 0.1, x.high.cutoff = 8, y.cutoff = 1,
y.high.cutoff = Inf, num.bin = 20, do.recalc = TRUE,
sort.results = TRUE, do.cpp = TRUE, display.progress = TRUE, ...)
object: Seurat object
mean.function: Function to compute x-axis value (average expression). Default is to take the mean of the detected (i.e. non-zero) values
dispersion.function: Function to compute y-axis value (dispersion). Default is LogVMR, the log of the variance-to-mean ratio
do.plot: Plot the average/dispersion relationship
set.var.genes: Set object@var.genes to the identified variable genes (default is TRUE)
x.low.cutoff: Bottom cutoff on x-axis for identifying variable genes
x.high.cutoff: Top cutoff on x-axis for identifying variable genes
y.cutoff: Bottom cutoff on y-axis for identifying variable genes
y.high.cutoff: Top cutoff on y-axis for identifying variable genes
num.bin: Total number of bins to use in the scaled analysis (default is 20)
do.recalc: TRUE by default. If FALSE, plots and selects variable genes without recalculating statistics for each gene.
sort.results: If TRUE (the default), sorts results in object@hvg.info in decreasing order of dispersion
do.cpp: Run the C++ version of mean.function and dispersion.function if they exist.
display.progress: Show progress bar for calculations
...: Extra parameters to VariableGenePlot
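The four cutoff arguments together define a rectangular window on the mean variability plot. The following is a minimal base-R sketch of that selection, not the package's own code: the data frame `stats`, its column names, and its values are made up for illustration, and the use of strict inequalities on all four bounds is an assumption.

```r
# Toy per-gene statistics (hypothetical values, for illustration only).
stats <- data.frame(
  gene = c("g1", "g2", "g3", "g4", "g5"),
  mean = c(0.05, 0.5, 2.0, 9.0, 3.0),        # x-axis: average expression
  disp.scaled = c(2.5, 1.8, 0.4, 3.0, 1.2),  # y-axis: dispersion z-score
  stringsAsFactors = FALSE
)

# Default cutoffs, taken from the usage signature.
x.low.cutoff <- 0.1
x.high.cutoff <- 8
y.cutoff <- 1
y.high.cutoff <- Inf

# Assumption: a gene is kept only if it lies strictly inside all four bounds.
keep <- stats$mean > x.low.cutoff & stats$mean < x.high.cutoff &
  stats$disp.scaled > y.cutoff & stats$disp.scaled < y.high.cutoff

var.genes <- stats$gene[keep]
print(var.genes)
```

In this toy table, g1 and g4 fall outside the x-axis window and g3 falls below y.cutoff, so only g2 and g5 are kept.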
Returns a Seurat object, placing variable genes in object@var.genes. The results of the analysis are stored in object@hvg.info
Appropriate parameter settings vary from dataset to dataset and should be chosen empirically, based on visual inspection of the plot. Setting the y.cutoff parameter to 2 identifies genes whose dispersion is more than two standard deviations above the average dispersion within their bin. The default x-axis function is the mean expression level, and the default y-axis function is log(variance/mean). All mean and variance calculations are performed in non-log space, but the results are reported in log-space; see the relevant functions for exact details.
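The binning and z-scoring described here can be sketched in base R as follows. This is an illustration under stated assumptions, not the package implementation: the simulated matrix, the helper names `exp.mean` and `log.vmr`, the log1p normalisation, the equal-width bins produced by `cut`, and the choice to set undefined z-scores to 0 are all assumptions of the sketch.

```r
set.seed(42)
n.genes <- 200
n.cells <- 50

# Simulated counts (hypothetical data): row i is drawn with its own Poisson rate.
rates <- runif(n.genes, min = 1, max = 20)
counts <- matrix(rpois(n.genes * n.cells, lambda = rep(rates, n.cells)),
                 nrow = n.genes)
rownames(counts) <- paste0("gene", seq_len(n.genes))
log.expr <- log1p(counts)  # assumed log-space input

# Hypothetical stand-ins for the default functions: compute in non-log
# space, report in log space.
exp.mean <- function(x) log1p(mean(expm1(x)))
log.vmr <- function(x) log(var(expm1(x)) / mean(expm1(x)))

gene.mean <- apply(log.expr, 1, exp.mean)
gene.disp <- apply(log.expr, 1, log.vmr)

# Divide genes into num.bin equal-width bins of average expression.
num.bin <- 20
bins <- cut(gene.mean, breaks = num.bin)
bin.id <- as.integer(bins)

# Mean and standard deviation of dispersion within each bin.
bin.mean <- as.numeric(tapply(gene.disp, bins, mean))
bin.sd <- as.numeric(tapply(gene.disp, bins, sd))

# z-score of each gene's dispersion relative to its own bin.
gene.disp.scaled <- (gene.disp - bin.mean[bin.id]) / bin.sd[bin.id]

# Assumption: bins with a single gene have no defined sd, so score them 0.
gene.disp.scaled[is.na(gene.disp.scaled)] <- 0

# Genes more than two standard deviations above their bin average.
y.cutoff <- 2
outliers <- names(gene.disp.scaled)[gene.disp.scaled > y.cutoff]
print(head(sort(gene.disp.scaled, decreasing = TRUE)))
```

Because each gene is compared only with genes of similar average expression, a gene with a high raw dispersion is not flagged merely for sitting in an expression range where dispersion is high overall.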
VariableGenePlot
# NOT RUN {
pbmc_small <- FindVariableGenes(object = pbmc_small, do.plot = FALSE)
pbmc_small@var.genes
# }