Identifies features that are outliers on a 'mean variability plot'.
FindVariableFeatures(object, ...)
# S3 method for default
FindVariableFeatures(object, selection.method = "vst",
loess.span = 0.3, clip.max = "auto", mean.function = FastExpMean,
dispersion.function = FastLogVMR, num.bin = 20,
binning.method = "equal_width", verbose = TRUE, ...)
# S3 method for Assay
FindVariableFeatures(object, selection.method = "vst",
loess.span = 0.3, clip.max = "auto", mean.function = FastExpMean,
dispersion.function = FastLogVMR, num.bin = 20,
binning.method = "equal_width", nfeatures = 2000,
mean.cutoff = c(0.1, 8), dispersion.cutoff = c(1, Inf),
verbose = TRUE, ...)
# S3 method for Seurat
FindVariableFeatures(object, assay = NULL,
selection.method = "vst", loess.span = 0.3, clip.max = "auto",
mean.function = FastExpMean, dispersion.function = FastLogVMR,
num.bin = 20, binning.method = "equal_width", nfeatures = 2000,
mean.cutoff = c(0.1, 8), dispersion.cutoff = c(1, Inf),
verbose = TRUE, ...)
object: An object
...: Arguments passed to other methods
selection.method: How to choose top variable features. Choose one of:
  vst: First, fits a line to the relationship of log(variance) and log(mean) using local polynomial regression (loess). Then standardizes the feature values using the observed mean and expected variance (given by the fitted line). Feature variance is then calculated on the standardized values after clipping to a maximum (see the clip.max parameter).
  mean.var.plot (mvp): First, uses a function to calculate average expression (mean.function) and dispersion (dispersion.function) for each feature. Next, divides features into num.bin (default 20) bins based on their average expression, and calculates z-scores for dispersion within each bin. The purpose of this is to identify variable features while controlling for the strong relationship between variability and average expression.
  dispersion (disp): Selects the features with the highest dispersion values.
loess.span: (vst method) Loess span parameter used when fitting the variance-mean relationship
clip.max: (vst method) After standardization, values larger than clip.max are set to clip.max; the default, 'auto', sets this value to the square root of the number of cells
mean.function: Function to compute x-axis value (average expression). Default is to take the mean of the detected (i.e. non-zero) values
dispersion.function: Function to compute y-axis value (dispersion). The default, FastLogVMR, takes the log of the variance-to-mean ratio
num.bin: Total number of bins to use in the scaled analysis (default is 20)
binning.method: Specifies how the bins should be computed. Available methods are:
  equal_width: each bin is of equal width along the x-axis [default]
  equal_frequency: each bin contains an equal number of features (can increase statistical power to detect overdispersed features at high expression values, at the cost of reduced resolution along the x-axis)
verbose: Show progress bar for calculations
nfeatures: Number of features to select as top variable features; only used when selection.method is set to 'dispersion' or 'vst'
mean.cutoff: A two-length numeric vector with low- and high-cutoffs for feature means
dispersion.cutoff: A two-length numeric vector with low- and high-cutoffs for feature dispersions
assay: Assay to use
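The vst option of selection.method described above can be illustrated with a short base-R sketch. This is not Seurat's implementation: vst_sketch, its internal variable names, and the simulated count matrix are assumptions made for illustration, and the sketch simply skips constant features rather than reproducing any special handling the package may apply.

```r
# Hypothetical helper: a base-R sketch of vst-style scoring, not Seurat's code.
# `counts` is assumed to be a features x cells matrix of raw counts.
vst_sketch <- function(counts, loess.span = 0.3, clip.max = "auto") {
  n.cells <- ncol(counts)
  if (identical(clip.max, "auto")) {
    clip.max <- sqrt(n.cells)
  }
  feature.mean <- rowMeans(counts)
  feature.var <- apply(counts, 1, var)

  # Constant features have zero variance and cannot be placed on a log scale.
  keep <- feature.var > 0
  log.mean <- log10(feature.mean[keep])
  log.var <- log10(feature.var[keep])

  # Fit log(variance) against log(mean) with local polynomial regression.
  fit <- loess(log.var ~ log.mean, span = loess.span)
  expected.sd <- sqrt(10^fit$fitted)

  # Standardize each feature with its observed mean and expected sd,
  # then clip large values to clip.max.
  standardized <- (counts[keep, , drop = FALSE] - feature.mean[keep]) / expected.sd
  standardized <- pmin(standardized, clip.max)

  # Variance of the standardized values about the observed mean.
  score <- numeric(nrow(counts))
  score[keep] <- rowSums(standardized^2) / (n.cells - 1)
  score
}

# Simulated data (an assumption): 190 Poisson features plus 10 overdispersed
# negative binomial features in rows 191 to 200.
set.seed(1)
n.cells <- 100
quiet <- t(sapply(seq(0.5, 20, length.out = 190), function(m) rpois(n.cells, m)))
noisy <- t(sapply(seq(5, 20, length.out = 10),
                  function(m) rnbinom(n.cells, size = 0.2, mu = m)))
counts <- rbind(quiet, noisy)

scores <- vst_sketch(counts)
top.features <- order(scores, decreasing = TRUE)[1:10]
```

Ranking by scores and keeping the first nfeatures entries mirrors how a "top variable features" list could be drawn from such a score; in this simulation the overdispersed rows should tend to rank near the top.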
For the mean.var.plot method: Appropriate parameter settings vary from dataset to dataset and are best chosen by visual inspection of the plot. Setting the lower bound of dispersion.cutoff to 2 identifies features whose dispersion is more than two standard deviations above the average dispersion within their bin. The default x-axis function is the mean expression level, and the default y-axis function is log(variance/mean). Mean and variance calculations are performed in non-log space, but the results are reported in log-space; see the relevant functions for exact details.
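The binning and within-bin z-scoring used by the mean.var.plot method can be sketched in base R as follows. The helper mvp_sketch, its column names, the treatment of undefined values (set to 0), and the simulated input are all assumptions for illustration, not the package's code; only equal-width binning is shown.

```r
# Hypothetical helper: a base-R sketch of the binning step, not Seurat's code.
# `data` is assumed to be a features x cells matrix of log1p-transformed values.
mvp_sketch <- function(data, num.bin = 20,
                       mean.cutoff = c(0.1, 8), dispersion.cutoff = c(1, Inf)) {
  # Work in non-log space, then report results in log-space.
  linear <- expm1(data)
  linear.mean <- rowMeans(linear)
  feature.mean <- log1p(linear.mean)
  feature.disp <- log(apply(linear, 1, var) / linear.mean)
  feature.disp[!is.finite(feature.disp)] <- 0  # sketch choice for undefined ratios

  # Equal-width bins along the average-expression axis.
  bin.id <- as.integer(cut(feature.mean, breaks = num.bin))
  key <- as.character(bin.id)
  bin.mean <- tapply(feature.disp, bin.id, mean)
  bin.sd <- tapply(feature.disp, bin.id, sd)

  # z-score each feature's dispersion within its own bin.
  scaled <- as.numeric((feature.disp - bin.mean[key]) / bin.sd[key])
  scaled[!is.finite(scaled)] <- 0  # bins with one feature or zero spread

  selected <- feature.mean > mean.cutoff[1] & feature.mean < mean.cutoff[2] &
    scaled > dispersion.cutoff[1] & scaled < dispersion.cutoff[2]
  data.frame(mean = feature.mean, dispersion = feature.disp,
             dispersion.scaled = scaled, bin = bin.id, selected = selected)
}

# Simulated data (an assumption): 190 Poisson features plus 10 overdispersed
# negative binomial features in rows 191 to 200.
set.seed(1)
n.cells <- 100
quiet <- t(sapply(seq(0.5, 20, length.out = 190), function(m) rpois(n.cells, m)))
noisy <- t(sapply(seq(5, 20, length.out = 10),
                  function(m) rnbinom(n.cells, size = 0.2, mu = m)))
result <- mvp_sketch(log1p(rbind(quiet, noisy)))
```

Because each z-score is computed relative to features of similar average expression, a feature is flagged only when it is unusually dispersed for its expression level, which is the control for the mean-variability relationship described above.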