Finds markers (differentially expressed genes) for identity classes
FindMarkers(object, ident.1, ident.2 = NULL, genes.use = NULL,
logfc.threshold = 0.25, test.use = "wilcox", min.pct = 0.1,
min.diff.pct = -Inf, print.bar = TRUE, only.pos = FALSE,
max.cells.per.ident = Inf, random.seed = 1, latent.vars = "nUMI",
min.cells = 3, pseudocount.use = 1, assay.type = "RNA", ...)
object: Seurat object
ident.1: Identity class to define markers for
ident.2: A second identity class for comparison. If NULL (default), use all other cells for comparison.
genes.use: Genes to test. Default is to use all genes.
logfc.threshold: Limit testing to genes which show, on average, at least X-fold difference (log-scale) between the two groups of cells. Default is 0.25. Increasing logfc.threshold speeds up the function, but can miss weaker signals.
test.use: Denotes which test to use. Available options are:
  "wilcox" : Wilcoxon rank sum test (default)
  "bimod" : Likelihood-ratio test for single cell gene expression (McDavid et al., Bioinformatics, 2013)
  "roc" : Standard AUC classifier
  "t" : Student's t-test
  "tobit" : Tobit-test for differential gene expression (Trapnell et al., Nature Biotech, 2014)
  "poisson" : Likelihood ratio test assuming an underlying Poisson distribution. Use only for UMI-based datasets
  "negbinom" : Likelihood ratio test assuming an underlying negative binomial distribution. Use only for UMI-based datasets
  "MAST" : GLM-framework that treats cellular detection rate as a covariate (Finak et al., Genome Biology, 2015)
  "DESeq2" : DE based on a model using the negative binomial distribution (Love et al., Genome Biology, 2014)
min.pct: Only test genes that are detected in a minimum fraction of min.pct cells in either of the two populations. Meant to speed up the function by not testing genes that are very infrequently expressed. Default is 0.1.
min.diff.pct: Only test genes that show a minimum difference in the fraction of detection between the two groups. Set to -Inf by default.
print.bar: Print a progress bar once expression testing begins (uses pbapply to do this)
only.pos: Only return positive markers (FALSE by default)
max.cells.per.ident: Downsample each identity class to a maximum number of cells. Default is Inf (no downsampling).
random.seed: Random seed for downsampling
latent.vars: Variables to test
min.cells: Minimum number of cells expressing the gene in at least one of the two groups
pseudocount.use: Pseudocount to add to averaged expression values when calculating logFC. 1 by default.
assay.type: Type of assay to fetch data for (default is RNA)
...: Additional parameters to pass to specific DE functions
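The interplay of min.pct, min.diff.pct, logfc.threshold and pseudocount.use can be easier to follow on a toy matrix. The base R sketch below is only an illustration under stated assumptions, not Seurat's internal code: the matrix `expr`, the grouping vector `group`, and the exact comparison rules (`>=`, absolute differences, and log fold change taken as the difference of log(mean + pseudocount)) are all hypothetical choices made for this example.

```r
# Toy illustration of the pre-filtering arguments; NOT Seurat's internal code.
# Hypothetical expression matrix: 4 genes x 40 cells, two groups of 20 cells.
expr <- matrix(0, nrow = 4, ncol = 40,
               dimnames = list(paste0("gene", 1:4), paste0("cell", 1:40)))
group <- rep(c("a", "b"), each = 20)   # stand-ins for ident.1 / ident.2

expr["gene1", 1:20]  <- 2     # expressed in every cell of group a only
expr["gene2", ]      <- 1     # expressed everywhere, no difference
expr["gene3", 1]     <- 3     # detected in a single cell
expr["gene4", 21:32] <- 1.5   # detected in 12 of 20 cells of group b

# Argument values mirroring the documented defaults
min.pct         <- 0.1
min.diff.pct    <- -Inf
logfc.threshold <- 0.25
pseudocount.use <- 1

# Fraction of cells in each group in which a gene is detected
pct.1 <- rowMeans(expr[, group == "a"] > 0)
pct.2 <- rowMeans(expr[, group == "b"] > 0)

# Assumed rule: detected in at least min.pct of cells in either group
pass.pct <- pmax(pct.1, pct.2) >= min.pct
# Assumed rule: detection fractions differ by at least min.diff.pct
pass.diff <- abs(pct.1 - pct.2) >= min.diff.pct

# Assumed log fold change: pseudocount added to the averaged expression
mean.1 <- rowMeans(expr[, group == "a"])
mean.2 <- rowMeans(expr[, group == "b"])
logfc  <- log(mean.1 + pseudocount.use) - log(mean.2 + pseudocount.use)
pass.fc <- abs(logfc) >= logfc.threshold

# Genes that would go on to be tested in this sketch
tested <- rownames(expr)[pass.pct & pass.diff & pass.fc]
print(tested)   # "gene1" "gene4"
```

In this sketch gene2 is dropped by the fold-change filter and gene3 by the detection filter, so only two of the four genes would reach the statistical test.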
Matrix containing a ranked list of putative markers, and associated statistics (p-values, ROC score, etc.)
p-value adjustment is performed using Bonferroni correction based on the total number of genes in the dataset. Other correction methods are not recommended, as Seurat pre-filters genes using the arguments above, reducing the number of tests performed. Lastly, as Aaron Lun has pointed out, p-values should be interpreted cautiously, as the genes used for clustering are the same genes tested for differential expression.
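To see why the correction is based on the total number of genes rather than only the genes that survive pre-filtering, the following base R sketch compares the two with `p.adjust`. The p-values and the gene count of 1000 are made-up numbers for illustration, and this is not a claim about how FindMarkers computes the adjustment internally.

```r
# Hypothetical raw p-values for three genes that passed pre-filtering
p.raw <- c(geneA = 1e-6, geneB = 0.0004, geneC = 0.02)

# Assumed total number of genes in the dataset (illustrative value)
n.genes.total <- 1000

# Bonferroni correction over all genes in the dataset
p.adj.all <- p.adjust(p.raw, method = "bonferroni", n = n.genes.total)

# Bonferroni correction over the tested genes only (less conservative)
p.adj.tested <- p.adjust(p.raw, method = "bonferroni")

print(rbind(all.genes = p.adj.all, tested.only = p.adj.tested))
```

Correcting over the tested genes only would understate the multiple-testing burden, because the pre-filters have already selected genes that look different between the groups.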
See MASTDETest and DESeq2DETest for more information on these methods.
# NOT RUN {
markers <- FindMarkers(object = pbmc_small, ident.1 = 3)
head(markers)
# }