This function calculates an adaptive inflection point ("knee") of the barcode distribution for each sample group. This is useful for determining a threshold for removing low-quality samples.
CalculateBarcodeInflections(object, barcode.column = "nCount_RNA",
group.column = "orig.ident", threshold.low = NULL,
threshold.high = NULL)
* `object` - Seurat object
* `barcode.column` - Column to use as proxy for barcodes ("nCount_RNA" by default)
* `group.column` - Column to group by ("orig.ident" by default)
* `threshold.low` - Ignore barcodes of rank below this threshold in inflection calculation
* `threshold.high` - Ignore barcodes of rank above this threshold in inflection calculation
Returns a Seurat object with a new list in the `tools` slot, named `CalculateBarcodeInflections`, with values:
* `barcode_distribution` - contains the full barcode distribution across the entire dataset
* `inflection_points` - the calculated inflection points within the thresholds
* `threshold_values` - the provided (or default) threshold values to search within for inflections
* `cells_pass` - the cells that pass the inflection point calculation
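As a rough illustration of reading these values back, the sketch below runs the function on `pbmc_small` and retrieves the stored list with `Tool()`. It assumes the Seurat package and its bundled `pbmc_small` dataset are installed, and it is skipped otherwise; the variable name `res` is only illustrative.

```r
# Assumption: Seurat (which provides pbmc_small and Tool()) is installed.
if (requireNamespace("Seurat", quietly = TRUE)) {
  library(Seurat)

  # Store the inflection results in the object's tools slot.
  pbmc_small <- CalculateBarcodeInflections(pbmc_small, group.column = "groups")

  # Retrieve the list saved under the function's name.
  res <- Tool(pbmc_small, slot = "CalculateBarcodeInflections")

  # List the stored elements and peek at the per-group inflection points.
  print(names(res))
  print(head(res$inflection_points))
}
```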
The function calculates the slope of the barcode number vs. rank distribution and then finds the point at which the distribution changes most steeply (the "knee"). This calculation often needs to be restricted to a range of ranks, so the `threshold.low` and `threshold.high` parameters limit the search based on the rank of the barcodes. [BarcodeInflectionsPlot()] is provided as a convenience function to visualize and test different thresholds and thus obtain more sensible end results.
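To make the idea of a steepest-slope "knee" concrete, here is a minimal base R sketch for a single group. It is not Seurat's implementation: the helper `find_knee()` is hypothetical, the counts are simulated, and it assumes that rank 1 is the barcode with the highest count and that the slope is taken on log10-transformed counts.

```r
# Hypothetical helper, not part of Seurat: find the rank at which the
# log10(count) versus rank curve drops most steeply.
find_knee <- function(counts, threshold.low = NULL, threshold.high = NULL) {
  # Assumption: rank 1 is the barcode with the highest count.
  sorted <- sort(counts, decreasing = TRUE)
  rank <- seq_along(sorted)

  # Slope between consecutive ranks; the first barcode has no predecessor.
  slope <- c(0, diff(log10(sorted + 1)) / diff(rank))

  # Restrict the search to the requested range of ranks.
  lo <- if (is.null(threshold.low)) 1 else threshold.low
  hi <- if (is.null(threshold.high)) length(sorted) else threshold.high
  keep <- rank >= lo & rank <= hi

  # The most negative slope marks the steepest drop.
  knee_rank <- rank[keep][which.min(slope[keep])]
  list(rank = knee_rank, count = sorted[knee_rank])
}

# Simulated counts: 50 cell-like barcodes and 200 background-like barcodes.
set.seed(1)
counts <- c(round(runif(50, 2000, 5000)), round(runif(200, 1, 20)))

# Unrestricted search over all ranks.
knee <- find_knee(counts)

# Search restricted to ranks 60 and above.
restricted <- find_knee(counts, threshold.low = 60)
```

In this simulated data the steepest drop sits at the boundary between the two populations, and setting `threshold.low` moves the search away from it, which is why visually checking the chosen thresholds is worthwhile.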
See [BarcodeInflectionsPlot()] to visualize the calculated inflection points and [SubsetByBarcodeInflections()] to subsequently subset the Seurat object.
# NOT RUN {
CalculateBarcodeInflections(pbmc_small, group.column = 'groups')
# }