Data sets that contain cells from different groups often
benefit from alignment to subtract differences between them. Alignment
can be used to remove batch effects, subtract the effects of treatments,
or even potentially compare across species.
align_cds performs the alignment and stores the adjusted coordinates in the cell_data_set.
This function can subtract both continuous and discrete batch
effects. For continuous effects, align_cds fits a linear model to the
cells' PCA or LSI coordinates and subtracts the fitted effects using limma. For discrete
effects, you must provide a grouping of the cells, and these groups are then
aligned using batchelor, which implements the "mutual nearest neighbor" algorithm described in:
Haghverdi L, Lun ATL, Morgan MD, Marioni JC (2018). "Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors." Nat. Biotechnol., 36(5), 421-427. doi: 10.1038/nbt.4091
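The continuous case described above can be illustrated with a minimal base R sketch. It uses lm rather than limma, and it is not the code align_cds runs internally; the toy coordinates and the covariate name depth are assumptions made purely for illustration. The idea shown is fitting a linear model to each low-dimensional component and subtracting the fitted covariate effect while keeping the intercept.

```r
# Toy stand-in for PCA coordinates: 100 cells x 2 components.
set.seed(1)
n_cells <- 100
depth <- rnorm(n_cells)  # hypothetical continuous covariate (e.g. sequencing depth)
coords <- cbind(
  PC1 = 2 * depth + rnorm(n_cells, sd = 0.1),  # strongly driven by the covariate
  PC2 = rnorm(n_cells)                         # unrelated to the covariate
)

# lm() accepts a matrix response and fits one linear model per column.
fit <- lm(coords ~ depth)

# Subtract only the fitted covariate effect; the intercept is left in place.
adjusted <- coords - outer(depth, coef(fit)["depth", ])

# After adjustment, PC1 is no longer linearly associated with the covariate.
cor_before <- cor(coords[, "PC1"], depth)
cor_after <- cor(adjusted[, "PC1"], depth)
```

Discrete effects handled through alignment_group are not linear shifts of this kind, which is why a mutual nearest neighbor method is used for them instead.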
align_cds(
  cds,
  preprocess_method = c("PCA", "LSI"),
  alignment_group = NULL,
  alignment_k = 20,
  residual_model_formula_str = NULL,
  verbose = FALSE,
  ...
)
cds: The cell_data_set upon which to perform this operation.
preprocess_method: A string specifying the low-dimensional space in which to perform alignment, currently either "PCA" or "LSI". Default is "PCA".
alignment_group: A string specifying a column of colData to use for aligning groups of cells. The column specified must be a factor. Alignment can be used to subtract batch effects in a non-linear way. For correcting continuous effects, use residual_model_formula_str. Default is NULL.
alignment_k: The value of k used in mutual nearest neighbor alignment. Default is 20.
residual_model_formula_str: NULL or a string model formula specifying any effects to subtract from the data before dimensionality reduction. Uses a linear model to subtract effects. For non-linear effects, use alignment_group. Default is NULL.
verbose: Whether to emit verbose output during dimensionality reduction. Default is FALSE.
...: Additional arguments to pass to limma::lmFit if residual_model_formula_str is not NULL.
An updated cell_data_set object.
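A hedged usage sketch follows. The helper align_example is hypothetical, as are its default column names "batch" and "n_umi"; both columns are assumed to exist in colData of the supplied cell_data_set, with the batch column stored as a factor. The sketch assumes the PCA coordinates are computed with preprocess_cds before alignment.

```r
# Hypothetical helper, not part of monocle3. `batch_col` and
# `covariate_formula` refer to colData columns assumed to exist.
align_example <- function(cds,
                          batch_col = "batch",
                          covariate_formula = "~ n_umi") {
  # Alignment operates on an existing low-dimensional space, so compute
  # the PCA coordinates first.
  cds <- monocle3::preprocess_cds(cds, method = "PCA")

  # Discrete groups are passed via alignment_group; continuous effects
  # are passed via the model formula string.
  monocle3::align_cds(
    cds,
    preprocess_method = "PCA",
    alignment_group = batch_col,
    residual_model_formula_str = covariate_formula
  )
}
```

Either alignment_group or residual_model_formula_str can be left as NULL when only one kind of effect needs to be subtracted.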