This metric quantifies how well aligned two or more datasets are. We randomly downsample all datasets to have as many cells as the smallest one. We then construct a nearest-neighbor graph and count, for each cell, how many of its neighbors come from the same dataset. We average this count across all cells, compare it to the value expected for perfectly mixed datasets, and rescale so that 0 corresponds to no mixing and 1 to perfect mixing. Note that in practice, alignment can occasionally be greater than 1.
calcAlignment(
  object,
  clustersUse = NULL,
  clusterVar = NULL,
  nNeighbors = NULL,
  cellIdx = NULL,
  cellComp = NULL,
  resultBy = c("all", "dataset", "cell"),
  seed = 1,
  k = nNeighbors,
  rand.seed = seed,
  cells.use = cellIdx,
  cells.comp = cellComp,
  clusters.use = clustersUse,
  by.cell = NULL,
  by.dataset = NULL
)
The alignment metric.
object: A liger object, with quantileNorm already run.
clustersUse: The clusters to consider for calculating the alignment. Should be a vector of existing levels in clusterVar. Default NULL. See Details.
clusterVar: The name of one variable in cellMeta(object). Default NULL uses default clusters.
nNeighbors: Number of neighbors to use in calculating alignment. Default NULL uses floor(0.01*ncol(object)), with a lower bound of 10 in all cases except where the total number of sampled cells is less than 10.
cellIdx, cellComp: Character, logical or numeric index that can subset cells. Default NULL. See Details.
resultBy: Select from "all", "dataset" or "cell". The level at which the mean alignment is calculated. Default "all".
seed: Random seed to allow reproducible results. Default 1.
k, rand.seed, cells.use, cells.comp, clusters.use: [Deprecated] Please see Usage for replacement.
by.cell, by.dataset: [Defunct] Use resultBy instead.
\(\bar{x}\) is the average, taken over all cells, of the number of neighbors that belong to the same dataset as the cell, \(N\) is the number of datasets, and \(k\) is the number of neighbors in the KNN graph. The alignment is then $$1 - \frac{\bar{x} - \frac{k}{N}}{k - \frac{k}{N}}$$
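The steps and formula above can be sketched in base R. This is only an illustration under stated assumptions, not the code rliger runs: alignmentSketch, emb and dataset are hypothetical names, the neighbor search is a brute-force Euclidean one, and capping k at one less than the number of sampled cells is an assumption about the small-sample case.

```r
# Illustrative re-implementation of the alignment score (NOT rliger's code).
# emb:     numeric matrix, one row per cell (e.g. an aligned embedding)
# dataset: vector of dataset labels, one per row of emb
alignmentSketch <- function(emb, dataset, k = NULL, seed = 1) {
  set.seed(seed)
  dataset <- as.character(dataset)
  # Downsample every dataset to the size of the smallest one
  nMin <- min(table(dataset))
  keep <- unlist(lapply(split(seq_along(dataset), dataset), function(i) {
    if (length(i) > nMin) sample(i, nMin) else i
  }))
  emb <- emb[keep, , drop = FALSE]
  dataset <- dataset[keep]
  n <- nrow(emb)
  N <- length(unique(dataset))
  # Default k follows the documented rule; the cap at n - 1 is an assumption
  if (is.null(k)) k <- max(10, floor(0.01 * n))
  k <- min(k, n - 1)
  # Brute-force Euclidean kNN, suitable for small inputs only
  d <- as.matrix(dist(emb))
  diag(d) <- Inf  # a cell is never its own neighbor
  sameCount <- vapply(seq_len(n), function(i) {
    nn <- order(d[i, ])[seq_len(k)]
    sum(dataset[nn] == dataset[i])
  }, numeric(1))
  xBar <- mean(sameCount)
  # 0 when all neighbors share the cell's dataset, 1 when xBar equals k / N
  1 - (xBar - k / N) / (k - k / N)
}

# Toy input: two datasets that do not mix at all
toyEmb <- rbind(matrix(rnorm(40), ncol = 2),
                matrix(rnorm(40), ncol = 2) + 100)
toyDataset <- rep(c("d1", "d2"), each = 20)
alignmentSketch(toyEmb, toyDataset, k = 5)
```

When same-dataset neighbors are rarer than the k/N expected under perfect mixing, the score exceeds 1, which is how values above 1 can arise in practice.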
The cells to be measured can be selected in several ways, each representing a different scenario:
By default, all cells are considered and the alignment across all datasets is calculated.
Select clustersUse from clusterVar to use cells from the clusters of interest. This measures the alignment across all covered datasets within the specified clusters.
Specify only cellIdx for flexible selection. This measures the alignment across all covered datasets within the specified cells. A non-NULL cellIdx takes precedence over clustersUse.
Specify cellIdx and cellComp at the same time, so that the original dataset source is ignored and the cells specified by each argument are treated as a separate dataset. This measures the alignment between the cells specified by the two arguments. cellComp can contain cells already specified in cellIdx.
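The four scenarios above differ only in which cells are kept and which grouping label each cell receives. A minimal base-R sketch of that precedence follows; selectCellsSketch and its return structure are hypothetical and written purely for illustration, so the actual handling inside calcAlignment may differ.

```r
# Hypothetical helper mirroring the documented selection rules; not part of rliger.
# dataset: dataset label per cell (optionally named by cell ID)
# cluster: cluster label per cell, needed only when clustersUse is given
selectCellsSketch <- function(dataset, cluster = NULL, clustersUse = NULL,
                              cellIdx = NULL, cellComp = NULL) {
  n <- length(dataset)
  lookup <- stats::setNames(seq_len(n), names(dataset))
  # Turn a character, logical or numeric index into integer positions
  resolve <- function(i) unname(lookup[i])
  if (!is.null(cellIdx) && !is.null(cellComp)) {
    # Scenario 4: ignore the dataset source, each argument forms its own group
    idx1 <- resolve(cellIdx)
    idx2 <- resolve(cellComp)
    return(list(
      idx = c(idx1, idx2),
      group = c(rep("cellIdx", length(idx1)), rep("cellComp", length(idx2)))
    ))
  }
  if (!is.null(cellIdx)) {
    # Scenario 3: cellIdx takes precedence over clustersUse
    idx <- resolve(cellIdx)
  } else if (!is.null(clustersUse)) {
    # Scenario 2: keep the cells that fall in the clusters of interest
    idx <- which(cluster %in% clustersUse)
  } else {
    # Scenario 1: all cells
    idx <- seq_len(n)
  }
  list(idx = idx, group = as.character(dataset)[idx])
}

# Toy labels: two datasets of four cells each, alternating clusters
toyDataset <- rep(c("d1", "d2"), each = 4)
toyCluster <- rep(c("A", "B"), 4)
selectCellsSketch(toyDataset, toyCluster, clustersUse = "A")
selectCellsSketch(toyDataset, cellIdx = 1:2, cellComp = 2:4)
```

In the last call, cell 2 appears in both groups, matching the note that cellComp can contain cells already specified in cellIdx.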
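# Attach the package that provides pbmc and calcAlignment()
library(rliger)
# Skip the example when the optional RcppPlanc dependency is not installed
if (requireNamespace("RcppPlanc", quietly = TRUE)) {
    # Preprocess, factorize, and quantile-normalize the datasets
    pbmc <- pbmc %>%
        normalize %>%
        selectGenes %>%
        scaleNotCenter %>%
        runINMF %>%
        quantileNorm
    # Alignment across all cells and all datasets
    calcAlignment(pbmc)
}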