This function constructs the minimum spanning tree(s) on clusters of cells, the first step in Slingshot's trajectory inference procedure. Paths through the MST from an origin cluster to leaf node clusters are interpreted as lineages.
getLineages(data, clusterLabels, ...)
# S4 method for matrix,matrix
getLineages(
data,
clusterLabels,
reducedDim = NULL,
start.clus = NULL,
end.clus = NULL,
dist.method = "slingshot",
use.median = FALSE,
omega = FALSE,
omega_scale = 1.5,
times = NULL,
...
)
# S4 method for matrix,character
getLineages(data, clusterLabels, ...)
# S4 method for matrix,ANY
getLineages(data, clusterLabels, ...)
# S4 method for SlingshotDataSet,ANY
getLineages(data, clusterLabels, ...)
# S4 method for PseudotimeOrdering,ANY
getLineages(data, clusterLabels, ...)
# S4 method for data.frame,ANY
getLineages(data, clusterLabels, ...)
# S4 method for matrix,numeric
getLineages(data, clusterLabels, ...)
# S4 method for matrix,factor
getLineages(data, clusterLabels, ...)
# S4 method for SingleCellExperiment,ANY
getLineages(data, clusterLabels, reducedDim = NULL, ...)
data: a data object containing the matrix of coordinates to be used for lineage inference. Supported types include matrix, SingleCellExperiment, SlingshotDataSet, and PseudotimeOrdering.
clusterLabels: each cell's cluster assignment. This can be a single vector of labels, or a #cells by #clusters matrix representing weighted cluster assignment. Either representation may optionally include a "-1" group meaning "unclustered."
...: additional arguments to specify how lineages are constructed from clusters.
reducedDim: (optional) the dimensionality reduction to be used. Can be a matrix or a character identifying which element of reducedDim(data) is to be used. If multiple dimensionality reductions are present and this argument is not provided, the first element will be used by default.
start.clus: (optional) character, indicates the starting cluster(s) from which lineages will be drawn.
end.clus: (optional) character, indicates which cluster(s) will be forced to be leaf nodes in the graph.
dist.method: (optional) character, specifies the method for calculating distances between clusters. Default is "slingshot"; see createClusterMST for details.
use.median: logical, whether to use the median (instead of the mean) when calculating cluster centroid coordinates.
omega: (optional) numeric or logical, this granularity parameter determines the distance between every real cluster and the artificial cluster, .OMEGA. In practice, this makes omega the maximum allowable distance between two connected clusters. The default, omega = FALSE, imposes no maximum (equivalent to omega = Inf). If omega = TRUE, the maximum edge length will be set to the median edge length of the unsupervised MST times a scaling factor (omega_scale, default = 1.5). This value is provided as a potentially useful rule of thumb for datasets with outlying clusters or multiple, distinct trajectories. See outgroup in createClusterMST.
omega_scale: (optional) numeric, scaling factor to use when omega = TRUE. The maximum edge length will be set to the median edge length of the unsupervised MST times omega_scale (default = 1.5). See outscale in createClusterMST.
times: numeric, vector of external times associated with either clusters or cells. See defineMSTPaths for details.
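The omega = TRUE rule of thumb can be illustrated outside the package. The following is a minimal base-R sketch, not Slingshot's implementation: it assumes plain Euclidean distances between toy cluster centroids (rather than the default "slingshot" distance), uses a hypothetical helper mst_edges() that runs Prim's algorithm, and approximates the stated effect by dropping MST edges longer than the threshold instead of adding the artificial .OMEGA cluster.

```r
# Hypothetical helper (not part of slingshot): Prim's algorithm on a
# symmetric distance matrix. Returns the MST edges as a data.frame.
mst_edges <- function(D) {
  n <- nrow(D)
  in_tree <- c(TRUE, rep(FALSE, n - 1))
  edges <- data.frame(from = integer(0), to = integer(0), length = numeric(0))
  while (!all(in_tree)) {
    # distances from clusters already in the tree to those outside it
    sub <- D[in_tree, !in_tree, drop = FALSE]
    idx <- which(sub == min(sub), arr.ind = TRUE)[1, ]
    from <- which(in_tree)[idx[1]]
    to <- which(!in_tree)[idx[2]]
    edges <- rbind(edges, data.frame(from = from, to = to, length = D[from, to]))
    in_tree[to] <- TRUE
  }
  edges
}

# Toy centroids: four evenly spaced clusters plus one outlying cluster
centroids <- rbind(c(0, 0), c(1, 0), c(2, 0), c(3, 0), c(10, 0))
D <- as.matrix(dist(centroids))

# Unsupervised MST, then the rule-of-thumb maximum edge length
mst <- mst_edges(D)
omega_scale <- 1.5
threshold <- omega_scale * median(mst$length)

# Edges longer than the threshold are dropped, splitting off the outlier
kept <- mst[mst$length <= threshold, ]
```

In this toy example the long edge to the outlying cluster exceeds the threshold, so the remaining edges form one tree on the first four clusters and leave the fifth cluster on its own.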
An object of class PseudotimeOrdering. Although the final pseudotimes have not yet been calculated, the assay slot of this object contains two elements: pseudotime, a matrix of NA values; and weights, a preliminary matrix of lineage assignment weights. The reducedDim and clusterLabels matrices will be stored in the cellData. Additionally, the metadata slot will contain an igraph object named mst, a list of parameter values named slingParams, and a list of lineages (ordered sets of clusters) named lineages.
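The pieces of the returned object can be inspected with the package's accessor functions. This is a short sketch, assuming that slingshot is installed and that slingLineages(), slingMST(), and slingParams() behave as in recent releases of the package; the variable names are illustrative only.

```r
library(slingshot)

# Example data shipped with the package
data("slingshotExample")
rd <- slingshotExample$rd
cl <- slingshotExample$cl

pto <- getLineages(rd, cl, start.clus = '1')

# Ordered sets of clusters, one element per lineage
lineages <- slingLineages(pto)

# The cluster-level minimum spanning tree (an igraph object)
mst <- slingMST(pto)

# Parameter values used to construct the lineages
params <- slingParams(pto)
```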
Given an n by p reduced-dimension data matrix and a set of cluster identities (potentially including a "-1" group for "unclustered"), this function infers a tree (or forest) structure on the clusters. This work is now mostly handled by the more general function, createClusterMST.
The graph of this structure is learned by fitting a (possibly constrained) minimum-spanning tree on the clusters, plus the artificial cluster, .OMEGA, which is a fixed distance away from every real cluster. This effectively limits the maximum branch length in the MST to the chosen distance, meaning that the output may contain multiple trees.
Once the graph is known, lineages are identified in any tree with at least two clusters. For a given tree, if there is an annotated starting cluster, every possible path out of a starting cluster and ending in a leaf that isn't another starting cluster will be returned. If no starting cluster is annotated, one will be chosen by a heuristic method, but this is not recommended.
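The path enumeration described above can be sketched in base R. This is a minimal illustration under stated assumptions, not Slingshot's own code: the tree is a toy edge list, there is a single annotated starting cluster, and find_lineages() is a hypothetical helper that walks depth-first from the starting cluster and records a path each time it reaches a leaf.

```r
# Toy tree on five clusters: 1 - 2 - 3, with cluster 3 branching to 4 and 5
edges <- data.frame(from = c("1", "2", "3", "3"),
                    to   = c("2", "3", "4", "5"),
                    stringsAsFactors = FALSE)

# Hypothetical helper (not part of slingshot): every path from `start`
# to a leaf of the tree, returned as a named list of cluster sequences.
find_lineages <- function(edges, start) {
  # clusters directly connected to `node` (edges are undirected)
  neighbors <- function(node) {
    c(edges$to[edges$from == node], edges$from[edges$to == node])
  }
  lineages <- list()
  walk <- function(path) {
    node <- path[length(path)]
    nxt <- setdiff(neighbors(node), path)  # never revisit a cluster
    if (length(nxt) == 0) {
      # no unvisited neighbors: this is a leaf, so record the path
      lineages[[length(lineages) + 1]] <<- path
    } else {
      for (n in nxt) walk(c(path, n))
    }
  }
  walk(start)
  names(lineages) <- paste0("Lineage", seq_along(lineages))
  lineages
}

lineages <- find_lineages(edges, start = "1")
```

With cluster 1 as the starting cluster, the toy tree yields two lineages that share clusters 1, 2, and 3 and then diverge to clusters 4 and 5.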
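library(slingshot)

# Load the example data: reduced-dimension coordinates and cluster labels
data("slingshotExample")
rd <- slingshotExample$rd
cl <- slingshotExample$cl

# Build the cluster MST and identify lineages starting from cluster 1
pto <- getLineages(rd, cl, start.clus = '1')

# Plot the cells and overlay the inferred lineages
sds <- as.SlingshotDataSet(pto)
plot(rd, col = cl, asp = 1)
lines(sds, type = 'l', lwd = 3)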