This function constructs simultaneous principal curves, the second step in
Slingshot's trajectory inference procedure. It takes a (specifically
formatted) PseudotimeOrdering object, as is returned by the first step,
getLineages. The output is another PseudotimeOrdering object, containing the
simultaneous principal curves, pseudotime estimates, and lineage assignment
weights.
getCurves(data, ...)

# S4 method for PseudotimeOrdering
getCurves(
  data,
  shrink = TRUE,
  extend = "y",
  reweight = TRUE,
  reassign = TRUE,
  thresh = 0.001,
  maxit = 15,
  stretch = 2,
  approx_points = NULL,
  smoother = "smooth.spline",
  shrink.method = "cosine",
  allow.breaks = TRUE,
  ...
)

# S4 method for SingleCellExperiment
getCurves(data, ...)

# S4 method for SlingshotDataSet
getCurves(data, ...)
data
a data object containing lineage information provided by getLineages, to be
used for constructing simultaneous principal curves. Supported types include
SingleCellExperiment, SlingshotDataSet, and PseudotimeOrdering (recommended).
...
Additional parameters to pass to the scatter plot smoothing function,
smoother.
shrink
logical or numeric between 0 and 1, determines whether and how much to shrink
branching lineages toward their average prior to the split (default = TRUE).
extend
character, how to handle root and leaf clusters of lineages when constructing
the initial, piece-wise linear curve. Accepted values are 'y' (default), 'n',
and 'pc1'. See 'Details' for more.
reweight
logical, whether to allow cells shared between lineages to be reweighted
during curve fitting. If TRUE (default), cells shared between lineages will
be iteratively reweighted based on the quantiles of their projection
distances to each curve. See 'Details' for more.
reassign
logical, whether to reassign cells to lineages at each iteration. If TRUE
(default), cells will be added to a lineage when their projection distance to
the curve is less than the median distance for all cells currently assigned
to the lineage. Additionally, shared cells will be removed from a lineage if
their projection distance to the curve is above the 90th percentile and their
weight along the curve is less than 0.1.
thresh
numeric, determines the convergence criterion. Percent change in the total
distance from cells to their projections along curves must be less than
thresh. Default is 0.001, similar to principal_curve.
maxit
numeric, maximum number of iterations (default = 15), see principal_curve.
stretch
numeric factor by which curves can be extrapolated beyond endpoints. Default
is 2, see principal_curve.
approx_points
numeric, whether curves should be approximated by a fixed number of points.
If FALSE (or 0), no approximation will be performed and curves will contain
as many points as the input data. If numeric, curves will be approximated by
this number of points (default = 150 or the number of cells, whichever is
smaller). See 'Details' and principal_curve for more.
smoother
choice of scatter plot smoother. Same as principal_curve, but the "lowess"
option is replaced with "loess" for additional flexibility.
shrink.method
character denoting how to determine the appropriate amount of shrinkage for a
branching lineage. Accepted values are the same as for kernel in density
(default is "cosine"), as well as "tricube" and "density". See 'Details' for
more.
allow.breaks
logical, determines whether curves that branch very close to the origin
should be allowed to have different starting points.
An updated PseudotimeOrdering object containing the pseudotime estimates and
lineage assignment weights in the assays. It will also include the original
information provided by getLineages, as well as the following new elements in
the metadata:
curves
A list of principal_curve objects.
slingParams
Additional parameters used for fitting simultaneous principal curves.
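As a brief illustration of how the returned elements might be inspected, the sketch below uses the accessor functions exported by slingshot (slingPseudotime, slingCurveWeights, slingCurves, slingParams). It assumes the slingshot package is installed and reuses the slingshotExample data from the example on this page; the variable names are only for illustration.

```r
library(slingshot)

data("slingshotExample")
rd <- slingshotExample$rd
cl <- slingshotExample$cl

# run both steps: lineages first, then simultaneous principal curves
pto <- getCurves(getLineages(rd, cl, start.clus = "1"))

pst <- slingPseudotime(pto)    # cells x lineages matrix of pseudotime estimates
wts <- slingCurveWeights(pto)  # cells x lineages matrix of assignment weights
crv <- slingCurves(pto)        # list of principal_curve objects, one per lineage
prm <- slingParams(pto)        # parameters recorded during fitting

dim(pst)
length(crv)
```

By default, slingPseudotime reports NA for cells that have zero weight on a lineage, so the pseudotime and weight matrices are best read together.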
This function constructs simultaneous principal curves (one per lineage).
Cells are mapped to curves by orthogonal projection and pseudotime is
estimated by the arclength along the curve (also called lambda, in the
principal_curve objects).
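To make the projection step concrete, here is a minimal base-R sketch that projects points onto a piece-wise linear curve and reports the arclength of each projection. It is an illustration of the idea only, not the routine slingshot or princurve uses; the helper name project_to_polyline is hypothetical, and the sketch assumes consecutive curve vertices are distinct.

```r
# Project each row of `pts` onto the polyline through the rows of `curve`
# (assumed ordered along the curve). Returns arclength and distance per point.
project_to_polyline <- function(pts, curve) {
  seg_start <- curve[-nrow(curve), , drop = FALSE]
  seg_vec   <- curve[-1, , drop = FALSE] - seg_start
  seg_len   <- sqrt(rowSums(seg_vec^2))
  cum_len   <- c(0, cumsum(seg_len))  # arclength at the start of each segment

  t(apply(pts, 1, function(p) {
    p_mat <- matrix(p, nrow(seg_start), ncol(seg_start), byrow = TRUE)
    # position along each segment, clamped to [0, 1]
    t_seg <- rowSums((p_mat - seg_start) * seg_vec) / seg_len^2
    t_seg <- pmin(1, pmax(0, t_seg))
    proj  <- seg_start + t_seg * seg_vec
    d2    <- rowSums((p_mat - proj)^2)
    k     <- which.min(d2)            # closest segment
    c(lambda = cum_len[k] + t_seg[k] * seg_len[k], dist = sqrt(d2[k]))
  }))
}

curve <- rbind(c(0, 0), c(1, 0), c(1, 1))
pts   <- rbind(c(0.5, 0.2), c(1.3, 0.5), c(-1, 0))
res   <- project_to_polyline(pts, curve)
res
```

The lambda column plays the role of pseudotime, and the dist column is the projection distance that the reweighting and reassignment steps rely on.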
When there is only a single lineage, the curve-fitting algorithm is nearly
identical to that of principal_curve. When there are multiple lineages and
shrink > 0, an additional step is added to the iterative procedure, forcing
curves to be similar in the neighborhood of shared points (i.e., before they
branch).
The approx_points argument, which sets the number of points to be used for
each curve, can have a large effect on computation time. Due to this
consideration, we set the default value to 150 whenever the input dataset
contains more than that many cells. This setting should help with exploratory
analysis while having little to no impact on the final curves. To disable
this behavior and construct curves with the maximum number of points, set
approx_points = FALSE.
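The idea of approximating a curve by a fixed number of points can be sketched in base R by interpolating each coordinate on an evenly spaced grid along the curve. This mirrors the concept only and is not the code principal_curve runs; the curve below is made up, and lambda is simply treated as the position along it.

```r
# Hypothetical curve sampled once per cell: 1000 ordered positions `lambda`
# with 2-D coordinates `s` (illustrative values, not a true arclength).
set.seed(1)
lambda <- sort(runif(1000, 0, 10))
s <- cbind(x = lambda, y = sin(lambda))

# Replace the 1000 curve points with 150 evenly spaced ones
approx_points <- 150
grid <- seq(min(lambda), max(lambda), length.out = approx_points)
s_approx <- apply(s, 2, function(coord) approx(lambda, coord, xout = grid)$y)

dim(s_approx)
```

Projections onto 150 points are much cheaper to compute than projections onto one point per cell, which is where the saving in computation time comes from.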
The extend argument determines how to construct the piece-wise linear curve
used to initialize the iterative algorithm. The initial curve is always based
on the lines between cluster centers and if extend = 'n', this curve will
terminate at the center of the endpoint clusters. Setting extend = 'y' will
allow the first and last segments to extend beyond the cluster center to the
orthogonal projection of the furthest point. Setting extend = 'pc1' is
similar to 'y', but uses the first principal component of the cluster to
determine the direction of the curve beyond the cluster center. These options
typically have limited impact on the final curve, but can occasionally help
with stability issues.
When shrink = TRUE, we compute a percent shrinkage curve, \(w_l(t)\), for
each lineage, a non-increasing function of pseudotime that determines how
much that lineage should be shrunk toward a shared average curve. We set
\(w_l(0) = 1\) (complete shrinkage), so that the curves will always perfectly
overlap the average curve at pseudotime 0. The weighting curve decreases from
1 to 0 over the non-outlying pseudotime values of shared cells (where
outliers are defined by the 1.5*IQR rule). The exact shape of the curve in
this region is controlled by shrink.method, and can follow the shape of any
standard kernel function's cumulative density curve (or more precisely,
survival curve, since we require a decreasing function). Different choices of
shrink.method tend to have no discernible impact on the final curves, in most
cases.
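One way to read the description above is sketched below in base R: take the survival curve of a single standard kernel and stretch it over the whisker range of the shared cells' pseudotimes. This is an interpretation of the text, not slingshot's internal code; the function name percent_shrinkage and its arguments are hypothetical, and the sketch assumes the two whisker ends differ.

```r
percent_shrinkage <- function(pst, shared, kernel = "cosine") {
  # Survival curve of one standard kernel: decreases from about 1 to 0
  dens <- density(0, bw = 1, kernel = kernel)
  surv <- pmax(0, 1 - cumsum(dens$y) / sum(dens$y))
  # Whisker ends of the shared cells' pseudotimes (1.5 * IQR rule)
  whiskers <- grDevices::boxplot.stats(pst[shared])$stats[c(1, 5)]
  # Rescale the kernel's x-axis onto [lower whisker, upper whisker]
  x <- whiskers[1] +
    (dens$x - min(dens$x)) / diff(range(dens$x)) * diff(whiskers)
  # Constant extrapolation: 1 before the range, 0 after it
  approx(x, surv, xout = pst, rule = 2)$y
}

set.seed(42)
pst    <- sort(runif(200, 0, 20))  # hypothetical pseudotimes along one lineage
shared <- pst < 8                  # hypothetical set of shared cells
w <- percent_shrinkage(pst, shared)
range(w)
```

Cells early in pseudotime get weights near 1 (fully shrunk toward the average curve), and cells past the shared region get weight 0 (no shrinkage).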
When reweight = TRUE, weights for shared cells are based on the quantiles of
their projection distances onto each curve. The distances are ranked and
converted into quantiles between 0 and 1, which are then transformed by
1 - q^2. Each cell's weight along a given lineage is the ratio of this value
to the maximum value for this cell across all lineages.
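The reweighting rule can be worked through on a small made-up example. The sketch below is in base R and treats every cell as shared between two lineages; the distance matrix is simulated, and the choice of (rank - 0.5) / n as the quantile is an assumption made here to keep values strictly between 0 and 1, not necessarily the convention slingshot uses.

```r
# Hypothetical projection distances: rows are cells, columns are lineages
set.seed(1)
dists <- matrix(rexp(10 * 2), ncol = 2,
                dimnames = list(NULL, c("Lineage1", "Lineage2")))

# Rank distances within each lineage and convert to quantiles in (0, 1)
q <- apply(dists, 2, function(d) (rank(d) - 0.5) / length(d))

# Transform so that small distances give values near 1
v <- 1 - q^2

# Weight = value relative to the cell's best lineage (row-wise maximum)
w <- v / apply(v, 1, max)
round(w, 3)
```

Every cell ends up with weight 1 on the lineage it fits best, and a weight below 1 on the other lineage in proportion to how much worse it fits there.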
Hastie, T., and Stuetzle, W. (1989). "Principal Curves." Journal of the American Statistical Association, 84:502-516.
library(slingshot)

data("slingshotExample")
rd <- slingshotExample$rd   # reduced-dimensional coordinates
cl <- slingshotExample$cl   # cluster labels

# step 1: identify lineages; step 2: fit simultaneous principal curves
pto <- getLineages(rd, cl, start.clus = '1')
pto <- getCurves(pto)

# plotting
sds <- as.SlingshotDataSet(pto)
plot(rd, col = cl, asp = 1)
lines(sds, type = 'c', lwd = 3)