cmds
Obtain the coordinates of the elements in x in a k-dimensional space which best approximate the distances between objects.
For high-throughput sequencing data we define the distance between two
samples as 1 - correlation between their respective coverages.
This provides a PCA analog for sequencing data.
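The distance defined above can be illustrated with a minimal base R sketch. The coverage matrix here is simulated and hypothetical (rows are genomic positions, columns are samples); the package itself computes coverage from the sequencing data, which this sketch does not attempt to reproduce.

```r
# Hypothetical coverage matrix: rows are positions, columns are samples.
set.seed(1)
base <- rexp(200, rate = 0.1)
coverage <- cbind(
  s1 = base + rnorm(200),      # s1 and s2 share the same underlying profile
  s2 = base + rnorm(200),
  s3 = rexp(200, rate = 0.1)   # s3 is an unrelated sample
)

# Distance between two samples: 1 - correlation of their coverages.
d <- 1 - cor(coverage, method = "pearson")
round(d, 2)
```

Samples with similar coverage profiles (s1 and s2) end up close to each other, while the unrelated sample (s3) lies far from both.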
cmds(x, k=2, logscale=TRUE, mc.cores=1, cor.method='pearson')
x is a RangedDataList object, e.g. each element containing the output of a sequencing run.
If logscale is TRUE, correlations are computed for log(x+1).
Setting mc.cores>1 allows running computations in parallel. Setting mc.cores to too large a value may require a lot of memory.
mdsFit
object, with slots points containing the coordinates, d with the distances between elements, dapprox with the distances between objects in the approximated space, and R.square indicating the percentage of variability in d accounted for by dapprox.
Since the coverage distribution is typically highly asymmetric, setting logscale=TRUE reduces the influence of the highest coverage regions in the distance computation, which by default is based on the Pearson correlation coefficient (cor.method='pearson').
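The returned quantities and the effect of logscale can be sketched in base R. This is only an illustration under stated assumptions: mds_sketch is a hypothetical helper, not the package's implementation; it uses cmdscale() from the stats package; and R.square is assumed here to be the squared correlation between d and dapprox, which may differ from how the package computes it. The coverage data are simulated.

```r
# Hypothetical helper (not part of the package): classical MDS on
# 1 - correlation distances, using base R's cmdscale().
mds_sketch <- function(coverage, k = 2, logscale = TRUE) {
  if (logscale) coverage <- log(coverage + 1)
  d <- as.dist(1 - cor(coverage, method = "pearson"))
  points <- cmdscale(d, k = k)   # coordinates in k dimensions
  dapprox <- dist(points)        # distances in the approximated space
  # Assumption: R.square taken as squared correlation between d and dapprox.
  r2 <- cor(as.numeric(d), as.numeric(dapprox))^2
  list(points = points, d = d, dapprox = dapprox, R.square = r2)
}

# Simulated coverage: two groups of three samples, 300 regions.
set.seed(42)
n <- 300
sig_a <- rpois(n, 20)
sig_b <- rpois(n, 20)
coverage <- cbind(
  a1 = sig_a + rpois(n, 5), a2 = sig_a + rpois(n, 5), a3 = sig_a + rpois(n, 5),
  b1 = sig_b + rpois(n, 5), b2 = sig_b + rpois(n, 5), b3 = sig_b + rpois(n, 5)
)
# One extremely high-coverage region shared by all samples.
coverage[1, ] <- 5000

fit_raw <- mds_sketch(coverage, logscale = FALSE)
fit_log <- mds_sketch(coverage, logscale = TRUE)

# Largest pairwise distance with and without the log transform.
c(raw = max(fit_raw$d), log = max(fit_log$d))
```

Without the log transform, the single extreme region dominates the Pearson correlation, so all samples look almost identical and the distances collapse toward zero. With log(x+1), that region carries less weight and the differences between the two groups become visible in the distances.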
signature(x = "RangedDataList")
Places the elements of the RangedDataList object in a k-dimensional space. The coverage is computed for each element in x, and the pairwise correlations between elements are used to define distances.
data(htSample)
cmds1 <- cmds(htSample)
cmds1
plot(cmds1)