cmds: Classical Multi-Dimensional Scaling

Description

cmds obtains the coordinates of the elements in x in a k-dimensional space that best approximate the distances between objects. For high-throughput sequencing data, the distance between two samples is defined as 1 - correlation between their respective coverages. This provides a PCA analog for sequencing data.
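As a rough illustration of the idea, the sketch below uses only base R (cor and cmdscale) on a simulated coverage matrix. It is not the cmds implementation; the names base_signal, coverage and pts are made up for the example, and real input would be a RangedDataList rather than a matrix.

```r
# Simulated coverage: rows are genomic positions, columns are samples.
# s1/s2 share one underlying signal, s3/s4 share another (hypothetical data).
set.seed(1)
base_signal <- rexp(500, rate = 0.1)
coverage <- cbind(
  s1 = rpois(500, base_signal),
  s2 = rpois(500, base_signal),
  s3 = rpois(500, rev(base_signal)),
  s4 = rpois(500, rev(base_signal))
)

# Distance between two samples: 1 - correlation of their log(coverage + 1).
r <- cor(log(coverage + 1), method = "pearson")
d <- as.dist(1 - r)

# Classical MDS: coordinates in a k = 2 dimensional space.
pts <- cmdscale(d, k = 2)
pts
```

Samples with similar coverage profiles (here s1 and s2, or s3 and s4) should land close together in the resulting coordinates.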

Usage

cmds(x, k=2, logscale=TRUE, mc.cores=1, cor.method='pearson')

Arguments

x

A RangedDataList object, e.g. each element containing the output of a sequencing run.

k

Dimensionality of the reconstructed space, typically set to 2 or 3.

logscale

If set to TRUE, correlations are computed on log(x+1), i.e. on the log-transformed coverage.

mc.cores

Number of cores. Setting mc.cores>1 allows running computations in parallel. Setting mc.cores to too large a value may require a lot of memory.

cor.method

A character string indicating which correlation coefficient is to be computed. One of "pearson" (default), "kendall", or "spearman"; the value can be abbreviated.
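To see why the logscale argument matters, the following base R sketch compares the Pearson correlation of two simulated coverage vectors before and after the log(x+1) transform. The data and the names a and b are invented for illustration; only cor and rpois from base R are used.

```r
# Two independent low-coverage samples (hypothetical data).
set.seed(3)
a <- rpois(200, 5)
b <- rpois(200, 5)

# Add one shared region with extremely high coverage.
a[1] <- 5000
b[1] <- 5000

# On the raw scale the single peak dominates the correlation.
raw_cor <- cor(a, b, method = "pearson")

# On the log(x + 1) scale its influence is much smaller.
log_cor <- cor(log(a + 1), log(b + 1), method = "pearson")

c(raw = raw_cor, log = log_cor)
```

The raw correlation is driven almost entirely by the one extreme position, whereas the log-scale correlation reflects the rest of the profile more evenly.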

Value

The function returns an mdsFit object with the following slots: points, containing the coordinates; d, with the distances between elements; dapprox, with the distances between objects in the approximated space; and R.square, indicating the percentage of variability in d accounted for by dapprox.

Since the coverage distribution is typically highly asymmetric, setting logscale=TRUE reduces the influence of the highest-coverage regions on the distance computation, which by default is based on the Pearson correlation coefficient.
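The text above does not say how R.square is calculated. One plausible reading, assumed here purely for illustration, is the squared Pearson correlation between the original distances and the distances in the approximated space. The sketch uses base R only; x_demo, d, dapprox and r2 are local example names, not slots of a real mdsFit object.

```r
# Hypothetical data: 8 objects described by 5 variables.
set.seed(2)
x_demo <- matrix(rnorm(40), nrow = 8)

d <- dist(x_demo)          # original pairwise distances
pts <- cmdscale(d, k = 2)  # coordinates in the approximated 2-d space
dapprox <- dist(pts)       # distances between objects in that space

# Assumed definition: proportion of variability in d explained by dapprox
# (multiply by 100 to express it as a percentage).
r2 <- cor(as.vector(d), as.vector(dapprox))^2
r2
```

A value close to 1 would indicate that the low-dimensional representation preserves the original distances well.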

Methods

signature(x = "RangedDataList"): Use Classical Multi-Dimensional Scaling to plot each element of the RangedDataList object in a k-dimensional space. The coverage is computed for each element in x, and the pairwise correlations between elements are used to define distances.

Examples

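library(htSeqTools)      # package assumed to provide cmds() and htSample

data(htSample)           # load the example data set
cmds1 <- cmds(htSample)  # defaults: k=2, logscale=TRUE

cmds1                    # print the fitted mdsFit object
plot(cmds1)              # plot the samples in the 2-dimensional space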
