
ClassDiscovery (version 3.4.8)

distanceMatrix: Distance Matrix Computation

Description

This function computes and returns the distance matrix determined by using the specified distance metric to compute the distances between the columns of a data matrix.

Usage

distanceMatrix(dataset, metric, ...)

Value

A distance matrix in the form of an object of class dist, of the sort returned by the dist function or the as.dist function.

Arguments

dataset

A numeric matrix or an ExpressionSet.

metric

A character string defining the distance metric. This can be pearson, sqrt pearson, spearman, absolute pearson, uncentered correlation, weird, cosine, or any of the metrics accepted by the dist function. At present, the latter function accepts euclidean, maximum, manhattan, canberra, binary, or minkowski. Any initial substring that uniquely defines one of the metrics will work.
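The prefix matching described above can be illustrated with base R's pmatch. This is only a sketch of the idea: the vector metrics and the helper resolve_metric are hypothetical names introduced here, and the package's own lookup may be organized differently.

```r
# Hypothetical table of metric names, copied from the list above.
metrics <- c("pearson", "sqrt pearson", "spearman", "absolute pearson",
             "uncentered correlation", "weird", "cosine",
             "euclidean", "maximum", "manhattan", "canberra",
             "binary", "minkowski")

# Hypothetical helper: resolve a unique initial substring to a full name.
resolve_metric <- function(prefix) {
  i <- pmatch(prefix, metrics)  # NA when nothing matches or the match is ambiguous
  if (is.na(i)) stop("unknown or ambiguous metric: ", prefix)
  metrics[i]
}

resolve_metric("euclid")   # unique prefix of "euclidean"
resolve_metric("sqrt")     # unique prefix of "sqrt pearson"
resolve_metric("pearson")  # an exact match takes precedence over partial ones
```

A prefix such as "s" or "ma" matches more than one name in this table, so pmatch returns NA and the helper stops with an error.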

...

Additional parameters to be passed on to dist.

Author

Kevin R. Coombes krc@silicovore.com

BUGS

It would be good to have a better name for the weird metric.

Details

This function differs from dist in two ways, both of which are motivated by common practice in the analysis of microarray or proteomics data. First, it computes distances between column vectors instead of between row vectors. In a typical microarray experiment, the data is organized so the rows represent genes and the columns represent different biological samples. In many applications, relations between the biological samples are more interesting than relationships between genes. Second, distanceMatrix adds additional distance metrics based on correlation.
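The row-versus-column difference can be seen with base R alone. The following is a minimal sketch on made-up data; it uses only dist and t, and the variable names are chosen here for illustration.

```r
set.seed(1)
# Toy data: 100 "genes" (rows) by 5 "samples" (columns).
x <- matrix(rnorm(100 * 5), nrow = 100, ncol = 5)

d_rows <- dist(x)     # base R default: distances between the 100 rows
d_cols <- dist(t(x))  # transpose first: distances between the 5 columns

attr(d_rows, "Size")  # number of objects compared when working on rows
attr(d_cols, "Size")  # number of objects compared when working on columns
```

Transposing before calling dist is the usual way to obtain sample-to-sample distances from a genes-by-samples matrix.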

pearson

The most common metric used in the microarray literature is the pearson distance, which can be computed in terms of the Pearson correlation coefficient as (1-cor(dataset))/2.
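As a rough illustration of the formula above, the following base R sketch computes the same quantity by hand on made-up data. Wrapping the result in as.dist is an assumption made here so that the object has class dist.

```r
set.seed(1)
# Toy data: 100 rows by 5 columns.
x <- matrix(rnorm(100 * 5), nrow = 100, ncol = 5)

# cor() correlates columns, so r is a 5 x 5 matrix.
r <- cor(x)

# Map correlations in [-1, 1] to distances in [0, 1]:
# r = 1 gives 0, r = 0 gives 0.5, r = -1 gives 1.
d_pearson <- as.dist((1 - r) / 2)

range(d_pearson)
```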

uncentered correlation

This metric was introduced in the Cluster and TreeView software from the Eisen lab at Stanford. It is computed using the formulas for Pearson correlation, but assuming that both vectors have mean zero.
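The uncentered correlation itself can be sketched in base R as follows. The helper uncentered_cor is a hypothetical name introduced for illustration and is not part of the package.

```r
set.seed(1)
# Toy data with a nonzero mean, so that centering actually matters.
x <- matrix(rnorm(100 * 5, mean = 2), nrow = 100, ncol = 5)

# Hypothetical helper: Pearson's formula with the means taken to be zero,
# i.e. sum(u * v) / sqrt(sum(u^2) * sum(v^2)) for each pair of columns.
uncentered_cor <- function(m) {
  norms <- sqrt(colSums(m^2))
  crossprod(m) / outer(norms, norms)
}

r_unc <- uncentered_cor(x)

# Sanity check: after centering the columns by hand, the uncentered
# correlation coincides with the ordinary Pearson correlation.
x_centered <- sweep(x, 2, colMeans(x))
all.equal(uncentered_cor(x_centered), cor(x))
```

The text does not say how this correlation is converted into a distance, so the sketch stops at the correlation matrix.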

spearman

The spearman metric uses the same formula as the pearson metric, but substitutes the Spearman rank correlation for the Pearson correlation.
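Assuming that "the same formula" refers to the pearson distance (1-cor(dataset))/2, a base R sketch of the spearman variant on made-up data looks like this. The variable names are illustrative only.

```r
set.seed(1)
# Toy data: 100 rows by 5 columns.
x <- matrix(rnorm(100 * 5), nrow = 100, ncol = 5)

# Spearman rank correlation between columns.
r_s <- cor(x, method = "spearman")

# Assumed conversion to a distance, mirroring the pearson formula.
d_spearman <- as.dist((1 - r_s) / 2)

# Spearman correlation is the Pearson correlation of the column-wise ranks.
all.equal(r_s, cor(apply(x, 2, rank)))
```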

absolute pearson

The absolute pearson metric uses the absolute correlation coefficient; i.e., (1-abs(cor(dataset))).

sqrt pearson

The sqrt pearson metric uses the square root of the pearson distance metric; i.e., sqrt(1-cor(dataset)).
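To show how the absolute pearson and sqrt pearson formulas treat negative correlation differently, the following base R sketch applies both to a vector and its negation. The data and variable names are made up for illustration.

```r
set.seed(1)
v <- rnorm(100)
# Two perfectly anti-correlated columns.
x <- cbind(a = v, b = -v)

r <- cor(x)

# absolute pearson: the sign of the correlation is ignored,
# so anti-correlated columns are treated as close.
d_abs <- as.dist(1 - abs(r))

# sqrt pearson: anti-correlated columns are as far apart as possible.
d_sqrt <- as.dist(sqrt(1 - r))

as.numeric(d_abs)   # approximately 0
as.numeric(d_sqrt)  # approximately sqrt(2)
```

The choice between the two depends on whether strongly anti-correlated samples should count as similar or as dissimilar.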

weird

The weird metric uses the Euclidean distance between the vectors of correlation coefficients; i.e., dist(cor(dataset)).
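The formula dist(cor(dataset)) treats each row of the correlation matrix as a profile and compares profiles with Euclidean distance. A minimal base R sketch on made-up data, with one entry checked by hand:

```r
set.seed(1)
# Toy data: 100 rows by 5 columns.
x <- matrix(rnorm(100 * 5), nrow = 100, ncol = 5)

r <- cor(x)         # 5 x 5 correlation matrix
d_weird <- dist(r)  # Euclidean distance between the rows of r

# Distance between columns 1 and 2 of x, computed directly from
# their correlation profiles.
by_hand <- sqrt(sum((r[1, ] - r[2, ])^2))
all.equal(as.matrix(d_weird)[1, 2], by_hand)
```

Two samples end up close under this metric when they correlate in a similar way with every other sample.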

See Also

Examples

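library(ClassDiscovery)
# Simulate 100 rows by 5 columns; the 100 row means are recycled across
# the columns, so the columns are correlated with one another.
dd <- matrix(rnorm(100*5, rnorm(100)), nrow=100, ncol=5)
distanceMatrix(dd, 'pearson')
distanceMatrix(dd, 'euclid')  # unique prefix of 'euclidean', passed on to dist
distanceMatrix(dd, 'sqrt')    # unique prefix of 'sqrt pearson'
distanceMatrix(dd, 'weird')
rm(dd) # cleanup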
