lineup (version 0.44)

distee: Calculate distance between two gene expression data sets

Description

Calculate a distance between all pairs of individuals for two gene expression data sets.

Usage

distee(
  e1,
  e2 = NULL,
  d.method = c("rmsd", "cor"),
  labels = c("e1", "e2"),
  verbose = TRUE
)

Value

A matrix with nrow(e1) rows and nrow(e2) columns, containing the distances. The individual IDs are in the row and column names. The matrix is assigned class "lineupdist".

Arguments

e1

Numeric matrix of gene expression data, as individuals x genes. The row and column names must contain individual and gene identifiers.
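
As an illustration of the expected layout, the following sketch builds a small matrix in the individuals x genes orientation with both sets of names present. The values and the names `raw`, `geneA`, `ind1`, etc. are made up for this example; they are not part of the lineup package.

```r
# Toy data stored as genes x individuals (values are invented for illustration)
raw <- matrix(c(5.1, 4.8, 6.0,
                7.2, 7.0, 6.9),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("geneA", "geneB"),
                              c("ind1", "ind2", "ind3")))

# Transpose so that rows are individuals and columns are genes
e <- t(raw)

# Basic checks on the layout described above
stopifnot(is.numeric(e),
          !is.null(rownames(e)),   # individual identifiers
          !is.null(colnames(e)))   # gene identifiers
dim(e)
```

If the data start out as genes x individuals, a transpose like this puts them in the orientation described for e1.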

e2

(Optional) Like e1. An appreciable number of individuals and genes must be in common.

d.method

Method for calculating the inter-individual distance: "rmsd" for RMS difference or "cor" for correlation.

labels

Two character strings, to use as labels for the two data matrices in subsequent output.

verbose

If TRUE, give verbose output.

Author

Karl W Broman, broman@wisc.edu

Details

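We calculate the pairwise distance between all individuals (rows) in e1 and all individuals in e2. This distance is either the RMS difference (d.method="rmsd") or the correlation (d.method="cor"). Note that the correlation is really a measure of similarity rather than distance, so larger values indicate closer agreement.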
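
To make the two measures concrete, here is a minimal base R sketch of a row-by-row comparison between two matrices. The helper `pairwise_row_dist` is hypothetical and is not the package's implementation; it assumes complete data and a plain root-mean-square over the shared genes, so the handling of missing values and any scaling in distee() may differ.

```r
# Hypothetical helper (not part of lineup): compare every row of e1
# with every row of e2, using only the genes (columns) they share.
pairwise_row_dist <- function(e1, e2, method = c("rmsd", "cor")) {
  method <- match.arg(method)

  # Restrict both matrices to the genes in common, in the same order
  genes <- intersect(colnames(e1), colnames(e2))
  e1 <- e1[, genes, drop = FALSE]
  e2 <- e2[, genes, drop = FALSE]

  if (method == "cor") {
    # cor() works on columns, so transpose to correlate individuals
    return(cor(t(e1), t(e2)))
  }

  # RMS difference: square root of the mean squared difference across genes
  d <- matrix(NA_real_, nrow(e1), nrow(e2),
              dimnames = list(rownames(e1), rownames(e2)))
  for (i in seq_len(nrow(e1))) {
    for (j in seq_len(nrow(e2))) {
      d[i, j] <- sqrt(mean((e1[i, ] - e2[j, ])^2))
    }
  }
  d
}

# Toy data: b holds two of a's individuals, shifted by a constant 0.5
set.seed(1)
a <- matrix(rnorm(12), nrow = 3,
            dimnames = list(c("i1", "i2", "i3"),
                            c("g1", "g2", "g3", "g4")))
b <- a[c("i1", "i2"), ] + 0.5

d_rmsd <- pairwise_row_dist(a, b, "rmsd")
d_cor  <- pairwise_row_dist(a, b, "cor")
```

In this toy case the matching individuals differ by a constant shift, so their RMS difference is 0.5 while their correlation is 1. This shows how the two measures can disagree: correlation ignores a constant offset, whereas RMS difference does not.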

See Also

pulldiag(), omitdiag(), summary.lineupdist(), plot2dist(), disteg(), corbetw2mat()

Examples

# load the example data
data(expr1, expr2)
expr1 <- expr1[,1:100]; expr2 <- expr2[,1:100]

# find samples in common
id <- findCommonID(expr1, expr2)

# calculate correlations between columns of expr1 and columns of expr2
thecor <- corbetw2mat(expr1[id$first,], expr2[id$second,])

# subset to genes with correlation > 0.8 and scale values
expr1s <- expr1[,thecor > 0.8]/1000
expr2s <- expr2[,thecor > 0.8]/1000

# calculate distance (using "RMS difference" as a measure)
d1 <- distee(expr1s, expr2s, d.method="rmsd", labels=c("1","2"))

# calculate distance (using "correlation" as a measure...really similarity)
d2 <- distee(expr1s, expr2s, d.method="cor", labels=c("1", "2"))

# pull out the smallest 8 self-self correlations
sort(pulldiag(d2))[1:8]

# summary of results
summary(d1)
summary(d2)

# plot histograms of RMS distances
plot(d1)

# plot histograms of correlations
plot(d2)

# plot distances against one another
plot2dist(d1, d2)