calc.locallod: Calculate LOD score at physical position of each gene

Description

For gene expression data with physical positions of the genes, calculate the LOD score at those positions to assess evidence for local eQTL.

Usage

calc.locallod(cross, pheno, pmark, addcovar=NULL, intcovar=NULL,
              verbose=TRUE)

Arguments

cross

An object of class "cross" containing data for a QTL experiment. See the help file for read.cross in the R/qtl package (http://www.rqtl.org). There must be a phenotype nam

pheno

A data frame of phenotypes (generally gene expression data), stored as individuals x phenotypes. The row names must contain individual identifiers.

pmark

Pseudomarkers that are closest to the genes in pheno, as output by find.gene.pseudomarker.

addcovar

Additive covariates passed to scanone.

intcovar

Interactive covariates passed to scanone.

verbose

If TRUE, print tracing information.

Value

A vector of LOD scores. The names indicate the gene names (rows in pheno).

Details

cross and pheno must contain exactly the same individuals in the same order. (Use findCommonID to line them up.)

We consider the expression phenotypes in batches: those whose closest pseudomarker is the same.

We use Haley-Knott regression to calculate the LOD scores.

Actually, we use a bit of a contortion of the data to force the scanone function in R/qtl to calculate the LOD score at a single position.

We omit any transcripts that map to the X chromosome; we can only handle autosomal loci for now.

Examples

Run this code

##############################
# simulate an eQTL data set
##############################
# genetic map
L <- seq(120, length=8, by=-10)
map <- sim.map(L, n.mar=L/10+1, include.x=FALSE, eq.spacing=TRUE)

# physical map: make all intervals 2x longer
pmap <- rescalemap(map, 2)

# arbitrary locations of 40 local eQTL
thepos <- unlist(map)
theppos <- unlist(pmap)
thechr <- rep(seq(along=map), sapply(map, length))
eqtl.loc <- sort(sample(seq(along=thepos), 40))

x <- sim.cross(map, n.ind=250, type="f2",
               model=cbind(thechr[eqtl.loc], thepos[eqtl.loc], 0, 0))
x$pheno$id <- factor(paste("Mouse", 1:250, sep=""))

# first 20 have eQTL with huge effects
# second 20 have essentially no effect
edata <- cbind((x$qtlgeno[,1:20] - 2)*10+rnorm(prod(dim(x$qtlgeno[,1:20]))),
               (x$qtlgeno[,21:40] - 2)*0.1+rnorm(prod(dim(x$qtlgeno[,21:40]))))
dimnames(edata) <- list(x$pheno$id, paste("e", 1:ncol(edata), sep=""))

# gene locations
theloc <- data.frame(chr=thechr[eqtl.loc], pos=theppos[eqtl.loc])
rownames(theloc) <- colnames(edata)

# mix up 5 individuals in expression data
edata[1:3,] <- edata[c(2,3,1),]
edata[4:5,] <- edata[5:4,]

##############################
# now, the start of the analysis
##############################
x <- calc.genoprob(x, step=1)

# find nearest pseudomarkers
pmark <- find.gene.pseudomarker(x, pmap, theloc, "prob")

# calculate LOD score for local eQTL
locallod <- calc.locallod(x, edata, pmark)

# take those with LOD > 100 [which will be the first 20]
edatasub <- edata[,locallod>100,drop=FALSE]

# calculate distance between individuals
#     (prop'n mismatches between obs and inferred eQTL geno)
d <- disteg(x, edatasub, pmark)

# plot distances
plot(d)

# summary of apparent mix-ups
summary(d)

# plot of classifier for first eQTL
plotEGclass(d)

Run the code above in your browser using DataLab