camera: Competitive Gene Set Test Accounting for Inter-gene Correlation

Description

Test whether a set of genes is highly ranked relative to other genes in terms of differential expression, accounting for inter-gene correlation.

Usage

"camera"(y, index, design, contrast = ncol(design), weights = NULL, use.ranks = FALSE, allow.neg.cor=FALSE, inter.gene.cor=0.01, trend.var = FALSE, sort = TRUE, ...)
interGeneCorrelation(y, design)

Arguments

a numeric matrix of log-expression values or log-ratios of expression values, or any data object containing such a matrix. Rows correspond to probes and columns to samples. Any type of object that can be processed by getEAWP is acceptable.

index

an index vector or a list of index vectors. Can be any vector such that y[index,] selects the rows corresponding to the test set. The list can be made using ids2indices.

design

design matrix.

contrast

contrast of the linear model coefficients for which the test is required. Can be an integer specifying a column of design, or else a numeric vector of same length as the number of columns of design.

weights

can be a numeric matrix of individual weights, of same size as y, or a numeric vector of array weights with length equal to ncol(y), or a numeric vector of gene weights with length equal to nrow(y).

use.ranks

do a rank-based test (TRUE) or a parametric test (FALSE)?

allow.neg.cor

should reduced variance inflation factors be allowed for negative correlations?

inter.gene.cor

numeric, optional preset value for the inter-gene correlation within tested sets. If NA or NULL, then an inter-gene correlation will be estimated for each tested set.

trend.var

logical, should an empirical Bayes trend be estimated? See eBayes for details.

sort

logical, should the results be sorted by p-value?

...

other arguments are not currently used

Value

NGenes: number of genes in set.
Correlation: inter-gene correlation (only included if the inter.gene.cor was not preset).
Direction: direction of change ("Up" or "Down").
PValue: two-tailed p-value.
FDR: Benjamini and Hochberg FDR adjusted p-value.
vif: variance inflation factor.
correlation: inter-gene correlation.

Details

camera and interGeneCorrelation implement methods proposed by Wu and Smyth (2012). camera performs a competitive test in the sense defined by Goeman and Buhlmann (2007). It tests whether the genes in the set are highly ranked in terms of differential expression relative to genes not in the set. It has similar aims to geneSetTest but accounts for inter-gene correlation. See roast for an analogous self-contained gene set test.

The function can be used for any microarray experiment which can be represented by a linear model. The design matrix for the experiment is specified as for the lmFit function, and the contrast of interest is specified as for the contrasts.fit function. This allows users to focus on differential expression for any coefficient or contrast in a linear model by giving the vector of test statistic values.

camera estimates p-values after adjusting the variance of test statistics by an estimated variance inflation factor. The inflation factor depends on estimated genewise correlation and the number of genes in the gene set.

By default, camera uses interGeneCorrelation to estimate the mean pair-wise correlation within each set of genes. camera can alternatively be used with a preset correlation specified by inter.gene.cor that is shared by all sets. This usually works best with a small value, say inter.gene.cor=0.01.

If interGeneCorrelation=NA, then camera will estimate the inter-gene correlation for each set. In this mode, camera gives rigorous error rate control for all sample sizes and all gene sets. However, in this mode, highly co-regulated gene sets that are biological interpretable may not always be ranked at the top of the list.

With interGeneCorrelation=0.01, camera will rank biologically interpetable sets more highly. This gives a useful compromise between strict error rate control and interpretable gene set rankings.

References

Wu, D, and Smyth, GK (2012). Camera: a competitive gene set test accounting for inter-gene correlation. Nucleic Acids Research 40, e133. http://nar.oxfordjournals.org/content/40/17/e133

Goeman, JJ, and Buhlmann, P (2007). Analyzing gene expression data in terms of gene sets: methodological issues. Bioinformatics 23, 980-987.

Examples

Run this code

y <- matrix(rnorm(1000*6),1000,6)
design <- cbind(Intercept=1,Group=c(0,0,0,1,1,1))

# First set of 20 genes are genuinely differentially expressed
index1 <- 1:20
y[index1,4:6] <- y[index1,4:6]+1

# Second set of 20 genes are not DE
index2 <- 21:40
 
camera(y, index1, design)
camera(y, index2, design)

camera(y, list(set1=index1,set2=index2), design, inter.gene.cor=NA)
camera(y, list(set1=index1,set2=index2), design, inter.gene.cor=0.01)

Run the code above in your browser using DataLab