exactTest(object, pair=1:2, dispersion="auto", rejection.region="doubletail",
big.count=900, prior.count=0.125)
exactTestDoubleTail(y1, y2, dispersion=0, big.count=900)
exactTestBySmallP(y1, y2, dispersion=0)
exactTestByDeviance(y1, y2, dispersion=0)
exactTestBetaApprox(y1, y2, dispersion=0)
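object: an object of class DGEList.
pair: the pair of groups to be compared. If character, the names of two groups (e.g. two levels of object$samples$group); if numeric, then the groups to be compared are chosen by finding the levels of object$samples$group corresponding to those numeric values and using those levels as the groups to be compared; if NULL, then the first two levels of object$samples$group (a factor) are used. Note that the first group listed in the pair is the baseline for the comparison: if the pair is c("A","B") then the comparison is B - A, so genes with positive log-fold change are up-regulated in group B compared with group A (and vice versa for genes with negative log-fold change).
dispersion: either numeric dispersion value(s) or a character string indicating which dispersions to take from the data object. Allowable character values are "common", "trended", "tagwise" or "auto". The default behavior ("auto") is to use the most complex dispersions found in the data object.
rejection.region: type of rejection region for the two-sided exact test. Possible values are "doubletail", "smallp" or "deviance".
big.count: count size above which the beta approximation (exactTestBetaApprox) is used.
y1: matrix of counts for the first group. Libraries are assumed to be equal in size, e.g. adjusted counts from equalizeLibSizes.
y2: matrix of counts for the second group. Libraries are assumed to be equal in size, e.g. adjusted counts from equalizeLibSizes. Must have the same number of rows as y1.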
exactTest produces an object of class DGEExact containing the following components: a table with columns for the log2-fold-change, logFC, the average log2-counts-per-million, logCPM, and the two-sided p-value, PValue, together with any gene annotation taken from object.
exactTestDoubleTail etc. produce a numeric vector of genewise p-values, one for each row of y1 and y2.
The test can be viewed as a generalization of the exact binomial test (implemented in binomTest) to overdispersed counts.
exactTest
is the main user-level function, and produces an object containing all the necessary components for downstream analysis.
exactTest calls one of the low level functions exactTestDoubleTail, exactTestBetaApprox, exactTestBySmallP or exactTestByDeviance to do the p-value computation.
The low level functions all assume that the libraries have been normalized to have the same size, i.e., to have the same expected column sum under the null hypothesis. exactTest equalizes the library sizes using equalizeLibSizes before calling the low level functions.
The functions exactTestDoubleTail, exactTestBySmallP and exactTestByDeviance correspond to different ways to define the two-sided rejection region when the two groups have different numbers of samples.
exactTestBySmallP implements the method of small probabilities as proposed by Robinson and Smyth (2008). This method corresponds exactly to binomTest as the dispersion approaches zero, but gives poor results when the dispersion is very large.
exactTestDoubleTail computes two-sided p-values by doubling the smaller tail probability.
exactTestByDeviance uses the deviance goodness of fit statistic to define the rejection region, and is therefore equivalent to a conditional likelihood ratio test.
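The difference between the doubled-tail and small-probability definitions can be sketched in base R for the limiting case where the dispersion is zero. This is only an illustration under stated assumptions, not the edgeR implementation: the helper two_sided_binom is hypothetical, and it assumes equal library sizes so that, conditional on the row total s, the group 1 total x is Binomial(s, n1/(n1+n2)).

```r
# Hypothetical helper, not part of edgeR: two-sided p-values for the
# zero-dispersion (binomial) special case of the conditional test.
# x  = total count in group 1, s = total count over both groups,
# n1 = number of libraries in group 1, n2 = number in group 2.
two_sided_binom <- function(x, s, n1, n2) {
  p <- n1 / (n1 + n2)                       # null share of counts in group 1
  probs <- dbinom(0:s, size = s, prob = p)  # conditional distribution of x
  lower <- sum(probs[1:(x + 1)])            # P(X <= x)
  upper <- sum(probs[(x + 1):(s + 1)])      # P(X >= x)
  # "doubletail": double the smaller tail probability, capped at 1
  doubletail <- min(1, 2 * min(lower, upper))
  # "smallp": sum the probabilities of all outcomes no more likely than x
  smallp <- min(1, sum(probs[probs <= probs[x + 1] * (1 + 1e-7)]))
  c(doubletail = doubletail, smallp = smallp)
}

# Unequal group sizes (2 versus 3 libraries), so the two regions can differ
res <- two_sided_binom(x = 3, s = 20, n1 = 2, n2 = 3)
print(res)
```

With equal group sizes the conditional distribution is symmetric and the two definitions coincide; they can differ only when n1 and n2 are unequal, which is the situation the text describes.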
Note that rejection.region="smallp" is no longer recommended. It is preserved as an option only for backward compatibility with early versions of edgeR.
rejection.region="deviance" has good theoretical statistical properties but is relatively slow to compute.
rejection.region="doubletail" is just slightly more conservative than rejection.region="deviance", but is recommended because of its much greater speed.
For general remarks on different types of rejection regions for exact tests see Gibbons and Pratt (1975).
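To see how the choice of rejection region affects the p-values in practice, one possible check is to run exactTest with each option on the same simulated data. The sketch below assumes edgeR is installed; the seed, the group layout (three libraries versus two) and the dispersion of 0.2 are arbitrary choices made for illustration.

```r
library(edgeR)

set.seed(2024)  # arbitrary seed, only for reproducibility
# Three libraries in group 1 and two in group 2, so the regions can differ
y <- matrix(rnbinom(100, size = 1/0.2, mu = 10), nrow = 20, ncol = 5)
d <- DGEList(counts = y, group = c(1, 1, 1, 2, 2), lib.size = rep(1000, 5))

# One column of p-values per rejection region
regions <- c("doubletail", "smallp", "deviance")
pvals <- sapply(regions, function(r) {
  exactTest(d, dispersion = 0.2, rejection.region = r)$table$PValue
})
head(round(pvals, 4))
```

Comparing the columns row by row gives a rough feel for how closely the three definitions agree on a given data set.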
exactTestBetaApprox implements an asymptotic beta distribution approximation to the conditional count distribution. It is called by the other functions for rows with both group counts greater than big.count.
Gibbons, JD and Pratt, JW (1975). P-values: interpretation and methodology. The American Statistician 29, 20-25.
Robinson, MD and Smyth, GK (2008). Small-sample estimation of negative binomial dispersion, with applications to SAGE data. Biostatistics 9, 321-332.
See also equalizeLibSizes, binomTest.
# generate raw counts from NB, create list object
y <- matrix(rnbinom(80,size=1/0.2,mu=10),nrow=20,ncol=4)
d <- DGEList(counts=y, group=c(1,1,2,2), lib.size=rep(1000,4))
de <- exactTest(d, dispersion=0.2)
topTags(de)
# same p-values using low-level function directly
p.value <- exactTestDoubleTail(y[,1:2], y[,3:4], dispersion=0.2)
sort(p.value)[1:10]
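The baseline convention for the pair argument can be checked in the same way. The following is a sketch rather than part of the original examples: it assumes edgeR is installed, and the group labels "A" and "B", the seed and the dispersion value are arbitrary illustration choices.

```r
library(edgeR)

set.seed(42)  # arbitrary seed, only for reproducibility
y <- matrix(rnbinom(80, size = 1/0.2, mu = 10), nrow = 20, ncol = 4)
d <- DGEList(counts = y, group = factor(c("A", "A", "B", "B")))

# First element of pair is the baseline: c("A","B") tests B - A
ab <- exactTest(d, pair = c("A", "B"), dispersion = 0.2)
# Reversing the pair makes B the baseline: the comparison is A - B
ba <- exactTest(d, pair = c("B", "A"), dispersion = 0.2)

# Side-by-side log-fold changes for the first few genes
head(cbind(B_minus_A = ab$table$logFC, A_minus_B = ba$table$logFC))
```

Swapping the order of the pair should flip the sign of logFC, so a positive value in the first column means the gene is up-regulated in group B relative to group A.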