"glmQLFit"(y, design=NULL, dispersion=NULL, offset=NULL, abundance.trend=TRUE, robust=FALSE, winsor.tail.p=c(0.05, 0.1), ...)
"glmQLFit"(y, design=NULL, dispersion=NULL, offset=NULL, lib.size=NULL, abundance.trend=TRUE, AveLogCPM=NULL, robust=FALSE, winsor.tail.p=c(0.05, 0.1), ...)
glmQLFTest(glmfit, coef=ncol(glmfit$design), contrast=NULL, poisson.bound=TRUE)
DGEList
object with (at least) elements counts
(table of unadjusted counts) and samples
(data frame containing information about experimental group, library size and normalization factor for the library size)NULL
, then will be extracted from the DGEList
object y
, with order of precedence: trended dispersions, common dispersion, a constant value of 0.05.y
giving offsets for the log-linear models. Can be a scalor or a vector of length ncol(y)
, in which case it is expanded out to a matrix. If NULL
will be computed by getOffset(y)
.ncol(y)
giving library sizes. Only used if offset=NULL
, in which case offset
is set to log(lib.size)
. Defaults to colSums(y)
.y
. If NULL
will be computed by aveLogCPM(y)
.robust=TRUE
.glmFit
.DGEGLM
object, usually output from glmQLFit
.contrast
is not NULL
.TRUE
then the p-value returned will never be less than would be obtained for a likelihood ratio test with NB dispersion equal to zero.glmQLFit
produces an object of class DGEGLM
with the same components as produced by glmFit
, plus:
df.prior
is a vector of length nrow(y)
if robust=TRUE
, otherwise it has length 1.
var.prior
is a vector of length nrow(y)
if abundance.trend=TRUE
, otherwise it has length 1.glmQFTest
produce an object of class DGELRT
with the same components as produced by glmLRT
, except that the table$LR
column becomes table$F
and contains quasi-likelihood F-statistics.
It also stores df.total
, a numeric vector containing the denominator degrees of freedom for the F-test, equal to df.prior + df.residual.zeros
.
glmQLFit
and glmQLFTest
implement the quasi-likelihood (QL) methods of Lund et al (2012), with some enhancements and with slightly different glm, trend and FDR methods.
See Lun et al (2015) for a tutorial describing the use of glmQLFit
and glmQLFit
as part of a complete analysis pipeline.
Another case study using glmQLFit
and glmQLFTest
is given in Section 4.7 of the edgeR User's Guide.glmQLFit
is similar to glmFit
except that it also estimates QL dispersion values.
It calls the limma function squeezeVar
to conduct empirical Bayes moderation of the genewise QL dispersions.
If robust=TRUE
, then the robust hyperparameter estimation features of squeezeVar
are used (Phipson et al, 2013).
If abundance.trend=TRUE
, then a prior trend is estimated based on the average logCPMs.
glmQLFit
gives special attention to handling of zero counts, and in particular to situations when fitted values of zero provide no useful residual degrees of freedom for estimating the QL dispersion.
The usual residual degrees of freedom are returned as df.residual
while the adjusted residual degrees of freedom are returned as df.residuals.zeros
.
glmQLFTest
is similar to glmLRT
except that it replaces likelihood ratio tests with empirical Bayes quasi-likelihood F-tests.
The p-values from glmQLFTest
are always greater than or equal to those that would be obtained from glmLRT
using the same negative binomial dispersions.
Lund, SP, Nettleton, D, McCarthy, DJ, and Smyth, GK (2012). Detecting differential expression in RNA-sequence data using quasi-likelihood with shrunken dispersion estimates. Statistical Applications in Genetics and Molecular Biology Volume 11, Issue 5, Article 8. http://www.statsci.org/smyth/pubs/QuasiSeqPreprint.pdf
Phipson, B, Lee, S, Majewski, IJ, Alexander, WS, and Smyth, GK (2013). Empirical Bayes in the presence of exceptional cases, with application to microarray data. Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia. http://www.statsci.org/smyth/pubs/RobustEBayesPreprint.pdf
topTags
displays results from glmQLFTest
.plotQLDisp
can be used to visualize the distribution of QL dispersions after EB shrinkage from glmQLFit
.
The QuasiSeq
package gives an alternative implementation of the Lund et al (2012) methods.
nlibs <- 4
ngenes <- 1000
dispersion.true <- 1/rchisq(ngenes, df=10)
design <- model.matrix(~factor(c(1,1,2,2)))
# Generate count data
y <- rnbinom(ngenes*nlibs,mu=20,size=1/dispersion.true)
y <- matrix(y,ngenes,nlibs)
d <- DGEList(y)
d <- calcNormFactors(d)
# Fit the NB GLMs with QL methods
d <- estimateDisp(d, design)
fit <- glmQLFit(d, design)
results <- glmQLFTest(fit)
topTags(results)
fit <- glmQLFit(d, design, robust=TRUE)
results <- glmQLFTest(fit)
topTags(results)
fit <- glmQLFit(d, design, abundance.trend=FALSE)
results <- glmQLFTest(fit)
topTags(results)
Run the code above in your browser using DataLab