dispCoxReid(y, design=NULL, offset=NULL, weights=NULL, AveLogCPM=NULL, interval=c(0,4),
tol=1e-5, min.row.sum=5, subset=10000)
dispDeviance(y, design=NULL, offset=NULL, interval=c(0,4), tol=1e-5, min.row.sum=5,
subset=10000, AveLogCPM=NULL, robust=FALSE, trace=FALSE)
dispPearson(y, design=NULL, offset=NULL, min.row.sum=5, subset=10000,
AveLogCPM=NULL, tol=1e-6, trace=FALSE, initial.dispersion=0.1)
glmFit
.glmFit
. Defaults to log(colSums(y))
.optimize
.optimize
or uniroot
.estimateGLMCommonDisp
.dispCoxReid
maximizes the Cox-Reid adjusted profile likelihood (Cox and Reid, 1987).
dispPearson
sets the average Pearson goodness of fit statistics to its (asymptotic) expected value.
This is also known as the pseudo-likelihood estimator.
dispDeviance
sets the average residual deviance statistic to its (asymptotic) expected values.
This is also known as the quasi-likelihood estimator.
Robinson and Smyth (2008) and McCarthy et al (2011) showed that the Pearson (pseudo-likelihood) estimator typically under-estimates the true dispersion.
It can be seriously biased when the number of libraries (ncol(y)
is small.
On the other hand, the deviance (quasi-likelihood) estimator typically over-estimates the true dispersion when the number of libraries is small.
Robinson and Smyth (2008) and McCarthy et al (2011) showed the Cox-Reid estimator to be the least biased of the three options.
dispCoxReid
uses optimize
to maximize the adjusted profile likelihood.
dispDeviance
uses uniroot
to solve the estimating equation.
The robust options use an order statistic instead the mean statistic, and have the effect that a minority of genes with very large (outlier) dispersions should have limited influence on the estimated value.
dispPearson
uses a globally convergent Newton iteration.
Robinson MD and Smyth GK (2008). Small-sample estimation of negative binomial dispersion, with applications to SAGE data. Biostatistics, 9, 321-332
McCarthy, DJ, Chen, Y, Smyth, GK (2012). Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation.
Nucleic Acids Research.
estimateGLMCommonDisp
, optimize
, uniroot
ngenes <- 100
nlibs <- 4
y <- matrix(rnbinom(ngenes*nlibs,mu=10,size=10),nrow=ngenes,ncol=nlibs)
group <- factor(c(1,1,2,2))
lib.size <- rowSums(y)
design <- model.matrix(~group)
disp <- dispCoxReid(y, design, offset=log(lib.size), subset=100)
Run the code above in your browser using DataLab