## S4 method for signature 'DESeqDataSet'
estimateDispersions(object, fitType = c("parametric",
"local", "mean"), maxit = 100, quiet = FALSE, modelMatrix = NULL)
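For the "parametric" fit type, the coefficients asymptDisp and extraPois are given in the attribute coefficients of the dispersionFunction of the object.
For the "local" fit type, the points are weighted by normalized mean count in the local regression.
When modelMatrix is NULL, the model matrix is constructed from design(object).
The returned object carries the dispersion information as metadata columns accessible via mcols, or the final dispersions accessible via dispersions.
Typical usage:
dds <- estimateDispersions(dds)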
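To make the role of asymptDisp and extraPois concrete, the following base-R sketch evaluates a trend of the form dispersion = asymptDisp + extraPois / mean, which is the relation the parametric fit type uses. The coefficient values and the helper parametric_trend are hypothetical and chosen only for illustration; they are not taken from any fitted object.

```r
# Hypothetical coefficients, chosen only for illustration
asymptDisp <- 0.1   # dispersion approached at high mean counts
extraPois  <- 4     # extra dispersion that dominates at low mean counts

# Illustrative helper (not part of DESeq2): trend value at a given mean
parametric_trend <- function(mu) asymptDisp + extraPois / mu

means <- c(1, 10, 100, 1000)
trend <- parametric_trend(means)
print(data.frame(mean = means, dispersion = trend))
```

On a fitted object, the actual coefficients would be read from the coefficients attribute of dispersionFunction(dds), for example with attr(dispersionFunction(dds), "coefficients").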
The fitting proceeds as follows: for each gene, an estimate of the dispersion
is found which maximizes the Cox-Reid adjusted profile likelihood
(the methods of Cox-Reid adjusted profile likelihood maximization for
estimation of dispersion in RNA-Seq data were developed by McCarthy,
et al. (2012), first implemented in the edgeR package in 2010);
a trend line capturing the dispersion-mean relationship is fit to the maximum likelihood estimates;
a normal prior is determined for the log dispersion estimates, centered
on the predicted value from the trended fit,
with variance equal to the difference between the observed variance of the
log dispersion estimates and the expected sampling variance;
finally, maximum a posteriori dispersion estimates are returned.
This final dispersion parameter is used in subsequent tests.
The final dispersion estimates can be accessed from an object using dispersions.
The fitted dispersion-mean relationship is also used in varianceStabilizingTransformation.
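The prior-variance step of the fitting procedure (observed variance of the log dispersion estimates minus the expected sampling variance) can be illustrated with a small base-R simulation. Everything here is assumed for illustration: the simulated values, the variable names, the flat trend, the floor at zero, and the sampling variance, which is simply set to a known constant rather than derived the way DESeq2 does internally.

```r
set.seed(42)
n_genes <- 2000

# Assumed for illustration: true log dispersions scatter around the trend
# with this variance, which the prior variance estimate should recover
true_prior_var <- 0.5
# Assumed for illustration: sampling noise of the gene-wise log estimates
sampling_var <- 0.3

log_trend    <- rep(log(0.1), n_genes)  # flat hypothetical trend
log_true     <- log_trend + rnorm(n_genes, sd = sqrt(true_prior_var))
log_genewise <- log_true + rnorm(n_genes, sd = sqrt(sampling_var))

# Observed variance of the log estimates around the trend, minus the
# expected sampling variance, estimates the prior variance
observed_var <- var(log_genewise - log_trend)
prior_var    <- max(observed_var - sampling_var, 0)  # floor is an assumption

print(c(observed = observed_var, prior = prior_var))
```

The subtraction matters because the gene-wise estimates are noisy: using the observed variance directly would overstate how much true dispersions differ from the trend and would shrink the estimates too little.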
All of the intermediate values (gene-wise dispersion estimates, fitted dispersion
estimates from the trended fit, etc.) are stored in mcols(dds), with
information about these columns in mcols(mcols(dds)).
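As a minimal sketch of how these stored values might be inspected, the following uses the simulated dataset from makeExampleDESeqDataSet. It assumes the DESeq2 package is installed; the seed is arbitrary and only makes the simulated counts reproducible.

```r
library(DESeq2)

set.seed(1)  # arbitrary seed for the simulated example data
dds <- makeExampleDESeqDataSet()
dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds)

# Intermediate and final values, one row per gene
head(mcols(dds))

# Descriptions of the columns shown above
mcols(mcols(dds))

# Final dispersion estimates only
head(dispersions(dds))
```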
The log normal prior on the dispersion parameter has been proposed by Wu, et al. (2012) and is also implemented in the DSS package.
In DESeq2, the dispersion estimation procedure described above replaces the various dispersion estimation methods of the previous version of the DESeq package.
estimateDispersions checks for the case of an analysis
with as many samples as the number of coefficients to fit,
and will temporarily substitute a design formula ~ 1 for the
purposes of dispersion estimation. This treats the samples as
replicates. As mentioned in the DESeq paper:
"While one may not want to draw strong conclusions from such an analysis,
it may still be useful for exploration and hypothesis generation."
The lower-level functions called by estimateDispersions are:
estimateDispersionsGeneEst, estimateDispersionsFit, and estimateDispersionsMAP.
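For readers who want to run the three stages separately, a sketch along these lines should work with default arguments. It assumes the DESeq2 package is installed and uses the simulated example dataset; the seed is arbitrary.

```r
library(DESeq2)

set.seed(1)  # arbitrary seed for the simulated example data
dds <- makeExampleDESeqDataSet()
dds <- estimateSizeFactors(dds)

# Stage 1: gene-wise dispersion estimates
dds <- estimateDispersionsGeneEst(dds)
# Stage 2: trend line fit to the gene-wise estimates
dds <- estimateDispersionsFit(dds)
# Stage 3: maximum a posteriori estimates using the fitted prior
dds <- estimateDispersionsMAP(dds)

head(dispersions(dds))
```

Calling the stages one at a time can be useful when the intermediate results need to be examined or a custom step inserted between them; for routine use, the single call to estimateDispersions is simpler.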
dds <- makeExampleDESeqDataSet()
dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds)
head(dispersions(dds))