mglmOneGroup(y, dispersion=0, offset=0, weights=NULL, maxit=50, tol=1e-10,
verbose=FALSE, coef.start=NULL)
mglmOneWay(y, design=NULL, dispersion=0, offset=0, weights=NULL, maxit=50,
tol=1e-10, coef.start=NULL)
mglmLevenberg(y, design, dispersion=0, offset=0, weights=NULL,
coef.start=NULL, start.method="null", maxit=200, tol=1e-06)
designAsFactor(design)
y
.y
coef.start: the number of rows should agree with y and the number of columns should agree with design.
start.method: used when coef.start=NULL. Possible values are "null" to start from the null model of equal expression levels or "y" to use the data as starting value for the mean.
tol: for mglmOneGroup, convergence is judged successful when the step size falls below tol in absolute size.
verbose: if TRUE, warnings will be issued when maxit iterations are exceeded before convergence is achieved.

mglmOneGroup produces a vector of length equal to the number of genes (number of rows of y) providing the single coefficient from the GLM fit for each gene. This can be interpreted as a measure of the 'average expression' level of the gene.
mglmLevenberg produces a list of fit components, including coefficients, a matrix holding one row of estimated coefficients per gene.
deviances.function returns a function to calculate the deviance as appropriate for the given values of the dispersion.
designAsFactor returns a factor of length equal to nrow(design).
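To illustrate what a factor of length nrow(design) can look like, here is a minimal base R sketch. It assumes the factor simply labels each distinct row of the design matrix with its own level. The helper name rows_as_factor is hypothetical and this is not the edgeR implementation of designAsFactor.

```r
# Hypothetical helper (not edgeR code): give every distinct row of a
# design matrix its own factor level, numbered in order of appearance.
rows_as_factor <- function(design) {
  design <- as.matrix(design)
  # Build one text key per row so identical rows share a key
  key <- apply(design, 1, paste, collapse = "_")
  factor(key, levels = unique(key), labels = seq_along(unique(key)))
}

f1 <- factor(c(1, 1, 2, 2))
design <- model.matrix(~f1)
g <- rows_as_factor(design)
length(g) == nrow(design)
table(g)
```

Libraries that share a row of the design matrix end up in the same level, which is the situation in which a one-way layout applies.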
mglmOneGroup, mglmOneWay and mglmLevenberg all fit negative binomial generalized linear models, with the same design matrix but possibly different dispersions, offsets and weights, to a series of response vectors.
The functions are all low-level functions in that they operate on atomic objects such as matrices.
They are used as work-horses by higher-level functions in the edgeR package, especially by glmFit.

mglmOneGroup fits the null model, with intercept term only, to each response vector.
In other words, it treats the libraries as belonging to one group.
It implements Fisher scoring with a score-statistic stopping criterion for each gene.
Excellent starting values are available for the null model, so this function seldom has any problems with convergence.
It is used by other edgeR functions to compute the overall abundance for each gene.
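The intercept-only fit described above can be sketched for a single gene in base R. This is an illustrative sketch, not the edgeR code: the function name nb_one_group is hypothetical, the Poisson estimate is assumed as the starting value, and the loop stops on the step size rather than on a score statistic.

```r
# Hypothetical single-gene sketch of Fisher scoring for the null model
# log(mu_i) = beta + offset_i with NB variance mu + dispersion * mu^2.
nb_one_group <- function(y, offset, dispersion, maxit = 50, tol = 1e-10) {
  if (sum(y) == 0) return(-Inf)            # all-zero gene: no finite estimate
  # Assumed starting value: the Poisson maximum likelihood estimate
  beta <- log(sum(y) / sum(exp(offset)))
  for (i in seq_len(maxit)) {
    mu <- exp(beta + offset)
    denom <- 1 + dispersion * mu
    score <- sum((y - mu) / denom)         # derivative of the log-likelihood
    info <- sum(mu / denom)                # expected (Fisher) information
    step <- score / info
    beta <- beta + step
    if (abs(step) < tol) break             # step-size stopping rule (assumption)
  }
  beta
}

y1 <- c(5, 12, 8, 20)
lib.size <- c(1000, 2000, 1500, 3000)
nb_one_group(y1, offset = log(lib.size), dispersion = 0.1)
```

Because the starting value is already close to the solution, the loop typically needs only a handful of iterations, which is consistent with the remark that the null model rarely has convergence problems.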
mglmLevenberg fits an arbitrary log-linear model to each response vector.
It implements a Levenberg-Marquardt modification of the glm scoring algorithm to prevent divergence.
The main computation is implemented in C++.
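The idea of damping the scoring step to prevent divergence can be sketched for one gene in base R. Everything here is an assumption made for illustration: the names nb_deviance and lm_nb_fit, the least-squares starting values, the damping schedule, and the stopping rule on the deviance improvement. It is not a transcription of the C++ code in edgeR, and it assumes a positive dispersion.

```r
# NB deviance for one gene (dispersion phi > 0 assumed)
nb_deviance <- function(y, mu, phi) {
  ylogy <- ifelse(y > 0, y * log(y / mu), 0)
  2 * sum(ylogy - (y + 1 / phi) * log((1 + phi * y) / (1 + phi * mu)))
}

# Hypothetical Levenberg-Marquardt style scoring for one response vector
lm_nb_fit <- function(y, X, phi, offset = 0, maxit = 200, tol = 1e-6) {
  beta <- qr.solve(X, log(y + 0.5) - offset)   # least-squares start (assumption)
  mu <- exp(drop(X %*% beta) + offset)
  dev <- nb_deviance(y, mu, phi)
  lambda <- 0
  converged <- FALSE
  for (iter in seq_len(maxit)) {
    denom <- 1 + phi * mu
    score <- crossprod(X, (y - mu) / denom)
    info <- crossprod(X, X * (mu / denom))
    accepted <- FALSE
    # Increase the damping until the step no longer raises the deviance
    while (!accepted && lambda < 1e12) {
      damped <- info + lambda * diag(diag(info), ncol(X))
      beta_try <- beta + drop(solve(damped, score))
      mu_try <- exp(drop(X %*% beta_try) + offset)
      dev_try <- nb_deviance(y, mu_try, phi)
      accepted <- is.finite(dev_try) && dev_try <= dev
      if (!accepted) lambda <- max(10 * lambda, 1e-6)
    }
    if (!accepted) break                       # give up: no acceptable step
    improvement <- dev - dev_try
    beta <- beta_try; mu <- mu_try; dev <- dev_try
    lambda <- lambda / 10                      # relax damping after a success
    if (improvement < tol) { converged <- TRUE; break }
  }
  list(coefficients = beta, deviance = dev, converged = converged)
}

yg <- c(4, 6, 8, 10, 20, 30)
Xg <- cbind(1, c(0, 0, 0, 1, 1, 1))
lm_nb_fit(yg, Xg, phi = 0.1)$coefficients
```

With lambda equal to zero the update is ordinary Fisher scoring; the damping term only takes effect when an undamped step would increase the deviance.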
All these functions treat the dispersion parameter of the negative binomial distribution as a known input.
deviances.function chooses the appropriate deviance function to use given a scalar or vector of dispersion parameters.
If the dispersion values are zero, then the Poisson deviance function is returned; if the dispersion values are positive, then the negative binomial deviance function is returned.
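The choice between the two deviances can be sketched in base R as a function that returns another function. The name pick_deviance is hypothetical, the returned functions handle a single response vector, and the sketch assumes the dispersions are either all zero or all positive.

```r
# Hypothetical sketch: return a Poisson or NB deviance function
# depending on the dispersion supplied.
pick_deviance <- function(dispersion) {
  # y * log(y / mu), taken to be 0 when y is 0
  ylogy <- function(y, mu) ifelse(y > 0, y * log(y / mu), 0)
  if (all(dispersion == 0)) {
    # Poisson deviance
    function(y, mu) 2 * sum(ylogy(y, mu) - (y - mu))
  } else {
    stopifnot(all(dispersion > 0))   # mixed zero/positive case not handled here
    # Negative binomial deviance with known dispersion
    function(y, mu) {
      2 * sum(ylogy(y, mu) -
        (y + 1 / dispersion) * log((1 + dispersion * y) / (1 + dispersion * mu)))
    }
  }
}

dev_pois <- pick_deviance(0)
dev_nb <- pick_deviance(0.1)
dev_pois(c(0, 3, 10), c(1, 4, 8))
dev_nb(c(0, 3, 10), c(1, 4, 8))
```

As the dispersion shrinks towards zero the negative binomial deviance approaches the Poisson deviance, so the two branches agree in the limit.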
glmFit, for more object-orientated GLM modelling for DGE data.
y <- matrix(rnbinom(1000,mu=10,size=2),ncol=4)
lib.size <- colSums(y)
dispersion <- 0.1
abundance <- mglmOneGroup(y, dispersion=dispersion, offset=log(lib.size))
AveLogCPM <- log1p(1e6*exp(abundance))/log(2)
summary(AveLogCPM)
## Similar to above, except that aveLogCPM adds a prior count:
AveLogCPM <- aveLogCPM(y, dispersion=dispersion, offset=log(lib.size))
## Fit the NB GLM to the counts with a given design matrix
f1 <- factor(c(1,1,2,2))
f2 <- factor(c(1,2,1,2))
x <- model.matrix(~f1+f2)
fit <- mglmLevenberg(y, x, dispersion=dispersion, offset=log(lib.size))
head(fit$coefficients)