rma.glmm(ai, bi, ci, di, n1i, n2i, x1i, x2i, t1i, t2i, xi, mi, ti, ni,
mods, measure, intercept=TRUE, data, slab, subset,
add=1/2, to="only0", drop00=TRUE, vtype="LS",
model="UM.FS", method="ML", tdist=FALSE,
level=95, digits=4, btt, nAGQ=7, verbose=FALSE, control)
Arguments

ai: vector with the $2 \times 2$ table frequencies (upper left cell).
bi: vector with the $2 \times 2$ table frequencies (upper right cell).
ci: vector with the $2 \times 2$ table frequencies (lower left cell).
di: vector with the $2 \times 2$ table frequencies (lower right cell).
n1i: vector with the group sizes or row totals (first group).
n2i: vector with the group sizes or row totals (second group).
x1i: vector with the number of events (first group).
x2i: vector with the number of events (second group).
t1i: vector with the total person-times (first group).
t2i: vector with the total person-times (second group).
xi: vector with the frequencies of the event of interest.
mi: vector with the frequencies of the complement of the event of interest.
ti: vector with the total person-times.
ni: vector with the sample/group sizes.

See below and the documentation of the escalc function for more details on these data arguments.

mods: optional argument to include one or more moderators in the model. A single moderator can be given as a vector of length $k$. Multiple moderators are specified by giving a matrix with $k$ rows and as many columns as there are moderator variables. Alternatively, a model formula can be used. See 'Details'.
measure: character string indicating the outcome measure to use for the meta-analysis: the odds ratio ("OR"), the incidence rate ratio ("IRR"), the logit transformed proportion ("PLO"), or the log transformed incidence rate ("IRLN"). See below and the documentation of the escalc function for more details.
intercept: logical, indicating whether an intercept term should be added to the model (the default is TRUE).
data: optional data frame containing the data supplied to the function.
slab: optional vector with labels for the $k$ studies.
subset: optional (logical or numeric) vector indicating the subset of studies that should be used for the analysis.
add: non-negative number indicating the amount to add to zero cells, counts, or frequencies when calculating the observed outcomes of the individual studies. See below and the documentation of the escalc function for more details.
to: character string indicating when the values under add should be added (either "only0", "all", "if0all", or "none"). See below and the documentation of the escalc function for more details.
drop00: logical, indicating whether studies with no cases/events (or only cases) in both groups should be dropped when calculating the observed outcomes. See below and the documentation of the escalc function for more details.
vtype: character string indicating the type of sampling variances to calculate for the observed outcomes. See the documentation of the escalc function for more details.
model: character string specifying the general model type to use for the analysis (either "UM.FS" (the default), "UM.RS", "CM.EL", or "CM.AL"). See 'Details'.
method: character string specifying whether a fixed- or a random/mixed-effects model should be fitted. A fixed-effects model is fitted by setting method="FE". Random/mixed-effects models are fitted by setting method="ML" (the default).
tdist: logical, specifying whether test statistics and confidence intervals should be based on the normal distribution (when FALSE, the default) or the t-distribution (when TRUE). See 'Details'.
level: numerical value between 0 and 100 specifying the confidence interval level (the default is 95).
digits: integer specifying the number of decimal places to which the printed results should be rounded (the default is 4).
btt: optional vector of indices specifying which coefficients to include in the omnibus test of moderators. See 'Details'.
nAGQ: positive integer specifying the number of points per axis for evaluating the adaptive Gauss-Hermite approximation to the log-likelihood (the default is 7). See 'Note'.
verbose: logical indicating whether output should be generated on the progress of the model fitting (the default is FALSE). Can also be an integer. Values > 1 generate more verbose output. See 'Note'.
control: optional list of control values for the estimation algorithms. If unspecified, default values are used. See 'Note'.

Details

The rma.glmm function fits generalized linear mixed-effects models when analyzing odds ratios, incidence rate ratios, proportions, or rates. Four different outcome measures can be specified via the measure argument:

measure="OR" for odds ratios (analyzed in log units),
measure="IRR" for incidence rate ratios (analyzed in log units),
measure="PLO" for logit transformed proportions (i.e., log odds), and
measure="IRLN" for log transformed incidence rates.

The documentation of the escalc function describes the data/arguments that should be specified/used for these measures.
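The basic usage can be sketched as follows (this assumes the metafor package is installed; dat.bcg is a dataset shipped with metafor whose $2 \times 2$ table counts are stored in tpos, tneg, cpos, cneg):

```r
# Sketch: random-effects model for log odds ratios fitted as a GLMM.
# Assumes the metafor package; dat.bcg ships with it (13 BCG vaccine trials).
library(metafor)

res <- rma.glmm(measure="OR", ai=tpos, bi=tneg, ci=cpos, di=cneg,
                data=dat.bcg, model="UM.FS", method="ML")
print(res)
```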
Specifying the Model
A variety of model types are available when analyzing $2 \times 2$ table data (i.e., when measure="OR") or two-group event count data (i.e., when measure="IRR"):

model="UM.FS" for an unconditional generalized linear mixed-effects model with fixed study effects,
model="UM.RS" for an unconditional generalized linear mixed-effects model with random study effects,
model="CM.AL" for a conditional generalized linear mixed-effects model (approximate likelihood), and
model="CM.EL" for a conditional generalized linear mixed-effects model (exact likelihood).

For measure="OR", models "UM.FS" and "UM.RS" are essentially (mixed-effects) logistic regression models, while for measure="IRR", these models are (mixed-effects) Poisson regression models. A choice must be made on how to model study level variability (i.e., differences in outcomes across studies irrespective of group membership). One can choose between using fixed study effects (which means that $k$ dummy variables are added to the model) or random study effects (which means that random effects corresponding to the study factor are added to the model).
The conditional model (model="CM.EL") avoids having to model study level variability by conditioning on the total numbers of cases/events in each study. For measure="OR", this leads to a non-central hypergeometric distribution for the data within each study, and the corresponding model is then a (mixed-effects) conditional logistic model. Fitting this model can be difficult and computationally expensive. When the number of cases in each study is small relative to the group sizes, one can approximate the exact likelihood by a binomial distribution, which leads to a regular (mixed-effects) logistic regression model (model="CM.AL"). For measure="IRR", the conditional model leads directly to a binomial distribution for the data within each study, and the resulting model is again a (mixed-effects) logistic regression model (no approximate likelihood model is needed here).
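The model types above can be swapped via the model argument while keeping the data specification fixed. A sketch (assuming metafor and its dat.nielweise2007 dataset, which provides event counts ai, ci and group sizes n1i, n2i):

```r
# Sketch: the same log odds ratio analysis under two different model types.
library(metafor)

res.fs <- rma.glmm(measure="OR", ai=ai, n1i=n1i, ci=ci, n2i=n2i,
                   data=dat.nielweise2007, model="UM.FS")
res.al <- rma.glmm(measure="OR", ai=ai, n1i=n1i, ci=ci, n2i=n2i,
                   data=dat.nielweise2007, model="CM.AL")
```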
When analyzing proportions (i.e., measure="PLO") or incidence rates (i.e., measure="IRLN") of individual groups, the model type is always a (mixed-effects) logistic or Poisson regression model, respectively (i.e., the model argument is not relevant here).
Aside from choosing the general model type, one has to decide whether to fit a fixed- or random-effects model to the data. A fixed-effects model is fitted by setting method="FE". A random-effects model is fitted by setting method="ML" (the default). Note that random/mixed-effects models with dichotomous data are often referred to as 'binomial-normal' models in the meta-analytic literature.

One or more moderators can be included in these models via the mods argument. A single moderator can be given as a (row or column) vector of length $k$ specifying the values of the moderator. Multiple moderators are specified by giving an appropriate model matrix (i.e., $\mathbf{X}$) with $k$ rows and as many columns as there are moderator variables (e.g., mods = cbind(mod1, mod2, mod3), where mod1, mod2, and mod3 correspond to the names of the variables for three moderator variables). The intercept is added to the model matrix by default unless intercept=FALSE.
Alternatively, one can use the standard formula syntax to specify the model. In this case, the mods argument should be set equal to a one-sided formula of the form mods = ~ model (e.g., mods = ~ mod1 + mod2 + mod3). Interactions, polynomial terms, and factors can be easily added to the model in this manner. When specifying a model formula via the mods argument, the intercept argument is ignored. Instead, the inclusion/exclusion of the intercept term is controlled by the specified formula (e.g., mods = ~ mod1 + mod2 + mod3 - 1 would lead to the removal of the intercept term). With moderators, a fixed-effects with moderators model is fitted by setting method="FE", while a mixed-effects model is fitted by setting method="ML".
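A sketch of the formula interface (assuming metafor and dat.bcg; ablat and year are variables in that dataset, used here as illustrative moderators):

```r
# Sketch: mixed-effects model with two moderators specified via a formula.
library(metafor)

res <- rma.glmm(measure="OR", ai=tpos, bi=tneg, ci=cpos, di=cneg,
                data=dat.bcg, mods = ~ ablat + year, method="ML")
```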
Fixed-, Saturated-, and Random/Mixed-Effects Models
When fitting a particular model, up to three different models are actually fitted within the function:

the fixed-effects model (i.e., where $\tau^2$ is set to 0),
the saturated model (i.e., the model with a deviance of 0), and
the random/mixed-effects model (i.e., where $\tau^2$ is estimated; only if method="ML").

Even when method="ML", the fixed-effects and saturated models are fitted, as they are used to compute the test statistics for the Wald-type and likelihood ratio tests for (residual) heterogeneity (see below).
Omnibus Test of Parameters
For models including moderators, an omnibus test of all the model coefficients is conducted that excludes the intercept (the first coefficient) if it is included in the model. If no intercept is included in the model, then the omnibus test includes all of the coefficients in the model, including the first. Alternatively, one can manually specify the indices of the coefficients to test via the btt argument. For example, with btt=c(3,4), only the third and fourth coefficients from the model would be included in the test (if an intercept is included in the model, then it corresponds to the first coefficient in the model).
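For example (a sketch assuming metafor and dat.bcg; with an intercept, ablat, and year in the model, the year coefficient is the third one):

```r
# Sketch: restrict the omnibus test to the 3rd coefficient (here: year).
library(metafor)

res <- rma.glmm(measure="OR", ai=tpos, bi=tneg, ci=cpos, di=cneg,
                data=dat.bcg, mods = ~ ablat + year, btt=3)
```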
Categorical Moderators
Categorical moderator variables can be included in the model via the mods argument in the same way that appropriately (dummy) coded categorical independent variables can be included in linear models. One can either do the dummy coding manually or use a model formula together with the factor function to let R handle the coding automatically.
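A sketch of the factor approach (assuming metafor and dat.bcg, whose alloc variable codes the method of treatment allocation):

```r
# Sketch: let factor() create the dummy variables for a categorical moderator.
library(metafor)

res <- rma.glmm(measure="OR", ai=tpos, bi=tneg, ci=cpos, di=cneg,
                data=dat.bcg, mods = ~ factor(alloc))
```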
Tests and Confidence Intervals
By default, the test statistics of the individual coefficients in the model (and the corresponding confidence intervals) are based on the normal distribution, while the omnibus test is based on a chi-square distribution with $m$ degrees of freedom ($m$ being the number of coefficients tested). As an alternative, one can set tdist=TRUE, which slightly mimics the Knapp and Hartung (2003) method by using a t-distribution with $k-p$ degrees of freedom for tests of individual coefficients and confidence intervals, and an F-distribution with $m$ and $k-p$ degrees of freedom ($p$ being the total number of model coefficients including the intercept if it is present) for the omnibus test statistic.
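For instance (a sketch assuming metafor and dat.bcg):

```r
# Sketch: t- and F-based tests instead of the default normal/chi-square tests.
library(metafor)

res <- rma.glmm(measure="OR", ai=tpos, bi=tneg, ci=cpos, di=cneg,
                data=dat.bcg, mods = ~ ablat, tdist=TRUE)
```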
Tests for (Residual) Heterogeneity
Two different tests for (residual) heterogeneity are automatically carried out by the function. The first is a Wald-type test, which tests the coefficients corresponding to the dummy variables added in the saturated model for significance. The second is a likelihood ratio test, which tests the same set of coefficients, but does so by computing $-2$ times the difference in the log-likelihood of the fixed-effects and the saturated model. These two tests are not identical for the types of models fitted by the rma.glmm function and may even lead to conflicting conclusions.
Individual Outcomes
The various models do not require the calculation of the individual outcome values and directly make use of the table/event counts. Zero cells/events are not a problem (except in extreme cases, such as when one of the two outcomes never occurs or when there are no events in any of the studies). Therefore, it is unnecessary to add some constant to the cell counts (or the number of events) when there are zero cells/events. However, for plotting and various other functions, it is necessary to calculate the individual outcome values for the $k$ studies. Here, zero cells/events can be problematic, so adding a constant value to the cell counts (or the number of events) ensures that all $k$ values can be calculated. The add and to arguments are used to specify what value should be added to the cell frequencies (or the number of events) and under what circumstances when calculating the individual outcome values. The documentation of the escalc function explains how the add and to arguments work. Note that drop00 is set to TRUE by default, since studies where ai=ci=0 or bi=di=0, or studies where x1i=x2i=0, are uninformative about the size of the effect.

Value

An object of class c("rma.glmm","rma"). The object is a list containing (among others) the following components:
tau2: estimated amount of (residual) heterogeneity. Always 0 when method="FE".

sigma2: estimated amount of study level variability (only for model="UM.RS").

The results of the fitted model are formatted and printed with the print.rma.glmm function. If fit statistics should also be given, use summary.rma (or use the fitstats.rma function to extract them).

Note

Fitting the various types of models requires several different iterative algorithms:

For model="UM.FS" and model="CM.AL", iteratively reweighted least squares (IWLS) as implemented in the glm function is used for fitting the fixed-effects and the saturated models. For method="ML", adaptive Gauss-Hermite quadrature as implemented in the glmer function is used. The same applies when model="CM.EL" is used in combination with measure="IRR", or when measure="PLO" or measure="IRLN" (regardless of the model type).

For model="UM.RS", adaptive Gauss-Hermite quadrature as implemented in the glmer function is used to fit all of the models.

For model="CM.EL" and measure="OR", the quasi-Newton method ("BFGS") as implemented in the optim function is used by default for fitting the fixed-effects and the saturated models. For method="ML", the same algorithm is used, together with adaptive quadrature as implemented in the integrate function (for the integration over the density of the non-central hypergeometric distribution).

When model="CM.EL" and measure="OR", model="CM.AL" is actually used first to obtain starting values for optim, so either 4 (if method="FE") or 6 (if method="ML") models need to be fitted in total.
Various control parameters can be adjusted via the control argument:

optimizer is set by default to "optim", but can be set to "nlminb" or one of the optimizers from the minqa package (i.e., "bobyqa", "newuoa", or "uobyqa"),
optmethod is used to set the method argument for optim (default is "BFGS"),
optCtrl is a list of named arguments to be passed on to the control argument of the chosen optimizer,
glmCtrl is a list of named arguments to be passed on to the control argument of the glm function,
glmerCtrl is a list of named arguments to be passed on to the control argument of the glmer function, and
intCtrl is a list of named arguments (i.e., rel.tol and subdivisions) to be passed on to the integrate function.

For glmer, the nAGQ argument is used to specify the number of quadrature points. The default value is 7, which should provide sufficient accuracy in the evaluation of the log-likelihood in most cases, but at the expense of speed. Setting this to 1 corresponds to the Laplacian approximation (which is faster, but less accurate).
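A sketch of adjusting these settings (assuming metafor and dat.nielweise2007; iter.max is a control value understood by nlminb):

```r
# Sketch: switch the optimizer for the CM.EL model and pass settings to it.
library(metafor)

res <- rma.glmm(measure="OR", ai=ai, n1i=n1i, ci=ci, n2i=n2i,
                data=dat.nielweise2007, model="CM.EL",
                control=list(optimizer="nlminb", optCtrl=list(iter.max=1000)))
```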
Information on the progress of the various algorithms is obtained by setting verbose=TRUE or with control=list(verbose=TRUE). Since fitting the various models can be computationally expensive, this option is quite useful to determine how the model fitting is progressing.
For model="CM.EL" and measure="OR", optimization involves repeated calculation of the density of the non-central hypergeometric distribution. When method="ML", this also requires integration over the same density. This is currently implemented in a rather brute-force manner and may not be numerically stable, especially when models with moderators are fitted. Stability can be improved by scaling the moderators in a similar manner (e.g., do not use one moderator that is coded 0 and 1 while another takes on values in the 1000s). For models with an intercept and moderators, the function actually rescales (non-dummy) variables to z-scores during the model fitting (results are given after back-scaling, so this should be transparent to the user). For models without an intercept, this is not done, so sensitivity analyses are highly recommended here (to ensure that the results do not depend on the scaling of the moderators).

See Also

rma.uni, rma.mh, rma.peto, and rma.mv for other model fitting functions.
dat.nielweise2007, dat.nielweise2008, and dat.collins1985a for further examples of the use of the rma.glmm function.