metabin: Meta-analysis of binary outcome data

Description

Calculation of common effect and random effects estimates (risk ratio, odds ratio, risk difference, arcsine difference, or diagnostic odds ratio) for meta-analyses with binary outcome data. Mantel-Haenszel, inverse variance, Peto method, generalised linear mixed model (GLMM), and sample size method are available for pooling. For GLMMs, the rma.glmm function from R package metafor (Viechtbauer, 2010) is called internally.

Usage

metabin(
  event.e,
  n.e,
  event.c,
  n.c,
  studlab,
  data = NULL,
  subset = NULL,
  exclude = NULL,
  cluster = NULL,
  method = ifelse(tau.common, "Inverse", gs("method")),
  sm = ifelse(!is.na(charmatch(tolower(method), c("peto", "glmm", "ssw"), nomatch = NA)),
    "OR", gs("smbin")),
  incr = gs("incr"),
  method.incr = gs("method.incr"),
  allstudies = gs("allstudies"),
  level = gs("level"),
  MH.exact = gs("MH.exact"),
  RR.Cochrane = gs("RR.Cochrane"),
  Q.Cochrane = gs("Q.Cochrane") & method == "MH" & method.tau == "DL",
  model.glmm = gs("model.glmm"),
  common = gs("common"),
  random = gs("random") | !is.null(tau.preset),
  overall = common | random,
  overall.hetstat = common | random,
  prediction = gs("prediction") | !missing(method.predict),
  method.tau = ifelse(!is.na(charmatch(tolower(method), "glmm", nomatch = NA)), "ML",
    gs("method.tau")),
  method.tau.ci = gs("method.tau.ci"),
  tau.preset = NULL,
  TE.tau = NULL,
  tau.common = gs("tau.common"),
  level.ma = gs("level.ma"),
  method.random.ci = gs("method.random.ci"),
  adhoc.hakn.ci = gs("adhoc.hakn.ci"),
  level.predict = gs("level.predict"),
  method.predict = gs("method.predict"),
  adhoc.hakn.pi = gs("adhoc.hakn.pi"),
  seed.predict = NULL,
  method.bias = ifelse(sm == "OR", "Harbord", ifelse(sm == "DOR", "Deeks",
    gs("method.bias"))),
  backtransf = gs("backtransf"),
  pscale = 1,
  text.common = gs("text.common"),
  text.random = gs("text.random"),
  text.predict = gs("text.predict"),
  text.w.common = gs("text.w.common"),
  text.w.random = gs("text.w.random"),
  title = gs("title"),
  complab = gs("complab"),
  outclab = "",
  label.e = gs("label.e"),
  label.c = gs("label.c"),
  label.left = gs("label.left"),
  label.right = gs("label.right"),
  subgroup,
  subgroup.name = NULL,
  print.subgroup.name = gs("print.subgroup.name"),
  sep.subgroup = gs("sep.subgroup"),
  test.subgroup = gs("test.subgroup"),
  prediction.subgroup = gs("prediction.subgroup"),
  seed.predict.subgroup = NULL,
  byvar,
  hakn,
  adhoc.hakn,
  print.CMH = gs("print.CMH"),
  keepdata = gs("keepdata"),
  warn = gs("warn"),
  warn.deprecated = gs("warn.deprecated"),
  control = NULL,
  ...
)

Value

An object of class c("metabin", "meta") with corresponding generic functions (see meta-object).

Arguments

event.e: Number of events in experimental group or true positives in diagnostic study.
n.e: Number of observations in experimental group or number of ill participants in diagnostic study.
event.c: Number of events in control group or false positives in diagnostic study.
n.c: Number of observations in control group or number of healthy participants in diagnostic study.
studlab: An optional vector with study labels.
data: An optional data frame containing the study information, i.e., event.e, n.e, event.c, and n.c.
subset: An optional vector specifying a subset of studies to be used.
exclude: An optional vector specifying studies to exclude from the meta-analysis while still showing them in printouts and forest plots.
cluster: An optional vector specifying which estimates come from the same cluster resulting in the use of a three-level meta-analysis model.
method: A character string indicating which method is to be used for pooling of studies. One of "Inverse", "MH", "Peto", "GLMM", or "SSW", can be abbreviated.
sm: A character string indicating which summary measure ("RR", "OR", "RD", "ASD", "DOR", or "VE") is to be used for pooling of studies, see Details.
incr: Either a numerical value which is added to cell frequencies for studies with a zero cell count, or the character string "TACC", which stands for treatment arm continuity correction; see Details.
method.incr: A character string indicating which continuity correction method should be used ("only0", "if0all", or "all"), see Details.
allstudies: A logical indicating if studies with zero or all events in both groups are to be included in the meta-analysis (applies only if sm is equal to "RR", "OR", or "DOR").
level: The level used to calculate confidence intervals for individual studies.
MH.exact: A logical indicating if incr is not to be added to cell frequencies for studies with a zero cell count to calculate the pooled estimate based on the Mantel-Haenszel method.
RR.Cochrane: A logical indicating if 2*incr instead of 1*incr is to be added to n.e and n.c in the calculation of the risk ratio (i.e., sm="RR") for studies with a zero cell. This is used in RevMan 5, the program for preparing and maintaining Cochrane reviews.
Q.Cochrane: A logical indicating if the Mantel-Haenszel estimate is used in the calculation of the heterogeneity statistic Q which is implemented in RevMan 5, the program for preparing and maintaining Cochrane reviews.
model.glmm: A character string indicating which GLMM should be used. One of "UM.FS", "UM.RS", "CM.EL", and "CM.AL", see Details.
common: A logical indicating whether a common effect meta-analysis should be conducted.
random: A logical indicating whether a random effects meta-analysis should be conducted.
overall: A logical indicating whether overall summaries should be reported. This argument is useful in a meta-analysis with subgroups if overall results should not be reported.
overall.hetstat: A logical value indicating whether to print heterogeneity measures for overall treatment comparisons. This argument is useful in a meta-analysis with subgroups if heterogeneity statistics should only be printed on subgroup level.
prediction: A logical indicating whether a prediction interval should be printed.
method.tau: A character string indicating which method is used to estimate the between-study variance \(\tau^2\) and its square root \(\tau\) (see meta-package).
method.tau.ci: A character string indicating which method is used to estimate the confidence interval of \(\tau^2\) and \(\tau\) (see meta-package).
tau.preset: Prespecified value for the square root of the between-study variance \(\tau^2\).
TE.tau: Overall treatment effect used to estimate the between-study variance \(\tau^2\).
tau.common: A logical indicating whether \(\tau^2\) should be the same across subgroups.
level.ma: The level used to calculate confidence intervals for meta-analysis estimates.
method.random.ci: A character string indicating which method is used to calculate confidence interval and test statistic for random effects estimate (see meta-package).
adhoc.hakn.ci: A character string indicating whether an ad hoc variance correction should be applied in the case of an arbitrarily small Hartung-Knapp variance estimate (see meta-package).
level.predict: The level used to calculate prediction interval for a new study.
method.predict: A character string indicating which method is used to calculate a prediction interval (see meta-package).
adhoc.hakn.pi: A character string indicating whether an ad hoc variance correction should be applied for prediction interval (see meta-package).
seed.predict: A numeric value used as seed to calculate bootstrap prediction interval (see meta-package).
method.bias: A character string indicating which test for funnel plot asymmetry is to be used. Either "Begg", "Egger", "Thompson", "Schwarzer", "Harbord", "Peters", or "Deeks", can be abbreviated. See function metabias.
backtransf: A logical indicating whether results for odds ratio (sm="OR"), risk ratio (sm="RR"), or diagnostic odds ratio (sm="DOR") should be back transformed in printouts and plots. If TRUE (default), results will be presented as odds ratios and risk ratios; otherwise log odds ratios and log risk ratios will be shown.
pscale: A numeric defining a scaling factor for printing of risk differences.
text.common: A character string used in printouts and forest plot to label the pooled common effect estimate.
text.random: A character string used in printouts and forest plot to label the pooled random effects estimate.
text.predict: A character string used in printouts and forest plot to label the prediction interval.
text.w.common: A character string used to label weights of common effect model.
text.w.random: A character string used to label weights of random effects model.
title: Title of meta-analysis / systematic review.
complab: Comparison label.
outclab: Outcome label.
label.e: Label for experimental group.
label.c: Label for control group.
label.left: Graph label on left side of forest plot.
label.right: Graph label on right side of forest plot.
subgroup: An optional vector to conduct a meta-analysis with subgroups.
subgroup.name: A character string with a name for the subgroup variable.
print.subgroup.name: A logical indicating whether the name of the subgroup variable should be printed in front of the group labels.
sep.subgroup: A character string defining the separator between name of subgroup variable and subgroup label.
test.subgroup: A logical value indicating whether to print results of test for subgroup differences.
prediction.subgroup: A logical indicating whether prediction intervals should be printed for subgroups.
seed.predict.subgroup: A numeric vector providing seeds to calculate bootstrap prediction intervals within subgroups. Must be of same length as the number of subgroups.
byvar: Deprecated argument (replaced by 'subgroup').
hakn: Deprecated argument (replaced by 'method.random.ci').
adhoc.hakn: Deprecated argument (replaced by 'adhoc.hakn.ci').
print.CMH: A logical indicating whether result of the Cochran-Mantel-Haenszel test for overall effect should be printed.
keepdata: A logical indicating whether original data (set) should be kept in meta object.
warn: A logical indicating whether warnings should be printed (e.g., if incr is added to studies with zero cell frequencies).
warn.deprecated: A logical indicating whether warnings should be printed if deprecated arguments are used.
control: An optional list to control the iterative process to estimate the between-study variance \(\tau^2\). This argument is passed on to rma.uni or rma.glmm, respectively.
...: Additional arguments passed on to rma.glmm function and to catch deprecated arguments.

Author

Guido Schwarzer guido.schwarzer@uniklinik-freiburg.de

Details

Calculation of common and random effects estimates for meta-analyses with binary outcome data.

The following measures of treatment effect are available (Rücker et al., 2009):

Risk ratio (sm = "RR")
Odds ratio (sm = "OR")
Risk difference (sm = "RD")
Arcsine difference (sm = "ASD")
Diagnostic Odds ratio (sm = "DOR")
Vaccine efficacy or vaccine effectiveness (sm = "VE")

Note that, mathematically, odds ratios and diagnostic odds ratios are identical; only the labels in printouts and figures differ. Furthermore, the log risk ratio (logRR) and log vaccine ratio (logVR) are mathematically identical; however, back-transformed results differ as vaccine efficacy or effectiveness is defined as VE = 100 * (1 - RR).
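As a hedged illustration, the measures listed above can be computed by hand for a single two-by-two table. The sketch below uses base R only and the counts from the first call in the Examples section; the variable names are chosen for illustration, and the arcsine difference is assumed to be the difference of the arcsine-transformed square roots of the two event proportions (Rücker et al., 2009).

```r
# Counts for one study (same numbers as the first call in the Examples)
event.e <- 10; n.e <- 20   # experimental group
event.c <- 15; n.c <- 20   # control group

p.e <- event.e / n.e       # event proportion, experimental
p.c <- event.c / n.c       # event proportion, control

RR  <- p.e / p.c                                   # risk ratio
OR  <- (event.e * (n.c - event.c)) /
       (event.c * (n.e - event.e))                 # odds ratio (same value as DOR)
RD  <- p.e - p.c                                   # risk difference
ASD <- asin(sqrt(p.e)) - asin(sqrt(p.c))           # arcsine difference (assumed form)
VE  <- 100 * (1 - RR)                              # vaccine efficacy in percent

round(c(RR = RR, OR = OR, RD = RD, ASD = ASD, VE = VE), 4)
```

Only the untransformed point estimates are shown; standard errors and continuity corrections are left out of this sketch.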

A three-level random effects meta-analysis model (Van den Noortgate et al., 2013) is utilized if argument cluster is used and at least one cluster provides more than one estimate. Internally, rma.mv is called to conduct the analysis and weights.rma.mv with argument type = "rowsum" is used to calculate random effects weights.

Default settings are utilised for several arguments (assignments using gs function). These defaults can be changed for the current R session using the settings.meta function.
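A minimal sketch of how a session default might be inspected and changed, assuming R package meta is installed. The setting name smbin is taken from the Usage section (gs("smbin")); the helper variable smbin_changed is only for illustration.

```r
library(meta)

# Current session default for the summary measure used by metabin()
gs("smbin")

# Change the default for the current R session
settings.meta(smbin = "OR")
smbin_changed <- gs("smbin")
smbin_changed

# Restore the package defaults
settings.meta("reset")
```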

Furthermore, R function update.meta can be used to rerun a meta-analysis with different settings.

Meta-analysis method

By default, both common effect (also called fixed effect) and random effects models are considered (see arguments common and random). If method is "MH" (default), the Mantel-Haenszel method (Greenland & Robins, 1985; Robins et al., 1986) is used to calculate the common effect estimate; if method is "Inverse", inverse variance weighting is used for pooling (Fleiss, 1993); if method is "Peto", the Peto method is used for pooling (Yusuf et al., 1985); if method is "SSW", the sample size method is used for pooling (Bakbergenuly et al., 2020).
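To make the difference between Mantel-Haenszel and inverse variance pooling concrete, the following base R sketch computes both common effect odds ratios by hand. The three studies are hypothetical, no zero cells occur, and the code illustrates the textbook formulas rather than the internal implementation of metabin.

```r
# Hypothetical data for three studies (no zero cells)
ev.e <- c(12, 8, 20);  n.e <- c(50, 40, 100)   # experimental group
ev.c <- c(18, 12, 30); n.c <- c(50, 40, 100)   # control group

no.e <- n.e - ev.e      # non-events, experimental
no.c <- n.c - ev.c      # non-events, control
N    <- n.e + n.c       # total sample size per study

# Mantel-Haenszel pooled odds ratio
or_mh <- sum(ev.e * no.c / N) / sum(no.e * ev.c / N)

# Inverse variance pooled odds ratio (pooling on the log scale)
logor <- log((ev.e * no.c) / (no.e * ev.c))
v     <- 1 / ev.e + 1 / no.e + 1 / ev.c + 1 / no.c   # variance of log odds ratio
w     <- 1 / v                                       # inverse variance weights
or_iv <- exp(sum(w * logor) / sum(w))

round(c(MH = or_mh, Inverse = or_iv), 4)
```

With balanced studies and moderate event counts such as these, the two estimates are expected to be close; they can diverge for sparse data.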

While the Mantel-Haenszel and Peto methods are defined under the common effect model, random effects variants based on these methods are also implemented in metabin. Following RevMan 5, the Mantel-Haenszel estimator is used in the calculation of the between-study heterogeneity statistic Q which is used in the DerSimonian-Laird estimator (DerSimonian and Laird, 1986). Accordingly, the results for the random effects meta-analysis using the Mantel-Haenszel or inverse variance method are typically very similar. For the Peto method, Peto's log odds ratio, i.e. (O-E) / V and its standard error sqrt(1 / V) with O-E and V denoting "Observed minus Expected" and its variance, are utilised in the random effects model. Accordingly, results of a random effects model using method = "Peto" can differ from results of a random effects model using method = "MH" or method = "Inverse".
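The quantities O-E and V can be sketched in base R as follows. The data are hypothetical, and V is assumed to be the usual hypergeometric variance used for the Peto method (Yusuf et al., 1985); this is an illustration, not the code used inside metabin.

```r
# Hypothetical data for three studies
ev.e <- c(12, 8, 20);  n.e <- c(50, 40, 100)   # experimental group
ev.c <- c(18, 12, 30); n.c <- c(50, 40, 100)   # control group

N  <- n.e + n.c          # total sample size per study
ev <- ev.e + ev.c        # total number of events per study

O <- ev.e                                         # observed events, experimental
E <- n.e * ev / N                                 # expected events under no effect
V <- n.e * n.c * ev * (N - ev) / (N^2 * (N - 1))  # hypergeometric variance

# Study-level Peto log odds ratios and standard errors
logor_peto <- (O - E) / V
se_peto    <- sqrt(1 / V)

# Common effect Peto estimate and its standard error
logor_pooled <- sum(O - E) / sum(V)
se_pooled    <- sqrt(1 / sum(V))

round(exp(logor_pooled), 4)
```

The pooled value equals an inverse variance average of the study-level Peto log odds ratios with weights V, which is why these study-level quantities can be passed on to a random effects model.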

A distinctive and frequently overlooked advantage of binary endpoints is that individual patient data (IPD) can be extracted from a two-by-two table. Accordingly, statistical methods for IPD, i.e., logistic regression and generalised linear mixed models, can be utilised in a meta-analysis of binary outcomes (Stijnen et al., 2010; Simmonds et al., 2016). These methods are available (argument method = "GLMM") for the odds ratio as summary measure by calling the rma.glmm function from R package metafor internally.
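The reconstruction of individual patient data from a two-by-two table can be sketched in base R. The example expands one table (counts as in the first call of the Examples section) into one row per participant and fits a logistic regression with glm; the data frame and variable names are illustrative only.

```r
# Two-by-two table for a single study
ev.e <- 10; n.e <- 20   # experimental group
ev.c <- 15; n.c <- 20   # control group

# One row per participant: treatment indicator and binary outcome
ipd <- data.frame(
  treat = rep(c(1, 0), times = c(n.e, n.c)),
  event = c(rep(c(1, 0), times = c(ev.e, n.e - ev.e)),
            rep(c(1, 0), times = c(ev.c, n.c - ev.c)))
)

# Logistic regression on the reconstructed individual patient data
fit <- glm(event ~ treat, family = binomial, data = ipd)

or_glm   <- exp(coef(fit)[["treat"]])                        # odds ratio from glm
or_table <- (ev.e * (n.c - ev.c)) / (ev.c * (n.e - ev.e))    # odds ratio from table

c(glm = or_glm, table = or_table)
```

For a single study, the treatment coefficient reproduces the log odds ratio of the table; the GLMMs described below extend this idea to several studies.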

Four different GLMMs are available for meta-analysis with binary outcomes using argument model.glmm (which corresponds to argument model in the rma.glmm function):

1.	Logistic regression model with fixed study effects (default)
	(`model.glmm = "UM.FS"`, i.e., Unconditional Model - Fixed Study effects)
2.	Mixed-effects logistic regression model with random study effects
	(`model.glmm = "UM.RS"`, i.e., Unconditional Model - Random Study effects)
3.	Generalised linear mixed model (conditional Hypergeometric-Normal)
	(`model.glmm = "CM.EL"`, i.e., Conditional Model - Exact Likelihood)
4.	Generalised linear mixed model (conditional Binomial-Normal)
	(`model.glmm = "CM.AL"`, i.e., Conditional Model - Approximate Likelihood)

Details on these four GLMMs as well as additional arguments which can be provided using argument '...' in metabin are described in rma.glmm where you can also find information on the iterative algorithms used for estimation. Note, regardless of which value is used for argument model.glmm, results for two different GLMMs are calculated: common effect model (with fixed treatment effect) and random effects model (with random treatment effects).

Continuity correction

Three approaches are available to apply a continuity correction:

Only studies with a zero cell count (method.incr = "only0")
All studies if at least one study has a zero cell count (method.incr = "if0all")
All studies irrespective of zero cell counts (method.incr = "all")

By default, a continuity correction is only applied to studies with a zero cell count (method.incr = "only0"). This method showed the best performance for the odds ratio in a simulation study under the random effects model (Weber et al., 2020).
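The three rules can be written down as a small base R helper. The function incr_by_rule and the data are hypothetical and only illustrate which studies receive an increment; the handling of studies with zero or all events in both groups (argument allstudies) is ignored here.

```r
# Hypothetical data: the first study has a zero cell
ev.e <- c(0, 4, 7); n.e <- c(15, 20, 30)   # experimental group
ev.c <- c(2, 6, 9); n.c <- c(15, 20, 30)   # control group

# A study has a zero cell if any of the four cell counts is zero
has_zero <- ev.e == 0 | ev.c == 0 | ev.e == n.e | ev.c == n.c

# Increment per study under the three rules (illustrative helper)
incr_by_rule <- function(method.incr, has_zero, incr = 0.5) {
  switch(method.incr,
    only0  = ifelse(has_zero, incr, 0),
    if0all = rep(if (any(has_zero)) incr else 0, length(has_zero)),
    all    = rep(incr, length(has_zero)),
    stop("unknown value for method.incr")
  )
}

sapply(c("only0", "if0all", "all"), incr_by_rule, has_zero = has_zero)
```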

The continuity correction method is used both to calculate individual study results with confidence limits and to conduct meta-analysis based on the inverse variance method. For the risk difference, the method is only considered to calculate standard errors and confidence limits. For Peto method and GLMMs no continuity correction is used in the meta-analysis. Furthermore, the continuity correction is ignored for individual studies for the Peto method.

For studies with a zero cell count, by default, 0.5 (argument incr) is added to all cell frequencies for the odds ratio or only the number of events for the risk ratio (argument RR.Cochrane = FALSE, default). The increment is added to all cell frequencies for the risk ratio if argument RR.Cochrane = TRUE. For the risk difference, incr is only added to all cell frequencies to calculate the standard error. Finally, a treatment arm continuity correction is used if incr = "TACC" (Sweeting et al., 2004; Diamond et al., 2007).
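A hand calculation for one hypothetical study with a zero cell may help to see the effect of these rules. The sketch follows the description of arguments incr and RR.Cochrane given above (adding incr to the number of events raises the group size by incr; with RR.Cochrane = TRUE the group size is raised by 2 * incr); it is an interpretation of that description, not the package code.

```r
# Hypothetical study with no events in the experimental group
ev.e <- 0; n.e <- 10
ev.c <- 3; n.c <- 12
incr <- 0.5

# Odds ratio: incr added to all four cell frequencies
or_corr <- ((ev.e + incr) * (n.c - ev.c + incr)) /
           ((ev.c + incr) * (n.e - ev.e + incr))

# Risk ratio, RR.Cochrane = FALSE: incr added to events only,
# so each group size grows by incr
rr_default <- ((ev.e + incr) / (n.e + incr)) /
              ((ev.c + incr) / (n.c + incr))

# Risk ratio, RR.Cochrane = TRUE: incr added to all cells,
# so each group size grows by 2 * incr
rr_cochrane <- ((ev.e + incr) / (n.e + 2 * incr)) /
               ((ev.c + incr) / (n.c + 2 * incr))

round(c(OR = or_corr, RR = rr_default, RR.Cochrane = rr_cochrane), 4)
```

The two risk ratio variants coincide when both groups have the same size, which is why unequal group sizes are used in this example.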

For odds ratio and risk ratio, treatment estimates and standard errors are only calculated for studies with zero or all events in both groups if allstudies = TRUE.

For the Mantel-Haenszel method, by default (if MH.exact is FALSE), incr is added to cell frequencies of a study with a zero cell count in the calculation of the pooled risk ratio or odds ratio as well as the estimation of the variance of the pooled risk difference, risk ratio or odds ratio. This approach is also used in other software, e.g. RevMan 5 and the Stata procedure metan. According to Fleiss (in Cooper & Hedges, 1994), there is no need to add 0.5 to a cell frequency of zero to calculate the Mantel-Haenszel estimate and he advocates the exact method (MH.exact = TRUE). Note, estimates based on exact Mantel-Haenszel method or GLMM are not defined if the number of events is zero in all studies either in the experimental or control group.
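The point made by Fleiss can be illustrated in base R: the Mantel-Haenszel odds ratio is a ratio of sums, so a single zero cell does not make it undefined. The helper mh_or and the data are hypothetical, and the corrected variant (adding incr to all four cells of the zero-cell study) is a simplified reading of the default behaviour described above.

```r
# Mantel-Haenszel odds ratio from cell counts (illustrative helper)
mh_or <- function(ev.e, no.e, ev.c, no.c) {
  N <- ev.e + no.e + ev.c + no.c
  sum(ev.e * no.c / N) / sum(no.e * ev.c / N)
}

# Hypothetical data: the first study has no events in the experimental group
ev.e <- c(0, 5, 8);  n.e <- c(20, 40, 50)
ev.c <- c(3, 9, 12); n.c <- c(20, 40, 50)

# Exact method: no increment added (corresponds to MH.exact = TRUE)
or_exact <- mh_or(ev.e, n.e - ev.e, ev.c, n.c - ev.c)

# Corrected method: 0.5 added to all cells of studies with a zero cell
zero <- ev.e == 0 | ev.c == 0 | ev.e == n.e | ev.c == n.c
incr <- ifelse(zero, 0.5, 0)
or_corr <- mh_or(ev.e + incr, n.e - ev.e + incr,
                 ev.c + incr, n.c - ev.c + incr)

round(c(exact = or_exact, corrected = or_corr), 4)
```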

Subgroup analysis

Argument subgroup can be used to conduct subgroup analysis for a categorical covariate. The metareg function can be used instead for more than one categorical covariate or continuous covariates.
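A short usage sketch for a subgroup analysis, assuming R package meta and its dataset Olkin1995 are available. The grouping by publication period and the label "Period" are made up for illustration.

```r
library(meta)
data(Olkin1995)

# Subgroup analysis by (hypothetical) publication period
m_sub <- metabin(ev.exp, n.exp, ev.cont, n.cont,
  data = Olkin1995, subset = year < 1975,
  studlab = paste(author, year),
  subgroup = ifelse(year < 1970, "before 1970", "1970 or later"),
  subgroup.name = "Period",
  method = "Inverse")
m_sub
```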

Exclusion of studies from meta-analysis

Arguments subset and exclude can be used to exclude studies from the meta-analysis. Studies are removed completely from the meta-analysis using argument subset, while excluded studies are shown in printouts and forest plots using argument exclude (see Examples in metagen). Meta-analysis results are the same for both arguments.
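The following sketch contrasts the two arguments, assuming R package meta and its dataset Olkin1995 are available; the object names are illustrative. Both calls pool the studies published before 1970, and the comparison at the end checks the statement that the pooled results agree.

```r
library(meta)
data(Olkin1995)

# Studies from 1970 onwards are dropped completely
m_subset <- metabin(ev.exp, n.exp, ev.cont, n.cont,
  data = Olkin1995, subset = year < 1970,
  studlab = paste(author, year), method = "Inverse")

# Studies from 1970 onwards are kept in printouts but not pooled
m_exclude <- metabin(ev.exp, n.exp, ev.cont, n.cont,
  data = Olkin1995, exclude = year >= 1970,
  studlab = paste(author, year), method = "Inverse")

# Pooled common effect estimates (log scale) should agree
all.equal(m_subset$TE.common, m_exclude$TE.common)
```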

Presentation of meta-analysis results

Internally, both common effect and random effects models are calculated regardless of values chosen for arguments common and random. Accordingly, the estimate for the random effects model can be extracted from component TE.random of an object of class "meta" even if argument random = FALSE. However, all functions in R package meta will adequately consider the values for common and random. E.g. function print.meta will not print results for the random effects model if random = FALSE.
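A brief sketch of this behaviour, assuming R package meta and its dataset Olkin1995 are available; the object name m_common is illustrative.

```r
library(meta)
data(Olkin1995)

# Request only the common effect model in printouts
m_common <- metabin(ev.exp, n.exp, ev.cont, n.cont,
  data = Olkin1995, subset = year < 1970,
  studlab = paste(author, year),
  method = "Inverse", random = FALSE)

# The printout omits the random effects model ...
m_common

# ... but the random effects estimate is still stored (log scale)
exp(m_common$TE.random)
```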

A prediction interval will only be shown if prediction = TRUE.

References

Bakbergenuly I, Hoaglin DC, Kulinskaya E (2020): Methods for estimating between-study variance and overall effect in meta-analysis of odds-ratios. Research Synthesis Methods, 11, 426–42

Cooper H & Hedges LV (1994): The Handbook of Research Synthesis. Newbury Park, CA: Russell Sage Foundation

Diamond GA, Bax L, Kaul S (2007): Uncertain Effects of Rosiglitazone on the Risk for Myocardial Infarction and Cardiovascular Death. Annals of Internal Medicine, 147, 578–81

DerSimonian R & Laird N (1986): Meta-analysis in clinical trials. Controlled Clinical Trials, 7, 177–88

Fleiss JL (1993): The statistical basis of meta-analysis. Statistical Methods in Medical Research, 2, 121–45

Greenland S & Robins JM (1985): Estimation of a common effect parameter from sparse follow-up data. Biometrics, 41, 55–68

Review Manager (RevMan) [Computer program]. Version 5.4. The Cochrane Collaboration, 2020

Robins J, Breslow N, Greenland S (1986): Estimators of the Mantel-Haenszel Variance Consistent in Both Sparse Data and Large-Strata Limiting Models. Biometrics, 42, 311–23

Rücker G, Schwarzer G, Carpenter J, Olkin I (2009): Why add anything to nothing? The arcsine difference as a measure of treatment effect in meta-analysis with zero cells. Statistics in Medicine, 28, 721–38

Simmonds MC, Higgins JP (2016): A general framework for the use of logistic regression models in meta-analysis. Statistical Methods in Medical Research, 25, 2858–77

StataCorp. 2011. Stata Statistical Software: Release 12. College Station, TX: StataCorp LP.

Stijnen T, Hamza TH, Ozdemir P (2010): Random effects meta-analysis of event outcome in the framework of the generalized linear mixed model with applications in sparse data. Statistics in Medicine, 29, 3046–67

Sweeting MJ, Sutton AJ, Lambert PC (2004): What to add to nothing? Use and avoidance of continuity corrections in meta-analysis of sparse data. Statistics in Medicine, 23, 1351–75

Van den Noortgate W, López-López JA, Marín-Martínez F, Sánchez-Meca J (2013): Three-level meta-analysis of dependent effect sizes. Behavior Research Methods, 45, 576–94

Viechtbauer W (2010): Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36, 1–48

Weber F, Knapp G, Ickstadt K, Kundt G, Glass Ä (2020): Zero-cell corrections in random-effects meta-analyses. Research Synthesis Methods, 11, 913–9

Yusuf S, Peto R, Lewis J, Collins R, Sleight P (1985): Beta blockade during and after myocardial infarction: An overview of the randomized trials. Progress in Cardiovascular Diseases, 27, 335–71

Examples

# Calculate odds ratio and confidence interval for a single study
#
metabin(10, 20, 15, 20, sm = "OR")

# Different results (due to handling of studies with double zeros)
#
metabin(0, 10, 0, 10, sm = "OR")
metabin(0, 10, 0, 10, sm = "OR", allstudies = TRUE)

# Use subset of Olkin (1995) to conduct meta-analysis based on
# inverse variance method (with risk ratio as summary measure)
#
data(Olkin1995)
m1 <- metabin(ev.exp, n.exp, ev.cont, n.cont,
  data = Olkin1995, subset = c(41, 47, 51, 59),
  studlab = paste(author, year),
  method = "Inverse")
m1
# Show results for individual studies
summary(m1)

# Use different subset of Olkin (1995)
#
m2 <- metabin(ev.exp, n.exp, ev.cont, n.cont,
  data = Olkin1995, subset = year < 1970,
  studlab = paste(author, year),
  method = "Inverse")
m2
forest(m2)

# Meta-analysis with odds ratio as summary measure
#
m3 <- metabin(ev.exp, n.exp, ev.cont, n.cont,
  data = Olkin1995, subset = year < 1970,
  studlab = paste(author, year),
  sm = "OR", method = "Inverse")
# Same meta-analysis result using 'update.meta' function
m3 <- update(m2, sm = "OR")
m3

# Meta-analysis based on Mantel-Haenszel method (with odds ratio as
# summary measure)
#
m4 <- update(m3, method = "MH")
m4

# Meta-analysis based on Peto method (only available for odds ratio
# as summary measure)
#
m5 <- update(m3, method = "Peto")
m5

if (FALSE) {
# Meta-analysis using generalised linear mixed models
# (only if R package 'lme4' is available)
#

# Logistic regression model with (k = 4) fixed study effects
# (default: model.glmm = "UM.FS")
#
m6 <- metabin(ev.exp, n.exp, ev.cont, n.cont,
  studlab = paste(author, year),
  data = Olkin1995, subset = year < 1970, method = "GLMM")
# Same results:
m6 <- update(m2, method = "GLMM")
m6

# Mixed-effects logistic regression model with random study effects
# (warning message printed due to argument 'nAGQ')
#
m7 <- update(m6, model.glmm = "UM.RS")
#
# Use additional argument 'nAGQ' for internal call of 'rma.glmm'
# function
#
m7 <- update(m6, model.glmm = "UM.RS", nAGQ = 1)
m7

# Generalised linear mixed model (conditional Hypergeometric-Normal)
# (R package 'BiasedUrn' must be available)
#
m8 <- update(m6, model.glmm = "CM.EL")
m8

# Generalised linear mixed model (conditional Binomial-Normal)
#
m9 <- update(m6, model.glmm = "CM.AL")
m9

# Logistic regression model with (k = 70) fixed study effects
# (about 18 seconds with Intel Core i7-3667U, 2.0GHz)
#
m10 <- metabin(ev.exp, n.exp, ev.cont, n.cont,
   studlab = paste(author, year),
   data = Olkin1995, method = "GLMM")
m10

# Mixed-effects logistic regression model with random study effects
# - about 50 seconds with Intel Core i7-3667U, 2.0GHz
# - several warning messages, e.g. "failure to converge, ..."
#
update(m10, model.glmm = "UM.RS")

# Conditional Hypergeometric-Normal GLMM
# - long computation time (about 12 minutes with Intel Core
#   i7-3667U, 2.0GHz)
# - estimation problems for this very large dataset:
#   * warning that Choleski factorization of Hessian failed
#   * confidence interval for treatment effect smaller in random
#     effects model compared to common effect model
#
system.time(m11 <- update(m10, model.glmm = "CM.EL"))
m11

# Generalised linear mixed model (conditional Binomial-Normal)
# (less than 1 second with Intel Core i7-3667U, 2.0GHz)
#
update(m10, model.glmm = "CM.AL")
}
