escalc(measure, formula, ...)
## S3 method for class 'default':
escalc(measure, formula, ai, bi, ci, di, n1i, n2i, x1i, x2i, t1i, t2i,
m1i, m2i, sd1i, sd2i, xi, mi, ri, ti, sdi, ni, data, slab, subset,
add=1/2, to="only0", drop00=FALSE, vtype="LS",
var.names=c("yi","vi"), append=TRUE, replace=TRUE, digits=4, ...)
## S3 method for class 'formula':
escalc(measure, formula, weights, data,
add=1/2, to="only0", drop00=FALSE, vtype="LS",
var.names=c("yi","vi"), digits=4, ...)
add: a non-negative number indicating the amount to add to zero cells, counts, or frequencies (default is 1/2). See below for details.

to: a character string indicating when the values under add should be added (either "all", "only0", "if0all", or "none"). See below for details.

drop00: logical indicating whether studies with no cases (or only cases) in both groups should be dropped when calculating the observed outcomes (default is FALSE). See below for details.

vtype: a character string indicating the type of sampling variances to calculate (either "LS", "UB", "ST", or "CS"). See below for details.

var.names: a character vector with two elements, specifying the names of the variables for the observed outcomes and the sampling variances (default is "yi" and "vi").

append: logical indicating whether the data frame specified via the data argument (if one has been specified) should be returned together with the observed outcomes and corresponding sampling variances (default is TRUE).

replace: logical indicating whether existing values for yi and vi in the data frame should be replaced or not. Only relevant when append=TRUE and the data frame already contains the yi and vi variables (default is TRUE).

The function returns an object of class c("escalc","data.frame"). The object is a data frame containing the observed effect sizes or outcomes (yi) and the corresponding sampling variances (vi). If append=TRUE and a data frame was specified via the data argument, then yi and vi are appended to this data frame. Note that the var.names argument actually specifies the names of these two variables.
If the data frame already contains two variables with names as specified by the var.names
argument, the values for these two variables will be overwritten when replace=TRUE
(which is the default). By setting replace=FALSE
, only values that are NA
will be replaced.
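For illustration, a minimal sketch (not part of the original examples) of storing the outcomes under custom variable names via var.names; the names lnrr and lnrr.var are arbitrary choices, and the dat.bcg dataset that comes with the package is used:

library(metafor)
data(dat.bcg)
### calculate log relative risks, storing them under custom variable names
dat <- escalc(measure="RR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg,
              var.names=c("lnrr","lnrr.var"))
head(dat)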
The object is formatted and printed with the print.escalc function. The summary.escalc function can be used to obtain confidence intervals for the individual outcomes.

The observed effect sizes or outcomes for a meta-analysis can be computed with the escalc function.
The measure
argument is a character string specifying which outcome measure should be calculated (see below for the various options), arguments ai
through ni
are then used to specify the information needed to calculate the various measures (depending on the chosen outcome measure, different arguments need to be specified), and data
can be used to specify a data frame containing the variables given to the previous arguments. The add
, to
, and drop00
arguments may be needed when dealing with frequency or count data that may need special handling when some of the frequencies or counts are equal to zero (see below for details). Finally, the vtype
argument is used to specify how to estimate the sampling variances (again, see below for details).
To provide a structure to the various effect size or outcome measures that can be calculated with the escalc
function, we can distinguish between measures that are used to:
           outcome 1   outcome 2   total
group 1    ai          bi          n1i
group 2    ci          di          n2i

where ai
, bi
, ci
, and di
denote the cell frequencies (i.e., the number of people falling into a particular category) and n1i
and n2i
the row totals (i.e., the group sizes).
For example, in a set of randomized clinical trials, group 1 and group 2 may refer to the treatment and placebo/control group, respectively, with outcome 1 denoting some event of interest (e.g., death, complications, failure to improve under the treatment) and outcome 2 its complement. Similarly, in a set of cohort studies, group 1 and group 2 may denote those who engage in and those who do not engage in a potentially harmful behavior (e.g., smoking), with outcome 1 denoting the development of a particular disease (e.g., lung cancer) during the follow-up period. Finally, in a set of case-control studies, group 1 and group 2 may refer to those with the disease (i.e., cases) and those free of the disease (i.e., controls), with outcome 1 denoting, for example, exposure to some environmental risk factor and outcome 2 non-exposure. Note that in all of these examples, the stratified sampling scheme fixes the row totals (i.e., the group sizes) by design.
A meta-analysis of studies reporting results in terms of $2 \times 2$ tables can be based on one of several different outcome measures, including the relative risk (risk ratio), the odds ratio, the risk difference, and the arcsine transformed risk difference (e.g., Fleiss & Berlin, 2009, Ruecker et al., 2009). For any of these outcome measures, one needs to specify the cell frequencies via the ai
, bi
, ci
, and di
arguments (or alternatively, one can use the ai
, ci
, n1i
, and n2i
arguments).
The options for the measure
argument are then:
"RR"
for thelog relative risk."OR"
for thelog odds ratio."RD"
for therisk difference."AS"
for thearcsine transformed risk difference(Ruecker et al., 2009)."PETO"
for thelog odds ratioestimated with Peto's method (Yusuf et al., 1985).to="only0"
(the default), the value of add
(the default is 1/2) is added to each cell of those $2 \times 2$ tables with at least one cell equal to 0. When to="all"
, the value of add
is added to each cell of all $2 \times 2$ tables. When to="if0all"
, the value of add
is added to each cell of all $2 \times 2$ tables, but only when there is at least one $2 \times 2$ table with a zero cell. Setting to="none"
or add=0
has the same effect: No adjustment to the observed table frequencies is made. Depending on the outcome measure and the data, this may lead to division by zero inside of the function (when this occurs, the resulting value is recoded to NA
). Also, studies where ai=ci=0
or bi=di=0
may be considered to be uninformative about the size of the effect and dropping such studies has sometimes been recommended (Higgins & Green, 2008). This can be done by setting drop00=TRUE
. The counts for such studies will then be set to NA
.
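As a hedged illustration of these arguments (not part of the documented examples; dat.bcg may not contain any zero cells, in which case the settings below simply demonstrate the syntax without changing the computed values):

library(metafor)
data(dat.bcg)
### log odds ratios without any continuity correction; studies with no events
### (or only events) in both groups would be dropped via drop00=TRUE
dat <- escalc(measure="OR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg,
              add=0, to="none", drop00=TRUE)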
A dataset corresponding to data of this type is provided in dat.bcg
.
Assuming that the dichotomous outcome is actually a dichotomized version of the responses on an underlying quantitative scale, it is also possible to estimate the standardized mean difference based on $2 \times 2$ table data, using either the probit transformed risk difference or a transformation of the odds ratio (e.g., Chinn, 2000; Hasselblad & Hedges, 1995; Sanchez-Meca et al., 2003). The options for the measure
argument are then:
"PBIT"
for theprobit transformed risk differenceas an estimate of the standardized mean difference."OR2D"
fortransformed odds ratioas an estimate of the standardized mean difference.x1i
t1i
group 2 x2i
t2i
} where x1i
and x2i
denote the total number of events in the first and the second group, respectively, and t1i
and t2i
the corresponding total person-times at risk. Often, the person-time is measured in years, so that t1i
and t2i
denote the total number of follow-up years in the two groups.
Note that this form of data is fundamentally different from that described in the previous section, since the total follow-up time may differ even for groups of the same size and the individuals studied may experience the event of interest multiple times. Hence, different outcome measures than the ones described in the previous section must be considered when data are reported in this format. These include the incidence rate ratio, the incidence rate difference, and the square-root transformed incidence rate difference (Bagos & Nikolopoulos, 2009; Rothman et al., 2008). For any of these outcome measures, one needs to specify the total number of events via the x1i
and x2i
arguments and the corresponding total person-times via the t1i
and t2i
arguments.
The options for the measure
argument are then:
"IRR"
for thelog incidence rate ratio."IRD"
for theincidence rate difference."IRSD"
for thesquare-root transformed incidence rate difference.to="only0"
(the default), the value of add
(the default is 1/2) is added to x1i
and x2i
only in the studies that have zero events in one or both groups. When to="all"
, the value of add
is added to x1i
and x2i
in all studies. When to="if0all"
, the value of add
is added to x1i
and x2i
in all studies, but only when there is at least one study with zero events in one or both groups. Setting to="none"
or add=0
has the same effect: No adjustment to the observed number of events is made. Depending on the outcome measure and the data, this may lead to division by zero inside of the function (when this occurs, the resulting value is recoded to NA
). Like for $2 \times 2$ table data, studies where x1i=x2i=0
may be considered to be uninformative about the size of the effect and dropping such studies has sometimes been recommended. This can be done by setting drop00=TRUE
. The counts for such studies will then be set to NA
.
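For example, a minimal sketch with hypothetical event counts and person-times (the values below are made up purely for illustration):

library(metafor)
### hypothetical two-group person-time data for three studies
dat <- data.frame(x1i = c(12, 5, 0), t1i = c(140, 60, 50),
                  x2i = c(18, 9, 3), t2i = c(150, 55, 45))
### log incidence rate ratios and corresponding sampling variances
dat <- escalc(measure="IRR", x1i=x1i, x2i=x2i, t1i=t1i, t2i=t2i, data=dat)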
A dataset corresponding to data of this type is provided in dat.hart1999
.
           mean    sd      group size
group 1    m1i     sd1i    n1i
group 2    m2i     sd2i    n2i

where m1i
and m2i
are the observed means of the two groups, sd1i
and sd2i
the observed standard deviations, and n1i
and n2i
the number of individuals in each group. Again, the two groups may be experimentally created (e.g., a treatment and control group based on random assignment) or naturally occurring (e.g., men and women). In either case, the raw mean difference, the standardized mean difference, and the ratio of means (also called response ratio) are useful outcome measures when meta-analyzing studies of this type (e.g., Borenstein, 2009). In addition, the (log) odds ratio can be estimated based on data of this type, using a simple transformation of the standardized mean difference (e.g., Chinn, 2000; Hasselblad & Hedges, 1995).
The options for the measure
argument are then:
"MD"
for theraw mean difference."SMD"
for thestandardized mean difference."SMDH"
for thestandardized mean differencewithout assuming equal population variances in the two groups (Bonett, 2008, 2009)."ROM"
for thelog transformed ratio of means(Hedges et al., 1999)."D2OR"
for thetransformed standardized mean differenceas an estimate of the log odds ratio.m1i
and m2i
have opposite signs, this outcome measure cannot be computed).
The negative bias in the standardized mean difference is automatically corrected for within the function, yielding Hedges' g for measure="SMD"
(Hedges, 1981). Similarly, the same bias correction is applied for measure="SMDH"
(Bonett, 2009). Finally, for measure="SMD"
, one can choose between vtype="LS"
(the default) and vtype="UB"
. The former uses a large sample approximation to compute the sampling variances. The latter provides unbiased estimates of the sampling variances.
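As an illustration (hypothetical summary statistics, not from any of the datasets included with the package):

library(metafor)
### hypothetical means, standard deviations, and group sizes for two studies
dat <- data.frame(m1i = c(5.2, 4.8), sd1i = c(1.1, 1.3), n1i = c(30, 45),
                  m2i = c(4.5, 4.6), sd2i = c(1.0, 1.2), n2i = c(28, 50))
### standardized mean differences (Hedges' g) with unbiased variance estimates
dat <- escalc(measure="SMD", m1i=m1i, sd1i=sd1i, n1i=n1i,
              m2i=m2i, sd2i=sd2i, n2i=n2i, data=dat, vtype="UB")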
A dataset corresponding to data of this type is provided in dat.normand1999
(for mean differences and standardized mean differences). A dataset showing the use of the ratio of means measure is provided in dat.curtis1998
.
One needs to specify ri
, the vector with the raw correlation coefficients, and ni
, the corresponding sample sizes. The options for the measure
argument are then:
"COR"
for theraw correlation coefficient."UCOR"
for theraw correlation coefficientcorrected for its slight negative bias (based on equation 2.7 in Olkin & Pratt, 1958)."ZCOR"
for theFisher's r-to-z transformed correlation coefficient(Fisher, 1921).measure="COR"
and measure="UCOR"
, one can choose between vtype="LS"
(the default) and vtype="UB"
. The former uses a large sample approximation to compute the sampling variances. The latter provides approximately unbiased estimates of the sampling variances (see Hedges, 1989).
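A minimal sketch with hypothetical correlations and sample sizes:

library(metafor)
### hypothetical correlations and sample sizes for three studies
dat <- data.frame(ri = c(0.35, 0.52, 0.18), ni = c(50, 85, 120))
### Fisher's r-to-z transformed correlations with sampling variances 1/(ni-3)
dat <- escalc(measure="ZCOR", ri=ri, ni=ni, data=dat)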
A dataset corresponding to data of this type is provided in dat.mcdaniel1994
.
One again needs to specify the cell frequencies via the ai
, bi
, ci
, and di
arguments. The options for the measure
argument are then:
"OR"
for thelog odds ratio."PHI"
for thephi coefficient."YUQ"
forYule's Q(Yule, 1912)."YUY"
forYule's Y(Yule, 1912)."RTET"
for thetetrachoric correlation.m1i
and m2i
for the observed means of the two groups, sd1i
and sd2i
for the observed standard deviations, and n1i
and n2i
for the number of individuals in each group. The options for the measure
argument are then:
"RPB"
for thepoint-biserial correlation."RBIS"
for thebiserial correlation.measure="RPB"
, one must indicate via vtype="ST"
or vtype="CS"
whether the data for the studies were obtained using stratified or cross-sectional (i.e., multinomial) sampling, respectively (it is also possible to specify an entire vector for the vtype
argument in case the sampling schemes differed for the various studies).
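For example, a sketch with hypothetical data where the first study used stratified and the second cross-sectional sampling:

library(metafor)
### hypothetical means, standard deviations, and group sizes for two studies
dat <- data.frame(m1i = c(10.2, 9.8), sd1i = c(2.1, 1.9), n1i = c(40, 25),
                  m2i = c(8.9, 9.1), sd2i = c(2.0, 2.2), n2i = c(42, 30))
### point-biserial correlations with a study-specific sampling scheme
dat <- escalc(measure="RPB", m1i=m1i, sd1i=sd1i, n1i=n1i,
              m2i=m2i, sd2i=sd2i, n2i=n2i, data=dat, vtype=c("ST","CS"))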
One needs to specify xi
and ni
, denoting the number of individuals experiencing the event of interest and the total number of individuals, respectively. Instead of specifying ni
, one can use mi
to specify the number of individuals that do not experience the event of interest. The options for the measure
argument are then:
"PR"
for theraw proportion."PLN"
for thelog transformed proportion."PLO"
for thelogit transformed proportion(i.e., log odds)."PAS"
for thearcsine transformed proportion."PFT"
for theFreeman-Tukey double arcsine transformed proportion(Freeman & Tukey, 1950).to="only0"
(the default), the value of add
(the default is 1/2) is added to xi
and mi
only for studies where xi
or mi
is equal to 0. When to="all"
, the value of add
is added to xi
and mi
in all studies. When to="if0all"
, the value of add
is added in all studies, but only when there is at least one study with a zero value for xi
or mi
. Setting to="none"
or add=0
has the same effect: No adjustment to the observed values is made. Depending on the outcome measure and the data, this may lead to division by zero inside of the function (when this occurs, the resulting value is recoded to NA
).
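For illustration, a sketch with hypothetical event counts and sample sizes:

library(metafor)
### hypothetical event counts and sample sizes for three studies
dat <- data.frame(xi = c(0, 4, 12), ni = c(20, 30, 40))
### Freeman-Tukey double arcsine transformed proportions
dat <- escalc(measure="PFT", xi=xi, ni=ni, data=dat)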
A dataset corresponding to data of this type is provided in dat.pritz1997
.
One needs to specify xi
and ti
, denoting the total number of events that occurred and the total person-time at risk, respectively. The options for the measure
argument are then:
"IR"
for theraw incidence rate."IRLN"
for thelog transformed incidence rate."IRS"
for thesquare-root transformed incidence rate."IRFT"
for theFreeman-Tukey transformed incidence rate(Freeman & Tukey, 1950).to="only0"
(the default), the value of add
(the default is 1/2) is added to xi
only in the studies that have zero events. When to="all"
, the value of add
is added to xi
in all studies. When to="if0all"
, the value of add
is added to xi
in all studies, but only when there is at least one study with zero events. Setting to="none"
or add=0
has the same effect: No adjustment to the observed number of events is made. Depending on the outcome measure and the data, this may lead to division by zero inside of the function (when this occurs, the resulting value is recoded to NA
).
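A minimal sketch with hypothetical counts and person-times:

library(metafor)
### hypothetical event counts and total person-times at risk
dat <- data.frame(xi = c(10, 2, 0), ti = c(200, 50, 80))
### log transformed incidence rates
dat <- escalc(measure="IRLN", xi=xi, ti=ti, data=dat)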
One needs to specify mi
, sdi
, and ni
for the observed means, the observed standard deviations, and the sample sizes, respectively. The only option for the measure
argument is then:
"MN"
for theraw mean.sdi
is used to specify the standard deviations of the observed values of the response, characteristic, or dependent variable and not the standard errors of the means.
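For example (hypothetical values; the sampling variance of a raw mean is sdi^2/ni):

library(metafor)
### hypothetical means, standard deviations, and sample sizes
dat <- data.frame(mi = c(15.2, 14.8), sdi = c(3.1, 2.9), ni = c(60, 75))
### raw means and corresponding sampling variances
dat <- escalc(measure="MN", mi=mi, sdi=sdi, ni=ni, data=dat)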
A more complicated situation arises when the purpose of the meta-analysis is to assess the amount of change within individual groups. In that case, either the raw mean change or standardized versions thereof can be used as outcome measures (Becker, 1988; Gibbons et al., 1993; Morris, 2000). Here, one needs to specify m1i
and m2i
, the observed means at the two measurement occasions, sd1i
and sd2i
for the corresponding observed standard deviations, ri
for the correlation between the scores observed at the two measurement occasions, and ni
for the sample size. The options for the measure
argument are then:
"MC"
for theraw mean change."SMCC"
for thestandardized mean changeusing change score standardization."SMCR"
for thestandardized mean changeusing raw score standardization.m1i
and m2i
are unknown, but the raw mean change is directly reported in a particular study, then you can set m1i
to that value and m2i
to 0 (making sure that the raw mean change was computed as m1i-m2i
within that study and not the other way around). Also, for the raw mean change ("MC"
) or the standardized mean change using change score standardization ("SMCC"
), if sd1i
, sd2i
, and ri
are unknown, but the standard deviation of the change scores is directly reported, then you can set sd1i
to that value and both sd2i
and ri
to 0. Finally, for the standardized mean change using raw score standardization ("SMCR"
), argument sd2i
is actually not needed, as the standardization is only based on sd1i
(Becker, 1988; Morris, 2000), which is usually the pre-test standard deviation (if the post-test standard deviation should be used, then set sd1i
to that).
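To illustrate (hypothetical pre- and post-test summary statistics, made up for this sketch):

library(metafor)
### hypothetical pre/post means, pre-test SDs, pre-post correlations, and sample sizes
dat <- data.frame(m1i = c(22.5, 18.0), m2i = c(20.1, 16.3),
                  sd1i = c(4.2, 3.9), ri = c(0.60, 0.55), ni = c(35, 50))
### standardized mean change using raw score (pre-test SD) standardization
dat <- escalc(measure="SMCR", m1i=m1i, m2i=m2i, sd1i=sd1i, ri=ri, ni=ni, data=dat)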
One needs to specify ai
, mi
, and ni
for the observed alpha values, the number of items/replications/parts of the measurement instrument, and the sample sizes, respectively. One can either directly analyze the raw Cronbach's alpha values or transformations thereof (Bonett, 2002, 2010; Hakstian & Whalen, 1976). The options for the measure
argument are then:
"ARAW"
forraw alphavalues."AHW"
fortransformed alpha values(Hakstian & Whalen, 1976)."ABT"
fortransformed alpha values(Bonett, 2002)."AHW"
, the transformation $1-(1-\alpha)^{1/3}$ is used, while for "ABT"
, the transformation $-ln(1-\alpha)$ is used. This ensures that the transformed values are monotonically increasing functions of $\alpha$.
A dataset corresponding to data of this type is provided in dat.bonett2010
.
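For example, a sketch with hypothetical alpha values:

library(metafor)
### hypothetical Cronbach's alpha values, numbers of items, and sample sizes
dat <- data.frame(ai = c(0.85, 0.78, 0.91), mi = c(10, 12, 8), ni = c(120, 95, 200))
### Bonett-transformed alpha values and corresponding sampling variances
dat <- escalc(measure="ABT", ai=ai, mi=mi, ni=ni, data=dat)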
There are two interfaces for using the escalc
function, the default and a formula interface. When using the default interface, which is described above, the information needed to compute the various outcome measures is passed to the function via the various arguments outlined above (i.e., arguments ai
through ni
).
The formula interface works as follows. As above, the argument measure
is a character string specifying which outcome measure should be calculated. The formula
argument is then used to specify the data structure as a multipart formula. The data
argument can be used to specify a data frame containing the variables in the formula. The add
, to
, and vtype
arguments work as described above.
For $2 \times 2$ table data, the formula
argument takes the form outcome ~ group | study
, where group
is a two-level factor specifying the rows of the tables, outcome
is a two-level factor specifying the columns of the tables (the two possible outcomes), and study
is a factor specifying the study factor. The weights
argument is used to specify the frequencies in the various cells.
For two-group person-time data, the formula
argument takes the form events/times ~ group | study
, where group
is a two-level factor specifying the group factor and study
is a factor specifying the study factor. The left-hand side of the formula is composed of two parts, with the first variable for the number of events and the second variable for the person-time at risk.
For means and standard deviations from two groups, the formula
argument takes the form means/sds ~ group | study
, where group
is a two-level factor specifying the group factor and study
is a factor specifying the study factor. The left-hand side of the formula is composed of two parts, with the first variable for the means and the second variable for the standard deviations. The weights
argument is used to specify the sample sizes in the groups.
For correlation coefficients, the formula
argument takes the form outcome ~ 1 | study
, where outcome
is used to specify the observed correlations and study
is a factor specifying the study factor. The weights
argument is used to specify the sample sizes.
The formula
argument is specified in the same manner.
The formula
argument is specified in the same manner.
For proportions, the formula
argument takes the form outcome ~ 1 | study
, where outcome
is a two-level factor specifying the columns of the tables (the two possible outcomes) and study
is a factor specifying the study factor. The weights
argument is used to specify the frequencies in the various cells.
For incidence rates, the formula
argument takes the form events/times ~ 1 | study
, where study
is a factor specifying the study factor. The left-hand side of the formula is composed of two parts, with the first variable for the number of events and the second variable for the person-time at risk.
For single-group means, the formula
argument takes the form means/sds ~ 1 | study
, where study
is a factor specifying the study factor. The left-hand side of the formula is composed of two parts, with the first variable for the means and the second variable for the standard deviations. The weights
argument is used to specify the sample sizes.
Note: The formula interface is (currently) not implemented for the raw mean change and the standardized mean change measures.
For Cronbach's alpha values, the formula
argument takes the form alpha/items ~ 1 | study
, where study
is a factor specifying the study factor. The left-hand side of the formula is composed of two parts, with the first variable for the Cronbach's alpha values and the second variable for the number of items.
print.escalc, summary.escalc, rma.uni, rma.mh, rma.peto, rma.glmm
### load BCG vaccine data
data(dat.bcg)
### calculate log relative risks and corresponding sampling variances
dat <- escalc(measure="RR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg)
dat
### suppose that for a particular study, yi and vi are known (i.e., have
### already been calculated) but the 2x2 table counts are not known; with
### replace=FALSE, the yi and vi values for that study are not replaced
dat[1:12,10:11] <- NA
dat[13,4:7] <- NA
dat <- escalc(measure="RR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat, replace=FALSE)
dat
### using formula interface (first rearrange data into long format)
dat.long <- to.long(measure="RR", ai=tpos, bi=tneg, ci=cpos, di=cneg,
data=dat.bcg, append=FALSE, vlong=TRUE)
escalc(outcome ~ group | study, weights=freq, data=dat.long, measure="RR")