escalc: Calculate Effect Size and Outcome Measures

Description

The function can be used to calculate various effect size or outcome measures (and the corresponding sampling variances) that are commonly used in meta-analyses.

Usage

escalc(measure, formula, ...)

## S3 method for class 'default':
escalc(measure, formula, ai, bi, ci, di, n1i, n2i, 
       m1i, m2i, sd1i, sd2i, xi, mi, ri, ni, data, 
       add=1/2, to="only0", vtype="LS", append=FALSE, ...)

## S3 method for class 'formula':
escalc(measure, formula, weights, data, 
       add=1/2, to="only0", vtype="LS", ...)

Arguments

measure

a character string indicating which effect size or outcome measure should be calculated. See Details for possible options and how the data should then be specified.

formula

when using the formula interface of the function (see Details below), a model formula specifying the data structure should be specified via this argument. When not using the formula interface, this argument can be ignored and the data req

weights

vector of weights to specify the group sizes or cell frequencies (only needed when using the formula interface). See Details below.

ai

vector to specify 2x2 table frequencies (upper left cell).

bi

vector to specify 2x2 table frequencies (upper right cell).

ci

vector to specify 2x2 table frequencies (lower left cell).

di

vector to specify 2x2 table frequencies (lower right cell).

n1i

vector to specify group sizes or row total (first group/row).

n2i

vector to specify group sizes or row total (second group/row).

m1i

vector to specify means (first group).

m2i

vector to specify means (second group).

sd1i

vector to specify standard deviations (first group).

sd2i

vector to specify standard deviations (second group).

xi

vector to specify the frequencies of the event of interest.

mi

vector to specify the frequencies of the complement of the event of interest.

ri

vector to specify the raw correlation coefficients.

ni

vector to specify the sample sizes.

data

an optional data frame containing the variables given to the arguments above.

add

See Details.

to

See Details.

vtype

See Details.

append

logical indicating whether the data frame specified via the data argument (if one has been specified) should be returned together with the effect sizes and sampling variances (default is FALSE).

...

other arguments.

Value

A data frame with the following elements:
yi

value of the effect size or outcome measure.

vi

corresponding (estimated) sampling variance.

If append=TRUE and a data frame was specified via the data argument, then yi and vi are appended to this data frame.

Details

There are two interfaces to using the escalc function, the default and a formula interface. The two interfaces are described below.

Default Interface

The default interface works as follows. The argument measure is a character string specifying which outcome measure should be calculated (see below for the various options), arguments ai through ni are then used to supply the needed information to calculate the various measures (depending on the outcome measure, different arguments need to be supplied), and data can be used to specify a data frame containing the variables given to the previous arguments. The add and to arguments may be needed when dealing with 2x2 table data that contain cells with zeros. Finally, the vtype argument is used to specify how to calculate the sampling variance estimate (see below).

Effect Size and Outcome Measures for 2x2 Table Data

Meta-analyses in the health/medical sciences are often based on studies providing data in terms of 2x2 tables. In particular, assume that we have k tables of the form:

          outcome 1   outcome 2   total
group 1   ai          bi          n1i
group 2   ci          di          n2i

where ai, bi, ci, and di denote the cell frequencies and n1i and n2i the row totals. For example, in a set of randomized clinical trials, group 1 and group 2 may refer to the treatment and placebo/control group, with outcome 1 denoting some event of interest (e.g., remission) and outcome 2 its complement. In a set of case-control studies, group 1 and group 2 may refer to the group of cases and the group of controls, with outcome 1 denoting, for example, exposure to some risk factor and outcome 2 non-exposure. The 2x2 table may also be the result of cross-sectional (i.e., multinomial) sampling, so that none of the table margins (except the total sample size n1i+n2i) are fixed through the study design.

Depending on the type of design (sampling method), a meta-analysis of 2x2 table data can be based on one of several different outcome measures, including the odds ratio, the relative risk (also called risk ratio), the risk difference, and the arcsine transformed risk difference. The phi coefficient, Yule's Q, and Yule's Y are additional measures of association for 2x2 table data (but they may not be the most ideal choices for meta-analyses of such data). For these measures, one needs to supply either ai, bi, ci, and di or alternatively ai, ci, n1i, and n2i. The options for the measure argument are then:

"RR": The log relative risk is equal to the log of (ai/n1i)/(ci/n2i).
"OR": The log odds ratio is equal to the log of (ai*di)/(bi*ci).
"RD": The risk difference is equal to (ai/n1i)-(ci/n2i).
"AS": The arcsine transformed risk difference is equal to asin(sqrt(ai/n1i)) - asin(sqrt(ci/n2i)). See Ruecker et al. (2009) for a discussion of this and other outcome measures for 2x2 table data.
"PETO": The log odds ratio estimated with Peto's method (see Yusuf et al., 1985) is equal to (ai-si*n1i/ni)/((si*ti*n1i*n2i)/(ni^2*(ni-1))), where si=ai+ci, ti=bi+di, and ni=n1i+n2i. Note that this measure technically assumes that the true odds ratio is equal to 1 in all tables.
"PHI": The phi coefficient is equal to (ai*di-bi*ci)/sqrt(n1i*n2i*si*ti), where si=ai+ci and ti=bi+di.
"YUQ": Yule's Q is equal to (oi-1)/(oi+1), where oi is the odds ratio.
"YUY": Yule's Y is equal to (sqrt(oi)-1)/(sqrt(oi)+1), where oi is the odds ratio.

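As a rough illustration of the formulas listed above, the following base R sketch computes each 2x2 measure by hand for a single table. The counts are made up for illustration, and the code is not taken from escalc itself; it only transcribes the expressions given in the list.

```r
# Hypothetical counts for a single 2x2 table (not from any real study)
ai <- 10; bi <- 90   # group 1: outcome 1, outcome 2
ci <- 20; di <- 80   # group 2: outcome 1, outcome 2

n1i <- ai + bi       # row total, group 1
n2i <- ci + di       # row total, group 2
si  <- ai + ci       # column total, outcome 1
ti  <- bi + di       # column total, outcome 2
ni  <- n1i + n2i     # total sample size

oi <- (ai * di) / (bi * ci)   # odds ratio, used for OR, YUQ, and YUY

yi <- c(
  RR   = log((ai / n1i) / (ci / n2i)),
  OR   = log(oi),
  RD   = (ai / n1i) - (ci / n2i),
  AS   = asin(sqrt(ai / n1i)) - asin(sqrt(ci / n2i)),
  PETO = (ai - si * n1i / ni) / ((si * ti * n1i * n2i) / (ni^2 * (ni - 1))),
  PHI  = (ai * di - bi * ci) / sqrt(n1i * n2i * si * ti),
  YUQ  = (oi - 1) / (oi + 1),
  YUY  = (sqrt(oi) - 1) / (sqrt(oi) + 1)
)
round(yi, 4)
```

All measures are negative here because the event is less frequent in group 1 than in group 2.
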
Note that the log is taken of the relative risk and the odds ratio, which makes these outcome measures symmetric around 0 and helps to make the distribution of these outcome measures closer to normal.

Cell entries with a zero can be problematic, especially for the relative risk and the odds ratio. Adding a small constant to the cells of the 2x2 tables is a common solution to this problem. When to="all", the value of add is added to each cell of the 2x2 tables in all k tables. When to="only0", the value of add is added to each cell of the 2x2 tables only in those tables with at least one cell equal to 0. When to="if0all", the value of add is added to each cell of the 2x2 tables in all k tables, but only when there is at least one 2x2 table with a zero entry. Setting to="none" or add=0 has the same effect: No adjustment to the observed table frequencies is made. Depending on the outcome measure and the data, this may lead to division by zero inside of the function (when this occurs, the resulting Inf value is recoded to NA).

Raw and Standardized Mean Difference

The raw mean difference and standardized mean difference are useful effect size measures when meta-analyzing a set of studies comparing two experimental groups (e.g., treatment and control groups) or two naturally occurring groups (e.g., men and women) with respect to some quantitative (and ideally normally distributed) dependent variable. For these outcome measures, m1i and m2i specify the means of the two groups, sd1i and sd2i the standard deviations of the scores in the two groups, and n1i and n2i the sample sizes of the two groups.

"MD": The raw mean difference is equal to m1i-m2i.
"SMD": The standardized mean difference is equal to (m1i-m2i)/spi, where spi is the pooled standard deviation of the two groups (which is calculated inside of the function). The standardized mean difference is automatically corrected for its slight positive bias within the function (see Hedges & Olkin, 1985). When vtype="LS", the sampling variances are calculated based on the large sample approximation. Alternatively, the unbiased estimates of the sampling variances can be obtained with vtype="UB".

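The following base R sketch shows one way the "MD" and "SMD" values could be computed by hand. It rests on two assumptions that this page does not spell out: spi is taken to be the usual pooled standard deviation with n1i+n2i-2 degrees of freedom, and the bias correction uses the common approximation 1 - 3/(4*dfi - 1). The function itself may use an exact correction factor, so results can differ slightly. The input values and the names dfi, di_raw, and ji are made up for illustration.

```r
# Hypothetical summary statistics for one study (illustrative values only)
m1i <- 10; sd1i <- 2; n1i <- 20   # group 1: mean, SD, sample size
m2i <- 8;  sd2i <- 2; n2i <- 20   # group 2: mean, SD, sample size

# "MD": raw mean difference
md <- m1i - m2i

# Pooled standard deviation (assumed form: weighted by n - 1 in each group)
dfi <- n1i + n2i - 2
spi <- sqrt(((n1i - 1) * sd1i^2 + (n2i - 1) * sd2i^2) / dfi)

# Uncorrected standardized mean difference
di_raw <- md / spi

# Approximate small-sample bias correction (an assumption, see lead-in)
ji <- 1 - 3 / (4 * dfi - 1)

# "SMD": bias-corrected standardized mean difference
smd <- ji * di_raw

c(MD = md, spi = spi, SMD = smd)
```

The correction factor is slightly below 1, so the corrected value is pulled toward zero, which is consistent with the slight positive bias mentioned above.
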
Raw and Transformed Correlation Coefficient

Another frequently used outcome measure in meta-analyses is the correlation coefficient, which is used to measure the strength of the (linear) relationship between two quantitative variables. Here, one needs to specify ri, the vector with the raw correlation coefficients, and ni, the corresponding sample sizes.

"COR": The raw correlation coefficient is simply equal to ri as supplied to the function. When vtype="LS", the sampling variances are calculated based on the large sample approximation. Alternatively, an approximation to the unbiased estimates of the sampling variances can be obtained with vtype="UB" (see Hedges, 1989).
"UCOR": The unbiased estimate of the correlation coefficient is obtained by correcting the raw correlation coefficient for its slight negative bias (based on equation 2.7 in Olkin & Pratt, 1958). Again, vtype="LS" and vtype="UB" can be used to choose between the large sample approximation or approximately unbiased estimates of the sampling variances.
"ZCOR": Fisher's r-to-z transformation is a variance stabilizing transformation for correlation coefficients with the added benefit of also being a rather effective normalizing transformation (Fisher, 1921). The Fisher's r-to-z transformed correlation coefficient is equal to 1/2*log((1+ri)/(1-ri)).

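As a small check on the "ZCOR" formula above, this base R sketch applies the r-to-z transformation to a few made-up correlations. It also shows that the expression coincides with the built-in atanh() and that tanh() maps the values back to the correlation scale. The back-transformation is an addition for illustration and is not described on this page.

```r
# Hypothetical raw correlations (illustrative values only)
ri <- c(0.10, 0.35, 0.60)

# Fisher's r-to-z transformation as given in the text
zi <- 1/2 * log((1 + ri) / (1 - ri))

# The same transformation is available in base R as atanh()
all.equal(zi, atanh(ri))

# tanh() is the inverse, returning values on the correlation scale
ri_back <- tanh(zi)
round(cbind(ri, zi, ri_back), 4)
```
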
Proportions and Transformations Thereof

When the studies provide data for single groups with respect to a dichotomous dependent variable, then the raw proportion, the logit transformed proportion, the arcsine transformed proportion, and the Freeman-Tukey double arcsine transformed proportion are useful outcome measures (the log transformed proportion is also a possibility, but not frequently used in meta-analyses). Here, one needs to specify xi and ni, denoting the number of individuals experiencing the event of interest and the total number of individuals, respectively. Instead of specifying ni, one can use mi to specify the number of individuals that do not experience the event of interest.

"PR": The raw proportion is equal to xi/ni.
"PLN": The log transformed proportion is equal to the log of xi/ni.
"PLO": The logit transformed proportion is equal to the log of xi/(ni-xi).
"PAS": The arcsine transformation is a variance stabilizing transformation for proportions and is equal to asin(sqrt(xi/ni)).
"PFT": Yet another variance stabilizing transformation for proportions was suggested by Freeman & Tukey (1950). The Freeman-Tukey double arcsine transformed proportion is equal to 1/2*(asin(sqrt(xi/(ni+1))) + asin(sqrt((xi+1)/(ni+1)))).

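The proportion-based measures listed above can be reproduced by hand in base R. The sketch below uses made-up counts and simply transcribes the expressions from the list; it is not the internal code of escalc.

```r
# Hypothetical single-group data (illustrative values only)
xi <- 5     # individuals experiencing the event of interest
ni <- 20    # total number of individuals
mi <- ni - xi   # individuals not experiencing the event

yi <- c(
  PR  = xi / ni,
  PLN = log(xi / ni),
  PLO = log(xi / (ni - xi)),
  PAS = asin(sqrt(xi / ni)),
  PFT = 1/2 * (asin(sqrt(xi / (ni + 1))) + asin(sqrt((xi + 1) / (ni + 1))))
)
round(yi, 4)
```

With these counts, "PAS" and "PFT" are close to each other, which is expected because both are arcsine-type transformations of roughly the same proportion.
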
Again, zero cell entries can be problematic. When to="all", the value of add is added to xi and mi in all k studies. When to="only0", the value of add is added only for studies where xi or mi is equal to 0. When to="if0all", the value of add is added in all k studies, but only when there is at least one study with a zero value for xi or mi. Setting to="none" or add=0 again means that no adjustment to the observed values is made.

Formula Interface

The formula interface works as follows. As above, the argument measure is a character string specifying which outcome measure should be calculated. The formula argument is then used to specify the data structure as a multipart formula together with the weights argument for the group sizes or cell frequencies. The data argument can be used to specify a data frame containing the variables in the formula and the weights variable. The add, to, and vtype arguments work as described above. The formula argument takes the form outcome ~ group | study, where group is a factor specifying the group and study is a factor identifying the study.

Effect Size and Outcome Measures for 2x2 Table Data

For 2x2 table data, group is a two-level factor specifying the rows of the tables and outcome is a two-level factor specifying the columns of the tables (the two possible outcomes). The weights argument is then used to specify the frequencies in the various cells.

Raw and Standardized Mean Difference

For these outcome measures, group is a two-level factor specifying the group and the left-hand side of the formula is composed of two parts, with the first variable for the means and the second variable for the standard deviations (i.e., means + sds ~ group | study). The weights argument is then used to specify the sample sizes in the groups.

Raw and Transformed Correlation Coefficient

For these outcome measures, group is a one-level factor and outcome is used to specify the observed correlations. The weights argument is again used to specify the sample sizes.

Proportions and Transformations Thereof

For these outcome measures, group is a one-level factor and outcome is a two-level factor specifying the columns of the tables (the two possible outcomes). The weights argument is then used to specify the frequencies in the various cells.
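
The zero-cell rules for the add and to arguments described for 2x2 tables in the default interface can be sketched in base R as follows. The helper adjust_cells is hypothetical and not part of the package; it only mirrors the four to options as they are worded on this page, using made-up counts.

```r
# Hypothetical helper (not part of the package): apply the add/to rules
# to the cells of k 2x2 tables, one table per row of the result.
adjust_cells <- function(ai, bi, ci, di, add = 1/2, to = "only0") {
  cells <- cbind(ai = ai, bi = bi, ci = ci, di = di)
  has_zero <- rowSums(cells == 0) > 0      # tables with at least one zero cell
  adjust <- switch(to,
    all    = rep(TRUE, nrow(cells)),           # every table
    only0  = has_zero,                         # only tables with a zero cell
    if0all = rep(any(has_zero), nrow(cells)),  # every table if any has a zero
    none   = rep(FALSE, nrow(cells)),          # no adjustment
    stop("unknown value for 'to'")
  )
  # 'adjust' has one entry per table and is recycled down each column,
  # so all four cells of a selected table receive the constant
  cells + add * adjust
}

# Two made-up tables; only the first one contains a zero cell
ai <- c(0, 5); bi <- c(10, 5); ci <- c(3, 2); di <- c(7, 8)

res_only0  <- adjust_cells(ai, bi, ci, di, to = "only0")
res_if0all <- adjust_cells(ai, bi, ci, di, to = "if0all")
res_none   <- adjust_cells(ai, bi, ci, di, to = "none")
res_only0
```

With to="only0" only the first table is shifted by 1/2, whereas to="if0all" shifts both tables because at least one of them contains a zero.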

References

Cooper, H. C., Hedges, L. V. & Valentine, J. C. (Eds.) (2009). The handbook of research synthesis and meta-analysis (2nd ed.). New York: Russell Sage Foundation.

Fisher, R. A. (1921). On the probable error of a coefficient of correlation deduced from a small sample. Metron, 1, 1-32.

Freeman, M. F. & Tukey, J. W. (1950). Transformations related to the angular and the square root. Annals of Mathematical Statistics, 21, 607-611.

Hedges, L. V. (1989). An unbiased correction for sampling error in validity generalization studies. Journal of Applied Psychology, 74, 469-477.

Hedges, L. V. & Olkin, I. (1985). Statistical methods for meta-analysis. San Diego, CA: Academic Press.

Olkin, I. & Pratt, J. W. (1958). Unbiased estimation of certain correlation coefficients. Annals of Mathematical Statistics, 29, 201-211.

Ruecker, G., Schwarzer, G., Carpenter, J. & Olkin, I. (2009). Why add anything to nothing? The arcsine difference as a measure of treatment effect in meta-analysis with zero cells. Statistics in Medicine, 28, 721-738.

Viechtbauer, W. (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1-48. http://www.jstatsoft.org/v36/i03/.

Yusuf, S., Peto, R., Lewis, J., Collins, R. & Sleight, P. (1985). Beta blockade during and after myocardial infarction: An overview of the randomized trials. Progress in Cardiovascular Diseases, 27, 335-371.

Examples

### load the metafor package and the BCG vaccine data
library(metafor)
data(dat.bcg)

### calculate log relative risks and corresponding sampling variances
dat <- escalc(measure="RR", ai=tpos, bi=tneg, ci=cpos, di=cneg, 
              data=dat.bcg, append=TRUE)
dat

### using formula interface (first rearrange data into required format)
k <- length(dat.bcg$trial)
dat.fm <- data.frame(study=factor(rep(1:k, each=4)))
dat.fm$grp  <- factor(rep(c("T","T","C","C"), k), levels=c("T", "C"))
dat.fm$out  <- factor(rep(c("+","-","+","-"), k), levels=c("+", "-"))
dat.fm$freq <- with(dat.bcg, c(rbind(tpos, tneg, cpos, cneg)))
dat.fm

escalc(out ~ grp | study, weights=freq, data=dat.fm, measure="RR")
