to.long: Convert Data from Vector to Long Format

Description

The function can be used to convert summary data in vector format to the corresponding long format.

Usage

to.long(measure, ai, bi, ci, di, n1i, n2i, x1i, x2i, t1i, t2i,
        m1i, m2i, sd1i, sd2i, xi, mi, ri, ti, sdi, ni, data, slab, subset,
        add=1/2, to="none", drop00=FALSE, vlong=FALSE, append=TRUE, var.names)

Arguments

measure

a character string indicating the effect size or outcome measure corresponding to the summary data supplied. See below and the documentation of the escalc function for more details.

ai

vector to specify the $2 \times 2$ table frequencies (upper left cell).

bi

vector to specify the $2 \times 2$ table frequencies (upper right cell).

ci

vector to specify the $2 \times 2$ table frequencies (lower left cell).

di

vector to specify the $2 \times 2$ table frequencies (lower right cell).

n1i

vector to specify the group sizes or row totals (first group/row).

n2i

vector to specify the group sizes or row totals (second group/row).

x1i

vector to specify the number of events (first group).

x2i

vector to specify the number of events (second group).

t1i

vector to specify the total person-times (first group).

t2i

vector to specify the total person-times (second group).

m1i

vector to specify the means (first group or time point).

m2i

vector to specify the means (second group or time point).

sd1i

vector to specify the standard deviations (first group or time point).

sd2i

vector to specify the standard deviations (second group or time point).

xi

vector to specify the frequencies of the event of interest.

mi

vector to specify the frequencies of the complement of the event of interest or the group means.

ri

vector to specify the raw correlation coefficients.

ti

vector to specify the total person-times.

sdi

vector to specify the standard deviations.

ni

vector to specify the sample/group sizes.

data

optional data frame containing the variables given to the arguments above.

slab

optional vector with labels for the studies.

subset

optional vector indicating the subset of studies that should be used. This can be a logical vector or a numeric vector indicating the indices of the studies to include.

add

see the documentation of the escalc function.

to

see the documentation of the escalc function.

drop00

see the documentation of the escalc function.

vlong

optional logical indicating whether a very long format should be used (only relevant for $2 \times 2$ or $1 \times 2$ table data).

append

logical indicating whether the data frame specified via the data argument (if one has been specified) should be returned together with the long format data (default is TRUE).

var.names

optional vector with variable names (length depends on the data type). If unspecified, the function sets appropriate variable names by default.

Value

A data frame with either $k$, $2k$, or $4k$ rows and an appropriate number of columns (depending on the data type) with the data in long format. If append=TRUE and a data frame was specified via the data argument, then the data in long format are appended to the original data frame (with rows repeated an appropriate number of times).
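The row repetition mentioned above can be pictured with a small base R sketch. It does not call to.long and is not the package's implementation; the data frame dat and its columns trial and year are hypothetical and the values are made up. It only shows how $k$ study-level rows become $2k$ or $4k$ rows, so that each original row lines up with its long format rows.

```r
# Hypothetical study-level data frame with k = 3 studies (made-up values).
dat <- data.frame(trial = c("A", "B", "C"),
                  year  = c(1990, 1995, 2001))
k <- nrow(dat)

# Repeat every original row twice (two rows per study) ...
dat2 <- dat[rep(seq_len(k), each = 2), ]
rownames(dat2) <- NULL

# ... or four times (four rows per study, as in the very long format).
dat4 <- dat[rep(seq_len(k), each = 4), ]
rownames(dat4) <- NULL

nrow(dat2)  # 2 * k
nrow(dat4)  # 4 * k
```

The long format columns would then be placed next to these repeated rows, for instance with cbind().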

Details

The escalc function describes a wide variety of effect size and outcome measures that can be computed for a meta-analysis. The summary data used to compute those measures are typically contained in vectors, each element corresponding to a study. The to.long function takes this information and constructs a long format dataset from these data.

For example, in various fields (such as the health and medical sciences), the response variable measured is often dichotomous (binary), so that the data from a study comparing two different groups can be expressed in terms of a $2 \times 2$ table, such as:

          outcome 1   outcome 2   total
group 1   ai          bi          n1i
group 2   ci          di          n2i

where ai, bi, ci, and di denote the cell frequencies (i.e., the number of people falling into a particular category) and n1i and n2i the row totals (i.e., the group sizes).

The cell frequencies in $k$ such $2 \times 2$ tables can be specified via the ai, bi, ci, and di arguments (or alternatively, via the ai, ci, n1i, and n2i arguments). The function then creates the corresponding long format dataset. The measure argument should then be set equal to one of the outcome measures that can be computed based on this type of data, such as "RR", "OR", or "RD" (it is not relevant which specific measure is chosen, as long as it corresponds to the specified summary data). See the documentation of the escalc function for more details on the types of data formats available.

The long format for data of this type consists of two rows per study, a factor indicating the study (default name study), a dummy variable indicating the group (default name group, coded as 1 and 2), and two variables indicating the number of individuals experiencing outcome 1 or outcome 2 (default names out1 and out2).

Alternatively, if vlong=TRUE, then the long format consists of four rows per study, a factor indicating the study (default name study), a dummy variable indicating the group (default name group, coded as 1 and 2), a dummy variable indicating the outcome (default name outcome, coded as 1 and 2), and a variable indicating the frequency of the respective outcome (default name freq).

The default variable names can be changed via the var.names argument (must be of the appropriate length, depending on the data type). The examples below illustrate the use of this function.
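To make the two layouts concrete, the following base R sketch builds them by hand for $k = 2$ hypothetical studies. It is an illustration under stated assumptions and not the code used by to.long: the cell counts are made up, the row order is one plausible choice, and only the default variable names given above are taken from this page. It starts from ai, ci, n1i, and n2i and derives bi and di from the row totals.

```r
# Hypothetical 2x2 table data for k = 2 studies (made-up numbers).
ai  <- c(4, 6)       # group 1, outcome 1
ci  <- c(11, 29)     # group 2, outcome 1
n1i <- c(123, 306)   # size of group 1
n2i <- c(139, 303)   # size of group 2
k   <- length(ai)

# The remaining cells follow from the row totals.
bi <- n1i - ai       # group 1, outcome 2
di <- n2i - ci       # group 2, outcome 2

# Long format: two rows per study, one row per group.
long <- data.frame(
  study = factor(rep(seq_len(k), each = 2)),
  group = rep(1:2, times = k),
  out1  = as.vector(rbind(ai, ci)),
  out2  = as.vector(rbind(bi, di))
)

# Very long format: four rows per study, one row per group-by-outcome cell.
vlong <- data.frame(
  study   = factor(rep(seq_len(k), each = 4)),
  group   = rep(c(1, 1, 2, 2), times = k),
  outcome = rep(c(1, 2, 1, 2), times = k),
  freq    = as.vector(rbind(ai, bi, ci, di))
)

long
vlong
```

In both layouts the frequencies within a study add up to n1i + n2i, which is a quick check that no cell has been lost in the reshaping.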

References

Viechtbauer, W. (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1-48. http://www.jstatsoft.org/v36/i03/.

Examples

### load the metafor package and the BCG vaccine data
library(metafor)
data(dat.bcg)

### convert data to long format
dat <- to.long(measure="OR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg)
dat

### extra long format
dat <- to.long(measure="OR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg, vlong=TRUE)
dat

### load data from Hart et al. (1999)
data(dat.hart1999)

### convert data to long format
dat <- to.long(measure="IRR", x1i=x1i, x2i=x2i, t1i=t1i, t2i=t2i,
               data=dat.hart1999, var.names=c("id", "group", "events", "ptime"))
dat

### load data from Normand (1999)
data(dat.normand1999)

### convert data to long format
dat <- to.long(measure="MD", m1i=m1i, sd1i=sd1i, n1i=n1i,
               m2i=m2i, sd2i=sd2i, n2i=n2i, data=dat.normand1999,
               var.names=c("id", "group", "mean", "sd", "n"))
dat
