Reshape: Reshape Wide Data Into a Semi-long Form

Description

The stats::reshape() function in base R is very handy when you want a semi-long (or semi-wide) data.frame. However, base R's reshape() has problems with "unbalanced" panel data, for instance data where one variable was measured at three points in time and another only twice.

Usage

Reshape(data, id.vars = NULL, var.stubs, sep = ".", rm.rownames, ...)

Arguments

data

The source data.frame.

id.vars

The variables that serve as unique identifiers. Defaults to NULL, in which case all names that are not identified as variable groups are used as the identifiers.

var.stubs

The prefixes of the variable groups.

sep

The character that separates the "variable name" from the "times" in the wide data.frame.

rm.rownames

Ignored as data.tables do not have rownames anyway.

...

Further arguments to NoSep() in case the separator is of a different form.
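
To make the roles of var.stubs, sep, and the id.vars = NULL default more concrete, here is a minimal base R sketch of how wide column names can be split into a stub and a time. It only illustrates the naming convention and is not the package's implementation; the object names (wide_names, is_measure, layout, and so on) are made up for this illustration.

```r
# Column names taken from the example data on this page
wide_names <- c("id_1", "id_2", "varA.1", "varA.2", "varA.3",
                "varB.2", "varB.3", "varC.3")
stubs <- c("varA", "varB", "varC")  # plays the role of var.stubs
sep <- "."                          # plays the role of sep

# A column counts as a measurement if it starts with "<stub><sep>"
is_measure <- Reduce(`|`, lapply(stubs, function(s) {
  startsWith(wide_names, paste0(s, sep))
}))
measure_names <- wide_names[is_measure]
id_guess <- wide_names[!is_measure]  # what is left over when id.vars = NULL

# Split each measurement name into stub and time
# (fixed = TRUE so that "." is matched literally, not as a regex)
parts <- do.call(rbind, strsplit(measure_names, sep, fixed = TRUE))
layout <- table(stub = parts[, 1], time = parts[, 2])

id_guess
layout
```

The zero cells in layout are the stub and time combinations that are absent from the wide data, which is what makes these data "unbalanced".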

Value

A "long" data.table of the reshaped data that retains the attributes added by base R's reshape function.

Details

This function was written to overcome the limitation of stats::reshape() with unbalanced data, but it is also appropriate for basic wide-to-long reshaping tasks.
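
The limitation can be reproduced with base R alone. The sketch below uses a small made-up data.frame (wide) in which varB has no measurement at time 1, shows that stats::reshape() signals an error on it, and then works around the problem by padding the missing column with NA before reshaping. The padding step illustrates the general idea only; it is an assumption for this example, not a description of how Reshape() works internally.

```r
# Hypothetical unbalanced data: varA measured twice, varB only once
wide <- data.frame(id = 1:3,
                   varA.1 = c("a", "b", "c"),
                   varA.2 = c("d", "e", "f"),
                   varB.2 = c(10, 20, 30))

# stats::reshape() signals an error because the groups have unequal lengths
unbalanced_try <- tryCatch(
  reshape(wide, direction = "long", idvar = "id",
          varying = names(wide)[-1], sep = "."),
  error = function(e) e
)
inherits(unbalanced_try, "error")

# Workaround: add the missing stub/time columns, filled with NA
stubs <- c("varA", "varB")
times <- 1:2
wanted <- as.vector(outer(stubs, times, paste, sep = "."))
padded <- wide
padded[setdiff(wanted, names(wide))] <- NA

# The padded data are balanced, so reshape() can now handle them
long <- reshape(padded, direction = "long", idvar = "id",
                varying = wanted, sep = ".")
long
```

In the result, varB is NA for every row with time equal to 1, since no such measurement existed in the wide data.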

Related functions like utils::stack() in base R and reshape2::melt() in "reshape2" are also very handy when you want a "long" reshaping of data, but they produce a fully long structure (a single value column plus a column naming the source variable), not the "semi-wide" format that reshape() produces. data.table::melt() can produce output like reshape(), but it also expects an equal number of measurements for each variable.
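
The difference between a fully long and a semi-long result can be seen with base R on a small balanced example. The data.frame below (wide) and the object names are invented for this illustration.

```r
# Hypothetical balanced data: x and y each measured at times 1 and 2
wide <- data.frame(id = 1:2,
                   x.1 = c(5, 6), x.2 = c(7, 8),
                   y.1 = c(0.1, 0.2), y.2 = c(0.3, 0.4))

# Fully long: every measurement ends up in one "values" column,
# with "ind" recording which wide column it came from
very_long <- stack(wide[-1])
very_long

# Semi-long: one row per id and time, one column per variable group
semi_long <- reshape(wide, direction = "long", idvar = "id",
                     varying = names(wide)[-1], sep = ".")
semi_long
```

Here very_long has eight rows and mixes x and y in a single column, while semi_long has four rows and keeps x and y as separate columns alongside id and time.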

Examples

# NOT RUN {
set.seed(1)
mydf <- data.frame(id_1 = 1:6, id_2 = c("A", "B"), varA.1 = sample(letters, 6),
                 varA.2 = sample(letters, 6), varA.3 = sample(letters, 6),
                 varB.2 = sample(10, 6), varB.3 = sample(10, 6),
                 varC.3 = rnorm(6))
mydf

## Note that these data are unbalanced: varA has three measurements,
## varB has two, and varC has one, so reshape() will not work
# }
# NOT RUN {
## Expected to fail: the variable groups have unequal numbers of measurements
reshape(mydf, direction = "long", idvar = c("id_1", "id_2"),
        varying = 3:ncol(mydf))
# }
# NOT RUN {
## The Reshape() function can handle such scenarios

Reshape(mydf, id.vars = c("id_1", "id_2"),
       var.stubs = c("varA", "varB", "varC"))

# }
