plyr (version 1.5.2)

dlply: Split data frame, apply function, and return results in a list.

Description

For each subset of a data frame, apply a function, then combine the results into a list.

Usage

dlply(.data, .variables, .fun, ..., .progress="none",
    .drop=TRUE, .parallel=FALSE)

Arguments

.data
data frame to be processed
.variables
variables to split data frame by, as quoted variables, a formula or character vector
.fun
function to apply to each piece
...
other arguments passed on to .fun
.progress
name of the progress bar to use, see create_progress_bar
.drop
should combinations of variables that do not appear in the data be preserved (FALSE) or dropped (TRUE, default)
.parallel
if TRUE, apply function in parallel, using parallel backend provided by foreach
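
The three accepted forms of .variables, and the way extra arguments reach .fun through ..., can be sketched as follows. The toy data frame df, its columns g and x, and the scaled_sum helper are hypothetical names introduced only for illustration.

```r
library(plyr)

# Hypothetical toy data: a grouping column and a numeric column
df <- data.frame(g = c("a", "a", "b"), x = c(1, 2, 10))

# Illustrative helper; `scale` is supplied through dlply's `...`
scaled_sum <- function(piece, scale) sum(piece$x) * scale

# The same split expressed as quoted variables, a formula, and a character vector
by_quoted  <- dlply(df, .(g), scaled_sum, scale = 2)
by_formula <- dlply(df, ~ g, scaled_sum, scale = 2)
by_char    <- dlply(df, "g", scaled_sum, scale = 2)

unlist(by_quoted)
```

All three calls should produce the same per-group values; which form to use is mostly a matter of whether the variable names are known when the code is written or are held in a character vector.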

Value

A list of results, with one element for each subset of the data frame.

Details

All plyr functions use the same split-apply-combine strategy: they split the input into simpler pieces, apply .fun to each piece, and then combine the pieces into a single data structure. This function splits data frames by variables and combines the result into a list. If there are no results, then this function will return a list of length 0 (list()).
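
A minimal sketch of the split-apply-combine steps described above, using a small hypothetical data frame (df, g and x are illustrative names, not part of plyr):

```r
library(plyr)

# Hypothetical toy data: two groups, "a" and "b"
df <- data.frame(g = c("a", "a", "b"), x = c(1, 2, 10))

# Split df by g, apply the function to each piece,
# and combine the per-group results into a list
res <- dlply(df, .(g), function(piece) sum(piece$x))

is.list(res)   # the combined result is a list
res[["a"]]     # sum of x within group "a"
res[["b"]]     # sum of x within group "b"
```

Each list element holds whatever .fun returned for that piece, so the pieces need not share a type or shape.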

dlply is similar to by except that the results are returned in a different format.
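
To see the difference in return format, the same per-group computation can be run through both functions. This is only a sketch on hypothetical toy data (df, g and x are illustrative names):

```r
library(plyr)

# Hypothetical toy data
df <- data.frame(g = c("a", "a", "b"), x = c(1, 2, 10))

group_mean <- function(piece) mean(piece$x)

# Base R: the result is an object of class "by"
by_res <- by(df, df$g, group_mean)

# plyr: the result is a list with one element per group
dl_res <- dlply(df, .(g), group_mean)

inherits(by_res, "by")
is.list(dl_res)
```

The per-group values are the same in both cases; only the container differs, which matters when the results are passed on to functions such as ldply or laply.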

References

Hadley Wickham (2011). The Split-Apply-Combine Strategy for Data Analysis. Journal of Statistical Software, 40(1), 1-29. http://www.jstatsoft.org/v40/i01/.

Examples

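library(plyr)

# Fit a linear model of RBI on years since each player's first season
linmod <- function(df) lm(rbi ~ year, data = mutate(df, year = year - min(year)))
models <- dlply(baseball, .(id), linmod)
models[[1]]

# Collect the coefficients into a data frame, one row per player
coefs <- ldply(models, coef)
with(coefs, plot(`(Intercept)`, year))
# Distribution of R-squared across the per-player models
qual <- laply(models, function(mod) summary(mod)$r.squared)
hist(qual)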
