dplyr (version 0.1.1)

do: Apply a function to a tbl

Description

This is a general purpose complement to the specialised manipulation functions filter, select, mutate, summarise and arrange.

Usage

do(.data, .f, ...)

# S3 method for tbl_sql
do(.data, .f, ..., .chunk_size = 10000L)

Arguments

.data
a tbl
.f
a function to apply to each piece. The first unnamed argument supplied to .f will be a data frame.
...
other arguments passed on to .f
.chunk_size
The size of each chunk to pull into R. If this number is too big, the process will be slow because R has to allocate and free a lot of memory. If it's too small, it will be slow because of the overhead of talking to the database.
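
The way .f and ... interact can be illustrated with a rough base R sketch of the split-apply pattern. This is not dplyr's implementation: apply_by_group is a hypothetical helper introduced only for illustration, and the built-in mtcars data stands in for a grouped tbl.

```r
# Base R sketch of the split-apply pattern; not dplyr's implementation.
# apply_by_group is a hypothetical helper name used only for illustration.
apply_by_group <- function(data, group_col, .f, ...) {
  pieces <- split(data, data[[group_col]])  # one data frame per group
  lapply(pieces, .f, ...)                   # each piece is the first unnamed argument to .f
}

# Row counts per cylinder group in the built-in mtcars data
sizes <- apply_by_group(mtcars, "cyl", nrow)

# Extra named arguments are forwarded to .f, here the formula for lm();
# the data frame piece then fills lm()'s next unmatched argument, data
mods <- apply_by_group(mtcars, "cyl", lm, formula = mpg ~ wt)
sapply(mods, coef)
```

Because formula is supplied by name, each data frame piece is matched to the data argument of lm(), which is the same mechanism the lm example in this topic relies on.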

Examples

if (require("hflights")) {
by_dest <- group_by(hflights, Dest)
do(by_dest, nrow)
# Inefficient version of
group_size(by_dest)

# You can use it to do any arbitrary computation, like fitting a linear
# model. Let's explore how carrier arrival delays vary over the course
# of January
jan <- filter(hflights, Month == 1)
jan <- mutate(jan, date = ISOdate(Year, Month, DayofMonth))
# Group the January subset: the date column exists only in jan, not hflights
carriers <- group_by(jan, UniqueCarrier)
group_size(carriers)

# failwith(NULL, lm) returns NULL for a group if lm() raises an error
mods <- do(carriers, failwith(NULL, lm), formula = ArrDelay ~ date)
sapply(mods, coef)
}
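
The example wraps lm in failwith(NULL, lm) so that a group whose model fit raises an error yields a default value instead of stopping the whole computation. A minimal base R sketch of that idea using tryCatch follows; safe_call is a hypothetical name and is not part of dplyr, and the sketch makes no claim about how failwith itself is implemented.

```r
# Sketch of a failwith()-style wrapper built on tryCatch.
# safe_call is a hypothetical name, not a dplyr function.
safe_call <- function(default, f) {
  function(...) tryCatch(f(...), error = function(e) default)
}

safe_log <- safe_call(NA, log)
ok  <- safe_log(1)    # log(1) succeeds, so its value is returned
bad <- safe_log("a")  # log() errors on character input, so the default NA is returned
```

With a wrapper like this, a failing group contributes the default to the result list, so later steps should be prepared to handle those default entries.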