dplyr (version 0.1.1)

chain: Chain together multiple operations.

Description

The downside of the functional nature of dplyr is that when you combine multiple data manipulation operations, you have to read from the inside out, and the arguments may end up far from the function call they belong to. These functions provide an alternative way of calling dplyr (and other data manipulation) functions that you can read from left to right.

Usage

chain(..., env = parent.frame())

chain_q(calls, env = parent.frame())

x %.% y

Arguments

x,y
A dataset and function to apply to it
...,calls
A sequence of data transformations, starting with a dataset. The first argument of each call should be omitted: the value of the previous step will be substituted in automatically. Use chain and ... when working interactively; use chain_q and calls when calling from another function.
env
Environment in which to evaluate expressions. In ordinary operation you should not need to set this parameter.
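
The difference between passing calls through ... and passing them as a list of quoted calls can be illustrated in base R. The following is a minimal sketch only: chain_sketch and the .prev placeholder are hypothetical names introduced for illustration, and this is not dplyr's implementation of chain_q. It assumes each element after the first is a quoted call with its first argument omitted.

```r
# Hypothetical helper (not dplyr's chain_q): evaluate a list of quoted calls
# in order, inserting the previous result as the first argument of each call.
chain_sketch <- function(calls, env = parent.frame()) {
  result <- eval(calls[[1]], env)            # the first element is the dataset
  for (step in calls[-1]) {
    step_env <- new.env(parent = env)        # keep the caller's variables visible
    step_env$.prev <- result                 # previous value, under a placeholder name
    # Rebuild f(y) as f(.prev, y)
    new_call <- as.call(c(list(step[[1]], quote(.prev)), as.list(step)[-1]))
    result <- eval(new_call, step_env)
  }
  result
}

# Because the steps are an ordinary list, another function can build them.
steps <- list(quote(mtcars), quote(subset(cyl == 4)), quote(nrow()))
n_four_cyl <- chain_sketch(steps)
```

Since the steps are plain quoted calls, they can be assembled, reordered, or extended by code before being evaluated, which is the situation the calls argument is described for.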

Details

The functions work via simple substitution so that chain(x, f(y)) or x %.% f(y) is translated into f(x, y).
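
The substitution described above can be sketched in base R. This is an illustration under stated assumptions, not dplyr's source: the operator name %then% and the placeholder .lhs are hypothetical, and the right-hand side is assumed to be a function call.

```r
# Hypothetical operator (not part of dplyr): rewrite x %then% f(y) as f(x, y).
`%then%` <- function(x, y) {
  caller <- parent.frame()
  rhs <- substitute(y)                       # capture f(y) without evaluating it
  stopifnot(is.call(rhs))
  env <- new.env(parent = caller)
  env$.lhs <- x                              # left-hand value, under a placeholder name
  # Build f(.lhs, y): function, new first argument, then the original arguments
  new_call <- as.call(c(list(rhs[[1]], quote(.lhs)), as.list(rhs)[-1]))
  eval(new_call, env)
}

# Reads left to right; equivalent to head(sort(c(5, 3, 8, 1), decreasing = TRUE), 2)
top_two <- c(5, 3, 8, 1) %then% sort(decreasing = TRUE) %then% head(2)
top_two
# 8 5
```

Binding the left-hand value to a placeholder, rather than splicing the value itself into the call, keeps the rewritten call small and lets functions that capture their arguments see a simple name.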

Examples

library(dplyr)  # chain() and %.% as documented here for dplyr 0.1.1
if (require("hflights")) {
# If you're performing many operations, you can do them step by step
a1 <- group_by(hflights, Year, Month, DayofMonth)
a2 <- select(a1, Year:DayofMonth, ArrDelay, DepDelay)
a3 <- summarise(a2,
  arr = mean(ArrDelay, na.rm = TRUE),
  dep = mean(DepDelay, na.rm = TRUE))
a4 <- filter(a3, arr > 30 | dep > 30)

# If you don't want to save the intermediate results, you need to
# wrap the functions:
filter(
  summarise(
    select(
      group_by(hflights, Year, Month, DayofMonth),
      Year:DayofMonth, ArrDelay, DepDelay
    ),
    arr = mean(ArrDelay, na.rm = TRUE),
    dep = mean(DepDelay, na.rm = TRUE)
  ),
  arr > 30 | dep > 30
)

# This is difficult to read because the order of the operations is from
# inside to out, and the arguments are a long way away from the function.
# Alternatively you can use chain or %.% to sequence the operations
# linearly:

hflights %.%
  group_by(Year, Month, DayofMonth) %.%
  select(Year:DayofMonth, ArrDelay, DepDelay) %.%
  summarise(
    arr = mean(ArrDelay, na.rm = TRUE),
    dep = mean(DepDelay, na.rm = TRUE)
  ) %.%
  filter(arr > 30 | dep > 30)

chain(
  hflights,
  group_by(Year, Month, DayofMonth),
  select(Year:DayofMonth, ArrDelay, DepDelay),
  summarise(
    arr = mean(ArrDelay, na.rm = TRUE),
    dep = mean(DepDelay, na.rm = TRUE)
  ),
  filter(arr > 30 | dep > 30)
)
}