dplyr (version 0.5.0)

bench_compare: Evaluate, compare, benchmark operations of a set of srcs.

Description

These functions support the comparison of results and timings across multiple sources.

Usage

bench_tbls(tbls, op, ..., times = 10)

compare_tbls(tbls, op, ref = NULL, compare = equal_data_frame, ...)

eval_tbls(tbls, op)

Arguments

tbls

A list of tbls.

op

A function of a single argument, called with each element of tbls in turn.

times

For benchmarking, the number of times each operation is repeated.

ref

For checking, a data frame to test results against. If not supplied, defaults to the results from the first src.

compare

A function used to compare the results. Defaults to equal_data_frame, which ignores the order of rows and columns.

...

For compare_tbls: additional parameters passed on to the compare function.

For bench_tbls: additional benchmarks to run.
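
The compare argument is described as ignoring the order of rows and columns. As a rough illustration of what such a comparison involves, here is a minimal base R sketch. The helper same_ignoring_order() is hypothetical and is not how equal_data_frame is implemented; it assumes unique column names and columns that can be sorted with order().

```r
# Hypothetical helper (not part of dplyr): compare two data frames while
# ignoring the order of their rows and columns.
same_ignoring_order <- function(x, y) {
  x <- as.data.frame(x, stringsAsFactors = FALSE)
  y <- as.data.frame(y, stringsAsFactors = FALSE)

  # Both inputs must have the same set of columns and the same row count.
  if (!setequal(names(x), names(y)) || nrow(x) != nrow(y)) {
    return(FALSE)
  }

  # Put columns in a canonical (alphabetical) order, then sort the rows
  # by every column so that row order no longer matters.
  cols <- sort(names(x))
  normalise <- function(df) {
    df <- df[, cols, drop = FALSE]
    df <- df[do.call(order, unname(as.list(df))), , drop = FALSE]
    rownames(df) <- NULL
    df
  }

  isTRUE(all.equal(normalise(x), normalise(y)))
}

a <- data.frame(id = c(1, 2, 3), name = c("x", "y", "z"))
b <- data.frame(name = c("z", "x", "y"), id = c(3, 1, 2))
d <- data.frame(id = c(1, 2, 4), name = c("x", "y", "z"))

same_ab <- same_ignoring_order(a, b)  # same rows and columns, shuffled
same_ad <- same_ignoring_order(a, d)  # one value differs
```

A function with this shape (two data frames in, a single logical out) is the kind of thing the compare argument appears to expect, although the exact contract should be checked against equal_data_frame itself.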

Value

eval_tbls: a list of data frames.

compare_tbls: an invisible TRUE on success, otherwise an error is thrown.

bench_tbls: an object of class microbenchmark.
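
To make the return values above concrete, the following base R sketch mimics the evaluate-then-compare pattern on plain data frames. The functions eval_each() and compare_each() are hypothetical stand-ins written for illustration only; they are not dplyr's implementation, and the default comparison here uses all.equal() rather than equal_data_frame.

```r
# Hypothetical stand-in for eval_tbls(): apply op to every element and
# return a list of data frames.
eval_each <- function(tbls, op) {
  lapply(tbls, function(tbl) as.data.frame(op(tbl)))
}

# Hypothetical stand-in for compare_tbls(): evaluate op on every element,
# compare each result with ref, and stop with an error on the first mismatch.
compare_each <- function(tbls, op, ref = NULL,
                         compare = function(x, y) isTRUE(all.equal(x, y))) {
  results <- eval_each(tbls, op)
  # With no explicit reference, use the result from the first element.
  if (is.null(ref)) ref <- results[[1]]
  for (i in seq_along(results)) {
    if (!compare(results[[i]], ref)) {
      stop("Result ", i, " differs from the reference", call. = FALSE)
    }
  }
  invisible(TRUE)
}

# Two made-up "sources" holding the same data.
sources <- list(
  first  = data.frame(year = c(2009, 2010, 2010), wins = c(80, 90, 75)),
  second = data.frame(year = c(2009, 2010, 2010), wins = c(80, 90, 75))
)
op <- function(df) df[df$year == 2010, , drop = FALSE]

results <- eval_each(sources, op)  # list of two data frames
ok <- compare_each(sources, op)    # TRUE, returned invisibly

# A source with different data triggers an error instead.
bad <- c(sources, list(third = data.frame(year = 2010, wins = 1)))
failed <- inherits(try(compare_each(bad, op), silent = TRUE), "try-error")
```

Returning invisible(TRUE) keeps a successful check quiet at the console while still allowing the call to be used inside stopifnot() or a test.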

See Also

src_local for working with local data

Examples

if (require("microbenchmark") && has_lahman()) {
lahman_local <- lahman_srcs("df", "sqlite")
teams <- lapply(lahman_local, function(x) x %>% tbl("Teams"))

compare_tbls(teams, function(x) x %>% filter(yearID == 2010))
bench_tbls(teams, function(x) x %>% filter(yearID == 2010))

# You can also supply arbitrary additional arguments to bench_tbls
# if there are other operations you'd like to compare.
bench_tbls(teams, function(x) x %>% filter(yearID == 2010),
   base = subset(Lahman::Teams, yearID == 2010))

# A more complicated example using multiple tables
setup <- function(src) {
  list(
    src %>% tbl("Batting") %>% filter(stint == 1) %>% select(playerID:H),
    src %>% tbl("Master") %>% select(playerID, birthYear)
  )
}
two_tables <- lapply(lahman_local, setup)

op <- function(tbls) {
  semi_join(tbls[[1]], tbls[[2]], by = "playerID")
}
# compare_tbls(two_tables, op)
bench_tbls(two_tables, op, times = 2)

}
