by_groups: Aggregate dataset by grouping variable(s).

Description

Splits the data by groups, computes summary statistics for each group, and returns a data.frame/data.table. %by_groups% is the infix version of the function.
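
As a point of comparison, the split / summarise / combine idea described above can be sketched in base R with aggregate(). This is only an illustration that assumes the mean as the summary statistic; it is not how by_groups is implemented, and the class and column layout of the result may differ.

```r
# Base R sketch of "split by group, summarise each group, combine".
# aggregate() is used for comparison only; it is not by_groups().
data(iris)

# mean of every non-grouping column for each value of Species
res <- aggregate(. ~ Species, data = iris, FUN = mean)
print(res)
```

The result has one row per level of the grouping variable and one column per summarised variable.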

Usage

by_groups(data, ...)
data %by_groups% args

Arguments

data

data for aggregation

...

aggregation parameters. These should be names of variables in quotes (character strings, e.g. 'Species') and formulas with aggregation expressions, such as mean_x ~ mean(x). Instead of formulas, a single function can be given as the last argument; it will be applied to all non-grouping columns. Note that there is no non-standard evaluation by design, so quote the names of your variables or use qc.
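
To make the formula convention concrete, here is a minimal base R sketch of how a formula such as mean_x ~ mean(x) could be split into a result name (left-hand side) and an expression evaluated against the data (right-hand side). The helper eval_stat is hypothetical and not part of this package. The name it derives for a one-sided formula is an assumption made for illustration; the automatic names produced by by_groups may differ.

```r
# Hypothetical helper (not part of the documented API): evaluate one
# aggregation formula against a data.frame and return a named list.
eval_stat <- function(f, data) {
  stopifnot(inherits(f, "formula"))
  if (length(f) == 3L) {
    # two-sided formula: the left-hand side is the result name
    nm   <- deparse(f[[2]])
    expr <- f[[3]]
  } else {
    # one-sided formula: use the expression text as the name (assumption)
    expr <- f[[2]]
    nm   <- deparse(expr)
  }
  # columns of 'data' are looked up first, then the formula's environment
  value <- eval(expr, envir = data, enclos = environment(f))
  setNames(list(value), nm)
}

data(mtcars)
named   <- eval_stat(mean_hp ~ mean(hp), mtcars)
unnamed <- eval_stat(~ median(mpg), mtcars)
```

Because a formula is an ordinary R object, it can be stored in a variable and passed around without any non-standard evaluation, which fits the design note above.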

args

list. The same as ... but for the infix version %by_groups%.
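
An infix operator takes exactly two operands, which is why the aggregation parameters have to be bundled into a single list. The base R sketch below shows that c() already yields a list when one of its elements is a function or a formula, and how an infix wrapper could forward such a list to an ordinary function. The names count_args and %count_args% are hypothetical and only stand in for a real aggregation function; this is not the package's implementation.

```r
# c() returns a list when one of its elements is a function or a formula,
# so mixed arguments can be bundled with c() instead of list().
args1 <- c("Species", mean)
args2 <- c("cyl", "am", mean_hp ~ mean(hp))

# Hypothetical stand-in for an aggregation function: it only counts
# how many parameters it received after 'data'.
count_args <- function(data, ...) length(list(...))

# Hypothetical infix wrapper: unpack the list on the right-hand side
# and pass its elements as separate arguments.
`%count_args%` <- function(data, args) {
  do.call(count_args, c(list(data), args))
}

n <- iris %count_args% args1
```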

Value

aggregated data.frame/data.table

Examples

# compute the mean of every column for every value of Species
data(iris)
by_groups(iris, "Species", mean)

# compute the mean of every numeric column
by_groups(iris %except% "Species", mean)

# compute different functions for different columns
# automatic naming
data(mtcars)
by_groups(mtcars, "cyl", "am", ~ mean(hp), ~ median(mpg))

# with custom names
by_groups(mtcars, "cyl", "am", mean_hp ~ mean(hp), median_mpg ~ median(mpg))

# 'qc' usage to avoid quotes
by_groups(mtcars, qc(cyl, am), ~ mean(hp), ~ median(mpg))

# variable substitution
group1 = "cyl"
statistic1 = as.formula("~ mean(hp)")
by_groups(mtcars, group1, statistic1)

group2 = "am"
statistic2 = as.formula("~ median(mpg)")
by_groups(mtcars, group2, statistic2)

by_groups(mtcars, group1, group2, statistic1, statistic2)

# infix version
iris %by_groups% c("Species", mean)

mtcars %by_groups% c("cyl", "am", mean_hp ~ mean(hp), median_mpg ~ median(mpg))

