
expss (version 0.8.11)

by_groups: Aggregate dataset by grouping variable(s).

Description

Splits the data into groups, computes summary statistics for each group, and returns the result as a data.frame/data.table.

Usage

by_groups(data, ...)

Arguments

data

data for aggregation

...

Aggregation parameters. Grouping variables can be given as character/numeric values or as criteria/logical functions (see criteria). Names of variables at the top level can be unquoted (non-standard evaluation). To force standard evaluation of a parameter, surround it with parentheses. In addition to the grouping variables, you need to specify formulas with aggregation expressions, such as mean_x ~ mean(x). Instead of formulas, you can pass a single function as the last argument; it will be applied to all non-grouping columns. See examples.

Value

aggregated data.frame/data.table

Examples

# NOT RUN {
library(expss)

# compute the mean of every column for each value of Species
data(iris)
by_groups(iris, Species, mean)

# compute the mean of every numeric column
iris %>% except(Species) %>% by_groups(mean)

# compute different functions for different columns
# automatic naming
data(mtcars)
by_groups(mtcars, cyl, am, ~ mean(hp), ~ median(mpg))

# with custom names
by_groups(mtcars, cyl, am, mean_hp ~ mean(hp), median_mpg ~ median(mpg))

# variable substitution
group1 = "cyl"
statistic1 = ~ mean(hp)
by_groups(mtcars, (group1), (statistic1))

group2 = "am"
# formulas can be easily constructed from text strings
statistic2 = as.formula("~ median(mpg)") 
by_groups(mtcars, (group2), (statistic2))

by_groups(mtcars, (group1), (group2), (statistic1), (statistic2))

# }
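
For comparison, the same split-and-summarise results can be sketched in base R with aggregate(). This is not part of expss and is only a rough equivalent for cross-checking: the column order and naming that by_groups produces may differ, and the names iris_means, mean_hp, median_mpg and stats are chosen here for illustration.

```r
# Base-R sketch (no expss required): group-wise summaries with aggregate().
data(iris)
data(mtcars)

# mean of every non-grouping column for each value of Species
iris_means = aggregate(. ~ Species, data = iris, FUN = mean)

# different statistics for different columns, grouped by cyl and am
mean_hp = aggregate(hp ~ cyl + am, data = mtcars, FUN = mean)
median_mpg = aggregate(mpg ~ cyl + am, data = mtcars, FUN = median)

# rename the statistic columns (illustrative names), then combine
names(mean_hp)[3] = "mean_hp"
names(median_mpg)[3] = "median_mpg"
stats = merge(mean_hp, median_mpg, by = c("cyl", "am"))

print(iris_means)
print(stats)
```

Each aggregate() call here computes one statistic, so the per-column results are merged on the grouping variables afterwards.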
