shuffle_hierarchy: Shuffle multi-column hierarchy of groups

Description

Lifecycle: experimental.

Shuffles a tree (hierarchy) of groups, one column at a time. The levels in the last ("leaf") column are shuffled first, then the levels in the second-last column, and so on toward the first column. Rows that belong to the same group are kept together, i.e. ordered sequentially.
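
To make "shuffling a hierarchy" concrete, here is a minimal base-R sketch of the general idea. It is not the rearrr implementation: the helper name `shuffle_hierarchy_sketch` is hypothetical, it draws its random numbers in a different order than the package, and it will not reproduce the package's output for a given seed. It only illustrates that groups are moved as whole blocks at every level.

```r
# Hypothetical helper: a base-R sketch of a hierarchical shuffle.
# NOT the rearrr implementation; results will differ from shuffle_hierarchy().
shuffle_hierarchy_sketch <- function(data, group_cols) {
  keys <- lapply(seq_along(group_cols), function(i) {
    # Identify each group by its full path from the first column down to level i
    path <- interaction(data[group_cols[seq_len(i)]], drop = TRUE)
    # Draw one random rank per group and broadcast it to the group's rows
    ranks <- sample.int(nlevels(path))
    ranks[as.integer(path)]
  })
  # Sort by the first-level rank, then by each deeper level;
  # rows with identical ranks keep their original relative order
  data[do.call(order, keys), , drop = FALSE]
}

df <- data.frame(
  a = rep(1:4, each = 4),
  b = rep(1:8, each = 2),
  c = 1:16
)

set.seed(2)
shuffled <- shuffle_hierarchy_sketch(df, group_cols = c("a", "b"))

# Each level of 'a' and of 'b' should form exactly one contiguous run
length(rle(shuffled$a)$values) == length(unique(df$a))
length(rle(shuffled$b)$values) == length(unique(df$b))
```

The sketch sorts on one random rank per group per level, which is one simple way to keep every group contiguous while randomizing the order of groups inside their parent.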

Usage

shuffle_hierarchy(
  data,
  group_cols,
  cols_to_shuffle = group_cols,
  leaf_has_groups = TRUE
)

Value

The shuffled data.frame (tibble).

Arguments

data

data.frame.

group_cols

Names of columns making up the group hierarchy. The last column is the leaf and is shuffled first (if also in `cols_to_shuffle`).

cols_to_shuffle

Names of columns to shuffle hierarchically. By default, all the `group_cols` are shuffled.

leaf_has_groups

Whether the leaf column contains groups or values. (Logical)

When the elements are group identifiers, they are ordered sequentially and shuffled together.

When the elements are values, they are simply shuffled.
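
The difference between the two treatments of the leaf column can be sketched in base R on a single vector. This is an illustration under assumptions, not the package's code: the vector `leaf` and the variable names are made up for the example, and it only looks at the leaf level inside one parent group.

```r
# Hypothetical leaf column inside a single parent group
leaf <- c(1, 1, 2, 2, 3, 3)

set.seed(1)

# Treated as group identifiers: draw one random rank per distinct value,
# so equal values move together and stay next to each other
group_ids <- match(leaf, unique(leaf))
group_ranks <- sample.int(length(unique(leaf)))
as_groups <- leaf[order(group_ranks[group_ids])]

# Treated as plain values: permute the elements independently,
# so equal values may end up apart
as_values <- leaf[sample.int(length(leaf))]

# Equal values always form runs of length 2 in the grouped version
rle(as_groups)$lengths
```

In the grouped version only the order of the blocks is random, while the plain permutation gives no guarantee about where equal values end up.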

Author

Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk

Examples

# Attach packages
library(rearrr)
library(dplyr)

df <- data.frame(
  'a' = rep(1:4, each = 4),
  'b' = rep(1:8, each = 2),
  'c' = 1:16
)

# Set seed for reproducibility
set.seed(2)

# Shuffle all columns
shuffle_hierarchy(df, group_cols = c('a', 'b', 'c'))

# Don't shuffle 'b' but keep grouping by it
# So 'c' will be shuffled within each group in 'b'
shuffle_hierarchy(
  data = df,
  group_cols = c('a', 'b', 'c'),
  cols_to_shuffle = c('a', 'c')
)

# Shuffle 'b' as if it's not a group column
# so elements are independent within their group
# (i.e. same-valued elements are not necessarily ordered sequentially)
shuffle_hierarchy(df, group_cols = c('a', 'b'), leaf_has_groups = FALSE)