Takes levels (labels, factor levels) and their corresponding counts and "lumps" them according to the specified criterion (either n or prop): rows that meet the criterion are preserved, and the remaining rows are summarised in a single "Other" row.
lump(
  levels,
  count,
  n,
  prop,
  other_level = "Other",
  ties.method = c("min", "average", "first", "last", "random", "max")
)
levels: Vector of levels.
count: Vector of corresponding counts.
n: If specified, n rows shall be preserved.
prop: If specified, rows shall be preserved if their count >= prop.
other_level: Name of the "other" level to be created from lumped rows.
ties.method: Method to apply in case of ties.
Returns a dictionary (named vector) that maps each original level to its new level.
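The behaviour described above can be approximated in base R. The following is only a minimal sketch, not the package's implementation: the name lump_sketch is hypothetical, and it assumes that n keeps the rows with the highest counts by ranking them with rank() and the chosen ties.method, and that prop is compared against each row's share of the total count.

```r
# Hypothetical illustration of the lumping logic; not the package's own code.
lump_sketch <- function(levels, count, n, prop, other_level = "Other",
                        ties.method = c("min", "average", "first",
                                        "last", "random", "max")) {
  ties.method <- match.arg(ties.method)
  keep <- rep(TRUE, length(levels))
  if (!missing(n)) {
    # Assumption: rank counts in descending order and keep ranks 1..n.
    keep <- keep & rank(-count, ties.method = ties.method) <= n
  }
  if (!missing(prop)) {
    # Assumption: prop is a threshold on each row's share of the total count.
    keep <- keep & (count / sum(count)) >= prop
  }
  # Preserved levels map to themselves, all others to other_level.
  new_levels <- ifelse(keep, as.character(levels), other_level)
  setNames(new_levels, levels)
}

lv  <- c("a", "b", "c", "d")
cnt <- c(10, 5, 5, 1)

# With "min", both rows tied for second place are preserved.
res_min <- lump_sketch(lv, cnt, n = 2)
# With "first", the tie is broken by position, so only "a" and "b" are preserved.
res_first <- lump_sketch(lv, cnt, n = 2, ties.method = "first")
```

Under these assumptions, res_min maps "d" to "Other" and leaves the other three levels unchanged, while res_first maps both "c" and "d" to "Other". This shows why the choice of ties.method matters whenever several rows share the count at the cut-off.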