tidyselect (version 1.0.0)

select_helpers: Select helpers

Description

These functions allow you to select variables based on their names.

  • starts_with(): Starts with a prefix.

  • ends_with(): Ends with a suffix.

  • contains(): Contains a literal string.

  • matches(): Matches a regular expression.

  • num_range(): Matches a numerical range like x01, x02, x03.

  • all_of(): Matches variable names in a character vector. All names must be present, otherwise an out-of-bounds error is thrown.

  • any_of(): Same as all_of(), except that no error is thrown for names that don't exist.

  • everything(): Matches all variables.

  • last_col(): Selects the last variable, possibly with an offset.

Usage

starts_with(match, ignore.case = TRUE, vars = peek_vars(fn = "starts_with"))

ends_with(match, ignore.case = TRUE, vars = peek_vars(fn = "ends_with"))

contains(match, ignore.case = TRUE, vars = peek_vars(fn = "contains"))

matches(match, ignore.case = TRUE, perl = FALSE, vars = peek_vars(fn = "matches"))

num_range(prefix, range, width = NULL, vars = peek_vars(fn = "num_range"))

all_of(x)

any_of(x, ..., vars = peek_vars(fn = "any_of"))

everything(vars = peek_vars(fn = "everything"))

last_col(offset = 0L, vars = peek_vars(fn = "last_col"))

Arguments

match

A character vector. If length > 1, the union of the matches is taken.

ignore.case

If TRUE, the default, ignores case when matching names.

vars

A character vector of variable names. When called from inside selecting functions like dplyr::select() these are automatically set to the names of the table.

perl

Should Perl-compatible regexps be used?

prefix

A prefix that starts the numeric range.

range

A sequence of integers, like 1:5.

width

Optionally, the "width" of the numeric range. For example, a width of 2 gives "01", a width of 3 gives "001", etc.

x

An index vector of names or locations.

...

These dots are for future extensions and must be empty.

offset

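Counts back from the last variable. The default, 0L, selects the last variable itself; set it to n to select the variable n positions before the last.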
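The sketch below illustrates how ignore.case, width and offset behave. It assumes the tidyselect package is installed and drives the helpers through vars_select() on plain character vectors; the vector cols is a made-up set of column names used only for illustration.

```r
library(tidyselect)

nms <- names(iris)

# ignore.case = TRUE (the default): "petal" also matches "Petal.*"
vars_select(nms, starts_with("petal"))

# ignore.case = FALSE: no name in iris starts with a lowercase "petal"
vars_select(nms, starts_with("petal", ignore.case = FALSE))

# width zero-pads the numbers before matching, so "x1" is not selected
cols <- c("x1", "x01", "x02", "x10")  # hypothetical column names
vars_select(cols, num_range("x", 1:2, width = 2))

# offset counts back from the last variable (offset = 0 is the last one)
vars_select(nms, last_col(offset = 1))
```

With offset = 1 the call should pick the second-to-last name, which for iris is Petal.Width.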

Value

An integer vector giving the positions of the matched variables.
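As a rough illustration of the return value, a helper can be called directly by supplying vars explicitly, as in the Usage signatures. This is a minimal sketch assuming tidyselect is installed; in normal use the selecting function sets vars for you.

```r
library(tidyselect)

nms <- names(iris)

# Supplying `vars` lets the helper run outside a selecting function
pos <- starts_with("Petal", vars = nms)

pos       # positions of the matching names within `nms`
nms[pos]  # the matching names themselves
```

Indexing the original names with the returned positions recovers the selected variables.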

Details

In selection context you can also use these operators:

  • "/" for taking the difference between two sets of variables.

  • ":" for selecting a range of consecutive variables.

  • "c" for selecting the union of sets of variables.

The boolean operators were more recently overloaded to operate on selections:

  • "!" for taking the complement of a set of variables.

  • "&" and "|" for selecting the intersection or the union of two sets of variables.
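One way to read the boolean operators is as set operations on the selected names. The sketch below, which assumes tidyselect is installed, compares each operator with its base R counterpart; the variable names petal, width, both, either and rest are chosen here for illustration.

```r
library(tidyselect)

nms <- names(iris)

# Two individual selections to combine
petal <- vars_select(nms, starts_with("Petal"))
width <- vars_select(nms, ends_with("Width"))

# The same selections combined with the boolean operators
both   <- vars_select(nms, starts_with("Petal") & ends_with("Width"))
either <- vars_select(nms, starts_with("Petal") | ends_with("Width"))
rest   <- vars_select(nms, !ends_with("Width"))

# Compare with base R set operations on the names
setequal(both, intersect(petal, width))
setequal(either, union(petal, width))
setequal(rest, setdiff(nms, width))
```

Each comparison should return TRUE. setequal() ignores order, so it checks only which variables are selected.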

The order of selected columns is determined by the inputs.

  • one_of(c("foo", "bar")) selects "foo" first.

  • c(starts_with("c"), starts_with("d")) selects all columns starting with "c" first, then all columns starting with "d".
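To see the ordering rule in action, the following sketch (assuming tidyselect is installed) runs the same two helpers in both orders on the column names of mtcars.

```r
library(tidyselect)

nms <- names(mtcars)

# "c" columns first, then "d" columns
c_then_d <- vars_select(nms, c(starts_with("c"), starts_with("d")))

# Swapping the inputs swaps the order of the result
d_then_c <- vars_select(nms, c(starts_with("d"), starts_with("c")))

unname(c_then_d)
unname(d_then_c)
```

Both calls select the same four columns (cyl, carb, disp, drat); only their order differs.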

Examples

# NOT RUN {
nms <- names(iris)
vars_select(nms, starts_with("Petal"))
vars_select(nms, ends_with("Width"))
vars_select(nms, contains("etal"))
vars_select(nms, matches(".t."))
vars_select(nms, Petal.Length, Petal.Width)
vars_select(nms, everything())
vars_select(nms, last_col())
vars_select(nms, last_col(offset = 2))

# With multiple matchers, the union of the matches is selected:
vars_select(nms, starts_with(c("Petal", "Sepal")))

# `!` negates a selection:
vars_select(nms, !ends_with("Width"))

# `&` and `|` take the intersection or the union of two selections:
vars_select(nms, starts_with("Petal") & ends_with("Width"))
vars_select(nms, starts_with("Petal") | ends_with("Width"))

# `/` takes the difference of two selections:
vars_select(nms, starts_with("Petal") / ends_with("Width"))

# `all_of()` selects the variables in a character vector:
vars <- c("Petal.Length", "Petal.Width")
vars_select(nms, all_of(vars))

# Whereas `all_of()` is strict, `any_of()` allows missing
# variables.
try(vars_select(nms, all_of(c("Species", "Genres"))))
vars_select(nms, any_of(c("Species", "Genres")))

# The lax variant is especially useful to make sure a variable is
# selected out:
vars_select(nms, -any_of(c("Species", "Genres")))

# The order of selected columns is determined by the inputs:
vars_select(names(mtcars), starts_with("c"), starts_with("d"))
vars_select(names(mtcars), one_of(c("carb", "mpg")))
# }