With selecting functions like dplyr::select() or tidyr::pivot_longer(), you can refer to variables by name:
mtcars %>% select(cyl, am, vs)
#> # A tibble: 32 x 3
#>     cyl    am    vs
#>   <dbl> <dbl> <dbl>
#> 1     6     1     0
#> 2     6     1     0
#> 3     4     1     1
#> 4     6     0     1
#> # ... with 28 more rows

mtcars %>% select(mpg:disp)
#> # A tibble: 32 x 3
#>     mpg   cyl  disp
#>   <dbl> <dbl> <dbl>
#> 1  21       6   160
#> 2  21       6   160
#> 3  22.8     4   108
#> 4  21.4     6   258
#> # ... with 28 more rows
For historical reasons, it is also possible to refer to an external vector of variable names. You get the correct result, but with a note informing you that selecting with an external variable is ambiguous, because it is not clear whether you want a data frame column or an external object.
vars <- c("cyl", "am", "vs")
result <- mtcars %>% select(vars)
#> Note: Using an external vector in selections is ambiguous.
#> i Use `all_of(vars)` instead of `vars` to silence this message.
#> i See <https://tidyselect.r-lib.org/reference/faq-external-vector.html>.
#> This message is displayed once per session.
This note will become a warning in the future, and then an error. We have decided to deprecate this particular approach to using external vectors because it introduces ambiguity. Imagine that the data frame contains a column with the same name as your external variable.
some_df <- mtcars
some_df$vars <- 1:nrow(mtcars)
These are very different objects, but it isn't a problem if the context forces you to be specific about where to find vars:
vars
#> [1] "cyl" "am"  "vs"

some_df$vars
#>  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
#> [29] 29 30 31 32
In a selection context, however, the column wins:
some_df %>% select(vars)
#> # A tibble: 32 x 1
#>    vars
#>   <int>
#> 1     1
#> 2     2
#> 3     3
#> 4     4
#> # ... with 28 more rows
To make your selection code more robust and silence the message, use all_of() to force the external vector:
some_df %>% select(all_of(vars))
#> # A tibble: 32 x 3
#>     cyl    am    vs
#>   <dbl> <dbl> <dbl>
#> 1     6     1     0
#> 2     6     1     0
#> 3     4     1     1
#> 4     6     0     1
#> # ... with 28 more rows
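The same ambiguity can arise inside your own functions, where an argument name may happen to match a column in the data. The following is a minimal sketch, not an official pattern: select_cols() is a hypothetical helper invented for this illustration, and the cols column is added only to provoke the name clash. It also shows that all_of() is strict, so asking for a name that is not in the data is an error.

```r
library(dplyr)

# Hypothetical helper for illustration; not part of dplyr or tidyselect.
select_cols <- function(data, cols) {
  # all_of() marks `cols` as an external character vector, so the
  # selection never looks for a data frame column named "cols".
  data %>% select(all_of(cols))
}

some_df <- mtcars
# Add a column whose name clashes with the helper's argument name.
some_df$cols <- seq_len(nrow(mtcars))

# The external vector is used, despite the clashing column.
picked <- select_cols(some_df, c("cyl", "am", "vs"))
names(picked)

# all_of() is strict: a name that is missing from the data is an error.
failed <- tryCatch(
  {
    select_cols(some_df, c("cyl", "not_a_column"))
    FALSE
  },
  error = function(e) TRUE
)
failed
```

Here picked should contain only the three requested columns, and failed should be TRUE because not_a_column does not exist in some_df.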
For more information, or if you have comments about this, please see the GitHub issue tracking the deprecation process.