sjmisc (version 2.3.0)

rec: Recode variables

Description

Recodes the categories / values of a variable x into new category values.

Usage

rec(x, ..., recodes, as.num = TRUE, var.label = NULL, val.labels = NULL, suffix = "_r")

Arguments

x
A vector or data frame.
...
Optional, unquoted names of variables. Required if x is a data frame (not a vector) and only selected variables from x should be processed. You may also use functions like : or dplyr's select_helpers. The latter must be given as a formula (i.e. beginning with ~). See 'Examples' or the package vignette.
recodes
String with recode pairs of old and new values. See 'Details' for examples. rec_pattern is a convenient function to create recode strings for grouping variables.
as.num
Logical, if TRUE, return value will be numeric, not a factor.
var.label
Optional string, to set variable label attribute for the returned variable (see set_label). If NULL (default), variable label attribute of x will be used (if present). If empty, variable label attributes will be removed.
val.labels
Optional character vector, to set value label attributes of recoded variable (see set_labels). If NULL (default), no value labels will be set. Value labels can also be directly defined in the recodes-syntax, see 'Details'.
suffix
String value, will be appended to variable (column) names of x, if x is a data frame. If x is not a data frame, this argument will be ignored. The default value to suffix column names in a data frame depends on the function call:
  • recoded variables (rec()) will be suffixed with "_r"
  • recoded variables (recode_to()) will be suffixed with "_r0"
  • dichotomized variables (dicho()) will be suffixed with "_d"
  • grouped variables (split_var()) will be suffixed with "_g"
  • grouped variables (group_var()) will be suffixed with "_gr"
  • standardized variables (std()) will be suffixed with "_z"
  • centered variables (center()) will be suffixed with "_c"

Value

x with recoded categories. If x is a data frame, only the recoded variables will be returned.
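
The return shape for data frames, together with the default suffix, can be pictured with a small base-R sketch. The data frame df and its columns are hypothetical, and the code only mimics the documented behaviour (recoded columns returned on their own, names suffixed with "_r"). It does not call rec().

```r
# Hypothetical data frame; "id" stands for a column that is not selected.
df <- data.frame(a = c(1, 2, 3, 4), b = c(4, 3, 2, 1), id = 1:4)
selected <- c("a", "b")

# Recode the selected columns (here: 1,2 -> 1 and 3,4 -> 2) ...
recoded <- as.data.frame(lapply(df[selected], function(v) ifelse(v <= 2, 1, 2)))

# ... and append the suffix to their names, mirroring suffix = "_r".
names(recoded) <- paste0(selected, "_r")

# Only the recoded columns are present; "id" is not part of the result.
print(recoded)
```

To keep the untouched columns next to the recoded ones in this sketch, combine the two with cbind(df, recoded).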

Details

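The recodes string has the following syntax, as used under 'Examples':
  • recode pairs: each pair has the form old=new, and pairs are separated by ;, e.g. "1=1;2=2;3=3;4=4"
  • multiple values: several old values for one new value are separated by commas, e.g. "1,2=1; 3,4=2"
  • value ranges: a range of old values is indicated by a colon, e.g. "15:30=1"
  • min and max: stand for the minimum and maximum value of x, e.g. "min:3=1" or "56:max=3"
  • else: all other values, e.g. "2=1; else=2"
  • copy: "else=copy" copies the remaining values unchanged
  • NA: may be used as old value, e.g. "NA=5"
  • rev: "rev" reverses the value order
  • value labels: given in square brackets after the new value, e.g. "15:30=1 [young]"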
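
To make the effect of a few recode strings concrete, the following base-R sketch reproduces what the comments under 'Examples' describe for them. It is only an illustration on a made-up vector x, not how rec() is implemented, and rec() itself may handle labels and missing values differently.

```r
# Made-up input vector for illustration (not taken from the efc data).
x <- c(1, 2, 3, 4, NA)

# "1,2=1; 3,4=2": several old values map onto one new value.
grouped <- ifelse(x %in% c(1, 2), 1, ifelse(x %in% c(3, 4), 2, NA))

# "min:3=1; 4=2": a range from the minimum of x up to 3, then a single value.
ranged <- ifelse(x <= 3, 1, ifelse(x == 4, 2, NA))

# "1=1;2=2;3=3;4=4;NA=5": missing values are replaced with 5.
na_filled <- ifelse(is.na(x), 5, x)

# "rev": reverse the value order; for consecutive integer codes this is
# the same as mirroring each value around min + max.
reversed <- min(x, na.rm = TRUE) + max(x, na.rm = TRUE) - x

print(rbind(x, grouped, ranged, na_filled, reversed))
```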

See Also

set_na for setting NA values, replace_na to replace NAs with a specific value, recode_to for re-shifting value ranges and ref_lvl to change the reference level of (numeric) factors.

Examples

library(sjmisc)
data(efc)
table(efc$e42dep, useNA = "always")

# replace NA with 5
table(rec(efc$e42dep, recodes = "1=1;2=2;3=3;4=4;NA=5"), useNA = "always")

# recode 1 to 2 into 1 and 3 to 4 into 2
table(rec(efc$e42dep, recodes = "1,2=1; 3,4=2"), useNA = "always")

# or:
# rec(efc$e42dep) <- "1,2=1; 3,4=2"
# table(efc$e42dep, useNA = "always")

# set value labels; the variable label is preserved automatically
library(dplyr)
efc %>%
  select(e42dep) %>%
  rec(recodes = "1,2=1; 3,4=2",
      val.labels = c("low dependency", "high dependency")) %>%
  str()

# works with mutate
efc %>%
  select(e42dep, e17age) %>%
  mutate(dependency_rev = rec(e42dep, recodes = "rev")) %>%
  head()

# recode 1 to 3 into 1 and 4 into 2
table(rec(efc$e42dep, recodes = "min:3=1; 4=2"), useNA = "always")

# recode 2 to 1 and all others into 2
table(rec(efc$e42dep, recodes = "2=1; else=2"), useNA = "always")

# reverse value order
table(rec(efc$e42dep, recodes = "rev"), useNA = "always")

# recode only selected values, copy remaining
table(efc$e15relat)
table(rec(efc$e15relat, recodes = "1,2,4=1; else=copy"))

# recode variables with the same categories in a data frame
head(efc[, 6:9])
head(rec(efc[, 6:9], recodes = "1=10;2=20;3=30;4=40"))

# recode multiple variables and set value labels via recode-syntax
dummy <- rec(efc, c160age, e17age,
             recodes = "15:30=1 [young]; 31:55=2 [middle]; 56:max=3 [old]")
frq(dummy)

# recode variables with same value-range
lapply(
  rec(efc, c82cop1, c83cop2, c84cop3, recodes = "1,2=1; NA=9; else=copy"),
  table,
  useNA = "always"
)

# recode character vector
dummy <- c("M", "F", "F", "X")
rec(dummy, recodes = "M=Male; F=Female; X=Refused")

# recode non-numeric factors
data(iris)
table(rec(iris, Species, recodes = "setosa=huhu; else=copy"))

# preserve tagged NAs
library(haven)
x <- labelled(c(1:3, tagged_na("a", "c", "z"), 4:1),
              c("Agreement" = 1, "Disagreement" = 4, "First" = tagged_na("c"),
                "Refused" = tagged_na("a"), "Not home" = tagged_na("z")))
# get current value labels
x
# recode 2 into 5; Values of tagged NAs are preserved
rec(x, recodes = "2=5;else=copy")
na_tag(rec(x, recodes = "2=5;else=copy"))

# use select-helpers from dplyr-package
rec(efc, ~contains("cop"), c161sex:c175empl, recodes = "0,1=0; else=1")

