expss (version 0.5.5)

if_val: Change, rearrange or consolidate the values of an existing/new variable. Inspired by RECODE command from SPSS.

Description

if_val changes, rearranges or consolidates the values of an existing variable based on conditions. The design of this function is inspired by RECODE from SPSS. The sequence of recodings is provided in the form of formulas. For example, 1:2 ~ 1 means that all 1's and 2's will be replaced with 1. Each value is recoded only once. In the assignment form if_val(...) = ... values which don't meet any condition remain unchanged. In the usual form ... = if_val(...) values which don't meet any condition are replaced with NA. As a condition one can use plain values or more sophisticated logical values and functions. There are several special functions for use as criteria; for details see criteria. Simple common usage looks like: if_val(x, 1:2 ~ -1, 3 ~ 0, 4:5 ~ 1, 99 ~ NA). For more information, see details and examples.

The ifs function checks whether one or more conditions are met and returns the value that corresponds to the first TRUE condition. ifs can take the place of multiple nested ifelse statements and is much easier to read with multiple conditions. ifs works in the same manner as if_val, e. g. with formulas or with from/to notation, but its conditions must be logical and it doesn't operate on multicolumn objects.
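The recoding rules described above (each value is recoded only once, and unmatched values become NA in the usual form but stay unchanged in the assignment form) can be sketched in base R. This is only an illustration under simplifying assumptions: recode_first_match is a hypothetical helper, not part of expss, and it handles plain vectors with value-based conditions only.

```r
# Hypothetical helper, not part of expss: a base-R sketch of the recoding rules.
# `from` is a list of value vectors, `to` a list of single replacement values.
recode_first_match <- function(x, from, to, keep_unmatched = FALSE) {
  # Usual form starts from NA; the assignment form starts from the old values.
  result <- if (keep_unmatched) x else rep(NA, length(x))
  done <- rep(FALSE, length(x))        # marks values that were already recoded
  for (i in seq_along(from)) {
    hit <- !done & x %in% from[[i]]    # only the first matching rule applies
    result[hit] <- to[[i]]
    done <- done | hit
  }
  result
}

x <- c(1, 2, 3, 4, 99)
from <- list(1:2, 3, 2:4)   # the third rule overlaps the first two on purpose
to <- list(-1, 0, 1)

usual <- recode_first_match(x, from, to)
assigned <- recode_first_match(x, from, to, keep_unmatched = TRUE)
print(usual)      # -1 -1  0  1 NA
print(assigned)   # -1 -1  0  1 99
```

In this sketch the values 2 and 3 keep the result of the first rule that matched them, so the overlapping third rule only affects 4.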

Usage

if_val(x, ..., from = NULL, to = NULL)
if_val(x, from = NULL) <- value
ifs(..., from = NULL, to = NULL, default = NA)
lo
hi
copy(x)

Arguments

x
vector/matrix/data.frame/list
...
sequence of formulas which describe recodings. They are used when from/to arguments are not provided.
from
list of conditions for values which should be recoded (in the same format as LHS of formulas).
to
list of values into which old values should be recoded (in the same format as RHS of formulas).
value
list of formulas which describe recodings in the assignment form of the function, or the to list if from/to notation is used.
default
single value or vector. Default value - NA. This value will be used for elements of the result for which all conditions are FALSE/NA.
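As a rough illustration of how ifs and its default argument relate to nested ifelse calls, the base-R sketch below reproduces what the call ifs(b>3 ~ 1, a>4 ~ 7, default = 3) is documented to return. It assumes the conditions contain no NA; base ifelse returns NA for an NA test, whereas default is described above as covering both FALSE and NA.

```r
a <- 1:5
b <- 5:1

# Conditions are checked in order; the innermost "no" branch plays the
# role of `default = 3` for elements where no condition is TRUE.
nested <- ifelse(b > 3, 1, ifelse(a > 4, 7, 3))
print(nested)   # 1 1 3 3 7
```

Each further condition adds another level of nesting in the ifelse version, which is the readability cost that the flat formula list of ifs is meant to avoid.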

Value

object of same form as x with recoded values

Format

An object of class numeric of length 1.

Details

Input conditions - possible values for left hand side (LHS) of formula or element of from list:
  • vector/single value: All values in x which are equal to elements of the vector in LHS will be replaced with RHS.
  • function: Values for which the function gives TRUE will be replaced with RHS. There are some special functions for convenience - see criteria. One of the special functions is other. It means all other unrecoded values (ELSE in SPSS RECODE); they will be changed to the RHS of the formula or to the appropriate element of to.
  • logical vector/matrix/data.frame: Values for which LHS equals TRUE will be recoded. A logical vector will be recycled across all columns of x. If LHS is a matrix/data.frame then each column of this matrix/data.frame will be used for the corresponding column/element of x.
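The three kinds of conditions listed above can each be reduced to a logical mask over x. The following base-R sketch shows one way to think about that; condition_mask is a hypothetical helper, not an expss function, and it covers plain vectors only (no matrices, data.frames or the other criterion).

```r
# Hypothetical helper, not part of expss: turns one condition into a
# logical mask over a plain vector `x`.
condition_mask <- function(x, cond) {
  if (is.function(cond)) {
    mask <- cond(x)                             # function: TRUE where it says so
  } else if (is.logical(cond)) {
    mask <- rep(cond, length.out = length(x))   # logical vector: used directly
  } else {
    mask <- x %in% cond                         # plain values: match by equality
  }
  mask & !is.na(mask)                           # treat NA as "no match"
}

x <- c(1, 5, 8, 12)
by_value <- condition_mask(x, c(1, 5))
by_function <- condition_mask(x, function(v) v > 6)
by_logical <- condition_mask(x, x > 4 & x < 10)

print(by_value)      # TRUE TRUE FALSE FALSE
print(by_function)   # FALSE FALSE TRUE TRUE
print(by_logical)    # FALSE TRUE TRUE FALSE
```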

Output values - possible values for right hand side (RHS) of formula or element of to list:

  • value: Replaces the matching elements of x. This value will be recycled across rows and columns of x.
  • vector: Values of this vector will replace values in the corresponding positions in rows of x. The vector will be recycled across columns of x.
  • list/matrix/data.frame: Each element of the list/column of the matrix/data.frame will be used as the replacement value for the corresponding column/element of x.
  • function: This function will be applied to values of x which satisfy the recoding condition. There is a special auxiliary function copy which just returns its argument, so in if_val it just copies the old value (COPY in SPSS RECODE). See examples. copy is useful in the usual form of if_val and doesn't do anything in the assignment form if_val() = ... because this form doesn't modify values which don't satisfy any of the conditions.
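For a plain vector, the value, vector and function kinds of replacement listed above can be sketched in base R as follows. apply_replacement is a hypothetical helper introduced only for illustration; identity stands in for copy here, on the basis that copy is described as simply returning its argument.

```r
# Hypothetical helper, not part of expss: applies one replacement to the
# elements of a plain vector `x` selected by the logical `mask`.
apply_replacement <- function(x, mask, rhs) {
  if (is.function(rhs)) {
    x[mask] <- rhs(x[mask])                             # function of old values
  } else {
    x[mask] <- rep(rhs, length.out = length(x))[mask]   # value or positional vector
  }
  x
}

x <- c(-2, -1, 0, 1, 2)
negative <- x < 0

with_value <- apply_replacement(x, negative, 0)
with_vector <- apply_replacement(x, negative, c(10, 20, 30, 40, 50))
with_function <- apply_replacement(x, negative, abs)
with_identity <- apply_replacement(x, negative, identity)

print(with_value)      # 0 0 0 1 2
print(with_vector)     # 10 20 0 1 2
print(with_function)   # 2 1 0 1 2
print(with_identity)   # -2 -1 0 1 2
```

Note that the vector replacement is positional: each matching element takes the value at the same position in the replacement vector, not the next unused one.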

lo and hi are shortcuts for -Inf and Inf. They can be useful in expressions with %thru%, e. g. 1 %thru% hi.
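Because lo and hi are just -Inf and Inf, an open-ended range is an ordinary range with an infinite bound. The base-R sketch below illustrates this; it redefines lo and hi locally so that it runs without expss, and in_closed_range is a hypothetical stand-in that assumes %thru% includes both endpoints.

```r
# Local stand-ins for the expss shortcuts described above.
lo <- -Inf
hi <- Inf

# Hypothetical helper: closed-interval test (assumed to mirror %thru%).
in_closed_range <- function(x, lower, upper) x >= lower & x <= upper

age <- c(5, 17, 18, 30)
adult <- in_closed_range(age, 18, hi)   # "18 THRU HI"
minor <- in_closed_range(age, lo, 17)   # "LO THRU 17"

print(adult)   # FALSE FALSE TRUE TRUE
print(minor)   # TRUE TRUE FALSE FALSE
```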

Examples

# `ifs` examples
a = 1:5
b = 5:1
ifs(b>3 ~ 1)                       # c(1, 1, NA, NA, NA)
ifs(b>3 ~ 1, default = 3)          # c(1, 1, 3, 3, 3)
ifs(b>3 ~ 1, a>4 ~ 7, default = 3) # c(1, 1, 3, 3, 7)
ifs(b>3 ~ a, default = 42)         # c(1, 2, 42, 42, 42)
# some examples from SPSS manual
# RECODE V1 TO V3 (0=1) (1=0) (2, 3=-1) (9=9) (ELSE=SYSMIS)
set.seed(123)
v1  = sample(c(0:3, 9, 10), 20, replace = TRUE)
if_val(v1) = c(0 ~ 1, 1 ~ 0, 2:3 ~ -1, 9 ~ 9, other ~ NA)
v1

# RECODE QVAR(1 THRU 5=1)(6 THRU 10=2)(11 THRU HI=3)(ELSE=0).
set.seed(123)
qvar = sample((-5):20, 50, replace = TRUE)
if_val(qvar, 1 %thru% 5 ~ 1, 6 %thru% 10 ~ 2, 11 %thru% hi ~ 3, other ~ 0)
# the same result
if_val(qvar, 1 %thru% 5 ~ 1, 6 %thru% 10 ~ 2, gte(11) ~ 3, other ~ 0)

# RECODE STRNGVAR ('A', 'B', 'C'='A')('D', 'E', 'F'='B')(ELSE=' '). 
strngvar = LETTERS
if_val(strngvar, c('A', 'B', 'C') ~ 'A', c('D', 'E', 'F') ~ 'B', other ~ ' ')

# RECODE AGE (MISSING=9) (18 THRU HI=1) (0 THRU 18=0) INTO VOTER. 
set.seed(123)
age = sample(c(sample(5:30, 40, replace = TRUE), rep(9, 10)))
voter = if_val(age, NA ~ 9, 18 %thru% hi ~ 1, 0 %thru% 18 ~ 0)
voter

# example with function in RHS
set.seed(123)
a = rnorm(20)
# if a<(-0.5) we change it to absolute value of a (abs function)
if_val(a, lt(-0.5) ~ abs, other ~ copy) 

# the same example with logical criteria
if_val(a, a<(-.5) ~ abs, other ~ copy) 

# replace with specific value for each column
# we replace values greater than 0.75 with column max and values less than 0.25 with column min
# and NA with column means
# make data.frame
set.seed(123)
x1 = runif(30)
x2 = runif(30)
x3 = runif(30)
x1[sample(30, 10)] = NA # place 10 NA's
x2[sample(30, 10)] = NA # place 10 NA's
x3[sample(30, 10)] = NA # place 10 NA's
dfs = data.frame(x1, x2, x3)

#replacement. Note the necessary transpose operation
if_val(dfs, 
        lt(0.25) ~ t(min_col(dfs)), 
        gt(0.75) ~ t(max_col(dfs)), 
        NA ~ t(mean_col(dfs)), 
        other ~ copy
      )

# replace NA with row means
# rows that contain only NA remain missing because mean_row for them is NaN
if_val(dfs, NA ~ mean_row(dfs), other ~ copy) 

# some of the above examples with from/to notation

set.seed(123)
v1  = sample(c(0:3,9,10), 20, replace = TRUE)
# RECODE V1 TO V3 (0=1) (1=0) (2,3=-1) (9=9) (ELSE=SYSMIS)
fr = list(0, 1, 2:3, 9, other)
to = list(1, 0, -1, 9, NA)
if_val(v1, from = fr) = to
v1

# RECODE QVAR(1 THRU 5=1)(6 THRU 10=2)(11 THRU HI=3)(ELSE=0).
fr = list(1 %thru% 5, 6 %thru% 10, gte(11), other)
to = list(1, 2, 3, 0)
if_val(qvar, from = fr, to = to)

# RECODE STRNGVAR ('A','B','C'='A')('D','E','F'='B')(ELSE=' ').
fr = list(c('A','B','C'), c('D','E','F') , other)
to = list("A", "B", " ")
if_val(strngvar, from = fr, to = to)

# RECODE AGE (MISSING=9) (18 THRU HI=1) (0 THRU 18=0) INTO VOTER.
fr = list(NA, 18 %thru% hi, 0 %thru% 18)
to = list(9, 1, 0)
voter = if_val(age, from = fr, to = to)
voter
