expss (version 0.7.1)

modify: Modify data.frame/modify subset of the data.frame

Description

  • modify evaluates the expression expr in the context of the data.frame data and returns the original data, possibly modified. It works like within in base R, but it tries to return new variables in the order in which they occur in the expression, and it makes the full-featured %to% and .N available in expressions. See vars.

  • calculate evaluates the expression expr in the context of the data.frame data and returns the value of the evaluated expression. It works like with in base R, but it makes the full-featured %to% and .N available in expressions. See vars.

  • modify_if modifies only the rows for which cond is TRUE. Other rows remain unchanged. Newly created variables have values only in the rows for which cond is TRUE; the other rows contain NA. This function tries to mimic the SPSS "DO IF(). ... END IF." statement.

There is a special constant, .N, which equals the number of cases in data and can be used in expressions inside modify/calculate. Inside modify_if, .N gives the number of rows that will be affected by the expressions. Inside these functions you can use the set function, which creates variables with the given names or sets values of existing variables (see .set). With set it is possible to assign values to multiple variables at once. compute is an alias for modify, do_if is an alias for modify_if, and calc is an alias for calculate.
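For comparison, here is a minimal sketch of the two base R functions that modify and calculate are described as resembling. It uses base R only; the data.frame scores and its columns are made up for illustration. Note that within does not promise any particular order for newly created columns, which is the behavior modify is said to improve on.

```r
# Base R only; 'scores' and its columns are hypothetical illustration data
scores <- data.frame(math = c(4, 5, 3), physics = c(5, 5, 2))

# with(): evaluates the expression inside the data.frame
# and returns only the value of that expression
total <- with(scores, math + physics)

# within(): evaluates the expression inside the data.frame
# and returns the data.frame with the new variables attached
scores_ext <- within(scores, {
    total <- math + physics
    passed <- total >= 8
})

print(total)
print(names(scores_ext))
```

In this analogy, calculate plays the role of with (a value comes back) and modify plays the role of within (a data.frame comes back).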

Usage

modify(data, expr)

data %modify% expr

compute(data, expr)

data %compute% expr

modify_if(data, cond, expr)

do_if(data, cond, expr)

calculate(data, expr)

data %calculate% expr

calc(data, expr)

data %calc% expr

Arguments

data

data.frame or list of data.frames. If data is a list of data.frames, then the expression expr is evaluated inside each data.frame separately.

expr

expression to be evaluated in the context of the data.frame data.

cond

logical vector or expression. An expression is evaluated in the context of data.

Value

modify and modify_if return the modified data.frame or a list of modified data.frames; calculate returns the value of the evaluated expression or a list of values.
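The list case described under Arguments and Value can be sketched as follows. This is an illustration rather than output taken from the package: the list survey_waves and its column x are hypothetical, and the comments restate what the text above says should happen instead of showing verified results.

```r
library(expss)

# Hypothetical list of two data.frames with a different number of rows
survey_waves <- list(
    wave1 = data.frame(x = 1:3),
    wave2 = data.frame(x = 4:9)
)

# calculate on a list: the expression is evaluated inside each data.frame,
# so a list with one value per data.frame is expected
wave_means <- calculate(survey_waves, mean(x))

# modify on a list: a list of modified data.frames is expected;
# .N is assumed here to be the row count of the data.frame being processed
waves_ext <- modify(survey_waves, {
    n_cases = .N
})

str(wave_means)
str(waves_ext)
```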

Examples

# NOT RUN {
library(expss)

# sample data used by all examples below
dfs = data.frame(
    test = 1:5,
    aa = rep(10, 5),
    b_ = rep(20, 5),
    b_1 = rep(11, 5),
    b_2 = rep(12, 5),
    b_3 = rep(13, 5),
    b_4 = rep(14, 5),
    b_5 = rep(15, 5) 
)


# compute sum of b* variables and attach it to 'dfs'
modify(dfs, {
    b_total = sum_row(b_, b_1 %to% b_5)
    var_lab(b_total) = "Sum of b"
    random_numbers = runif(.N) # .N usage
})

# calculate sum of b* variables and return it
calculate(dfs, sum_row(b_, b_1 %to% b_5))

# 'set' function
# new variables filled with NA
modify(dfs, {
    set('new_b`1:5`')
})

# 'set' function
# set values to existing/new variables
# expression in backticks will be expanded - see ?subst
modify(dfs, {
    set('new_b`1:5`', b_1 %to% b_5)
})


# conditional modification
modify_if(dfs, test %in% 2:4, {
    aa = aa + 1    
    a_b = aa + b_    
    b_total = sum_row(b_, b_1 %to% b_5)
    random_numbers = runif(.N) # .N usage
})

# }