
collapse (version 1.1.0)

AA1-recode-replace: Recode and Replace Values in Matrix-Like Objects

Description

  • Recode can be used to replace multiple values in vectors, matrices or data.frames, using either exact (==) or regular expression matching.

  • replace_non_finite replaces NaN/Inf/-Inf (or optionally only Inf/-Inf) with a value (default is NA).

  • replace_outliers replaces values falling outside a 1- or 2-sided numeric threshold, or outside a certain number of column standard deviations, with a value (default is NA).

Usage

Recode(X, ..., copy = FALSE, reserve.na.nan = TRUE, regex = FALSE)

replace_non_finite(X, value = NA, replace.nan = TRUE)

replace_outliers(X, limits, value = NA, single.limit = c("SDs","min","max"))

Arguments

X

a vector, matrix or data.frame.

...

comma-separated recode arguments of the form: name = newname, `2` = 0, `NaN` = 0, `NA` = 0, `Inf` = NA, `-Inf` = NA, etc. Names that are not syntactically valid R names (such as numbers) must be enclosed in backticks.

limits

either a vector of two numeric values c(minval, maxval) constituting a two-sided outlier threshold, or a single numeric value constituting either a factor of standard deviations (default), or the minimum or maximum of a one-sided outlier threshold. See also single.limit.

value

a single (scalar) value to replace matching elements with. Default is NA.

copy

logical. For reciprocal or sequential replacements of the form a = b, b = c, make a copy of X to prevent a from being replaced with b and then all b-values (including the ones just created) being replaced with c. In general, Recode performs the replacements one after the other, starting with the first.
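
The effect of copy can be illustrated with a minimal base R sketch. The helper recode_sketch below is hypothetical and is not the collapse implementation; it only assumes exact (==) matching and mimics the sequential versus copy-based behaviour described above.

```r
# Hypothetical helper (not part of collapse): applies old = new pairs in order.
recode_sketch <- function(x, ..., copy = FALSE) {
  pairs <- list(...)
  ref <- x                                # untouched copy of the input
  for (old in names(pairs)) {
    # With copy = TRUE, matching is always done against the original values,
    # so earlier replacements cannot be picked up by later ones.
    target <- if (copy) ref else x
    x[target == old] <- pairs[[old]]
  }
  x
}

seq_res  <- recode_sketch(c("a", "b", "c"), a = "b", b = "c")               # sequential
copy_res <- recode_sketch(c("a", "b", "c"), a = "b", b = "c", copy = TRUE)  # a -> b, b -> c
swap_res <- recode_sketch(c("a", "b", "c"), a = "b", b = "a", copy = TRUE)  # swap a and b
```

In this sketch the sequential call turns every element into "c", because the "b" created in the first step is matched again in the second, while the copy-based calls keep the two replacements independent.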

reserve.na.nan

logical. TRUE identifies NA and NaN as special numeric values and does the correct replacement. FALSE will treat NA/NaN as strings, and thus not match numeric NA/NaN. Note: This is not an issue for Inf/-Inf, which are matched in both numeric and character variables.

regex

logical. If TRUE, all recode-argument names are (sequentially) passed to grepl as a pattern to search X. All matches are replaced.
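
As a rough illustration of regex matching, the following base R sketch passes each pattern to grepl in turn and replaces all matches. The second pattern and the replacement string "J-month" are made up for this example, and the loop is an assumption about the general mechanism, not the package's code.

```r
x <- month.name

# Patterns are applied sequentially, in the order given.
patterns <- list(ber = NA, `^J` = "J-month")

for (pat in names(patterns)) {
  # grepl returns FALSE for elements that are already NA,
  # so earlier replacements with NA are left untouched.
  x[grepl(pat, x)] <- patterns[[pat]]
}
```

After the loop, the four months containing "ber" are NA and the three months starting with "J" carry the placeholder string.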

replace.nan

logical. TRUE (default) replaces NaN/Inf/-Inf. FALSE replaces only Inf/-Inf.
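
The difference between the two settings can be sketched in base R with is.nan and is.infinite. This is only an approximation of the documented behaviour on a plain numeric vector, using the default value NA.

```r
x <- c(1, NaN, Inf, -Inf, NA, 5)

# Comparable to replace.nan = TRUE: NaN, Inf and -Inf are all replaced.
all_non_finite <- x
all_non_finite[is.nan(x) | is.infinite(x)] <- NA

# Comparable to replace.nan = FALSE: only Inf and -Inf are replaced, NaN is kept.
inf_only <- x
inf_only[is.infinite(x)] <- NA
```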

single.limit

a character or integer (only applies if length(limits) == 1):

  • 1 - "SDs" specifies that limits will be interpreted as a (two-sided) threshold in column standard-deviations. The underlying code is equivalent to X[abs(fscale(X)) > limits] <- value.

  • 2 - "min" specifies that limits will be interpreted as a (one-sided) minimum threshold. The underlying code is equivalent to X[X < limits] <- value.

  • 3 - "max" specifies that limits will be interpreted as a (one-sided) maximum threshold. The underlying code is equivalent to X[X > limits] <- value.
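
The equivalences listed above can be tried out on a small matrix using base R only. In this sketch base::scale stands in for fscale, on the assumption that both centre each column and divide by its sample standard deviation; the two-sided case is written out from the description of limits and is not taken from the package source. The data and thresholds are made up.

```r
X <- cbind(a = c(1, 2, 3, 4, 50), b = c(10, 11, 12, 13, 14))

# "SDs": values more than 1.5 column standard deviations from the column mean
sd_res <- X
sd_res[abs(scale(X)) > 1.5] <- NA

# "min": one-sided minimum threshold
min_res <- X
min_res[X < 2] <- NA

# "max": one-sided maximum threshold
max_res <- X
max_res[X > 13] <- NA

# Two-sided threshold, comparable to limits = c(2, 13)
two_res <- X
two_res[X < 2 | X > 13] <- NA
```

Only the value 50 in column a exceeds the standard-deviation threshold here, since column b is evenly spread around its mean.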

See Also

Small (Helper) Functions, Collapse Overview

Examples

# NOT RUN {
Recode(c("a","b","c"), a = "b", b = "c")
Recode(c("a","b","c"), a = "b", b = "c", copy = TRUE)
Recode(c("a","b","c"), a = "b", b = "a", copy = TRUE)
Recode(month.name, ber = NA, regex = TRUE)
mtcr <- Recode(mtcars, `0` = 2, `4` = Inf, `1` = NaN)
replace_non_finite(mtcr)
replace_non_finite(mtcr, replace.nan = FALSE)
replace_outliers(mtcars, c(2, 100))                 # replace all values below 2 and above 100 with NA
replace_outliers(mtcars, 2, single.limit = "min")   # replace all values smaller than 2 with NA
replace_outliers(mtcars, 100, single.limit = "max") # replace all values larger than 100 with NA
replace_outliers(mtcars, 2)                         # replace all values more than 2 column
                                                    # standard deviations from the column mean with NA
replace_outliers(                                   # Passing a grouped_df, pseries or pdata.frame
  num_vars(dplyr::group_by(iris, Species)), 2)      # removes outliers according to the
                                                    # within-group standard deviation; see ?fscale

# }
