sjmisc (version 2.6.3)

std: Standardize and center variables

Description

std() computes a z-transformation (standardized and centered) on the input. center() centers the input.
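As a rough illustration of the two transformations, here is a base R sketch. It is not the package's implementation; removing missing values with na.rm = TRUE is an assumption made for this example.

```r
# Base R sketch of centering and z-transformation (not sjmisc's own code).
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Centering: subtract the mean, so the result has mean 0.
centered <- x - mean(x, na.rm = TRUE)

# Standardizing: divide the centered values by the standard deviation,
# so the result has mean 0 and standard deviation 1.
standardized <- centered / sd(x, na.rm = TRUE)

mean(centered)
sd(standardized)
```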

Usage

std(x, ..., robust = c("sd", "gmd", "mad"), include.fac = FALSE,
  append = FALSE, suffix = "_z")

center(x, ..., include.fac = FALSE, append = FALSE, suffix = "_c")

Arguments

x

A vector or data frame.

...

Optional, unquoted names of variables to select for processing. Needed only if x is a data frame and just some of its variables should be processed. You may also use the : operator or dplyr's select_helpers. See 'Examples' or the package vignette.

robust

Character vector, indicating the method applied when standardizing variables with std(). By default, standardization is achieved by dividing the centered variables by their standard deviation (robust = "sd"). However, for skewed distributions, the median absolute deviation (MAD, robust = "mad") or Gini's mean difference (robust = "gmd") might be more robust measures of dispersion. For the latter option, sjstats needs to be installed.
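To see why the choice of dispersion measure matters, the following base R sketch compares the three measures on a small skewed vector. The helper gmd_sketch() is hypothetical and written only for this example (it takes the mean absolute difference over all distinct pairs); it is not the sjstats function, and the package may compute the measures differently in detail.

```r
# Base R sketch comparing the three dispersion measures (illustration only).
x <- c(1, 2, 3, 4, 100)  # skewed: one extreme value

# Hypothetical helper: Gini's mean difference as the mean absolute
# difference over all distinct pairs of observations.
gmd_sketch <- function(v) {
  v <- v[!is.na(v)]
  d <- abs(outer(v, v, "-"))          # all pairwise absolute differences
  sum(d[upper.tri(d)]) / choose(length(v), 2)
}

scale_sd  <- sd(x)          # strongly inflated by the extreme value
scale_gmd <- gmd_sketch(x)  # somewhat less affected
scale_mad <- mad(x)         # stats::mad(), barely affected

# Dividing the centered values by each measure gives differently scaled results.
centered <- x - mean(x)
cbind(sd = centered / scale_sd,
      gmd = centered / scale_gmd,
      mad = centered / scale_mad)
```

On this vector the standard deviation is the largest of the three and the MAD the smallest, so the extreme value dominates the scaling most under robust = "sd".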

include.fac

Logical, if TRUE, factors will be converted to numeric vectors and also standardized or centered.

append

Logical, if TRUE and x is a data frame, x including the new variables as additional columns is returned; if FALSE (the default), only the new variables are returned.

suffix

String value, will be appended to variable (column) names of x, if x is a data frame. If x is not a data frame, this argument will be ignored. The default value to suffix column names in a data frame depends on the function call:

  • recoded variables (rec()) will be suffixed with "_r"

  • recoded variables (recode_to()) will be suffixed with "_r0"

  • dichotomized variables (dicho()) will be suffixed with "_d"

  • grouped variables (split_var()) will be suffixed with "_g"

  • grouped variables (group_var()) will be suffixed with "_gr"

  • standardized variables (std()) will be suffixed with "_z"

  • centered variables (center()) will be suffixed with "_c"
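The interaction of append and suffix can be mimicked in base R. The sketch below is an assumption-laden illustration of the described behaviour, not the package's code; the data frame df and its columns are made up for the example.

```r
# Base R sketch of the append/suffix behaviour described above.
df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 60))

# Standardize every column and add the "_z" suffix to the column names.
z <- as.data.frame(lapply(df, function(v) (v - mean(v)) / sd(v)))
names(z) <- paste0(names(df), "_z")

z              # comparable to append = FALSE: only the new variables
cbind(df, z)   # comparable to append = TRUE: x plus the new variables
```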

Value

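If x is a vector, a vector with the standardized or centered values. If x is a data frame, a data frame (tibble) with only the transformed variables, or, if append = TRUE, x including the transformed variables as additional columns.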

Details

std() and center() also work on grouped data frames (see dplyr's group_by()). In this case, standardization or centering is applied separately within each group of x. See 'Examples'.
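Group-wise standardization can be sketched in base R with ave(), which applies a function within each level of a grouping variable. This only illustrates the idea of standardizing within groups; it is not how the package implements it.

```r
# Base R sketch: standardize disp separately within each cyl group.
z_by_cyl <- ave(
  mtcars$disp, mtcars$cyl,
  FUN = function(v) (v - mean(v)) / sd(v)
)

# Within every group, the result has mean 0 and standard deviation 1.
tapply(z_by_cyl, mtcars$cyl, mean)
tapply(z_by_cyl, mtcars$cyl, sd)
```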

Examples

# NOT RUN {
library(sjmisc)
library(dplyr) # provides the %>% pipe used below
data(efc)
std(efc$c160age) %>% head()
std(efc, e17age, c160age) %>% head()

center(efc$c160age) %>% head()
center(efc, e17age, c160age) %>% head()

# NOTE!
std(efc$e17age) # returns a vector
std(efc, e17age) # returns a tibble

# works with mutate()
library(dplyr)
efc %>%
  select(e17age, neg_c_7) %>%
  mutate(age_std = std(e17age), burden = center(neg_c_7)) %>%
  head()

# works also with grouped data frames
mtcars %>% std(disp)

mtcars %>%
  group_by(cyl) %>%
  std(disp)

data(iris)
# also standardize factors
std(iris, include.fac = TRUE)
# don't standardize factors
std(iris, include.fac = FALSE)

# }