sjmisc (version 2.6.3)

row_sums: Row sums and means for data frames

Description

row_sums() wraps rowSums, and row_means() wraps mean_n. However, the argument structure of both functions is designed to work well in a pipe workflow and accepts select helpers for selecting variables. The default for na.rm is TRUE, and the return value is always a tibble (with one variable).
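The effect of the na.rm default can be seen with base R's rowSums alone. This is a minimal sketch that does not call row_sums() itself; the data frame and variable names are made up for illustration.

```r
# Base-R illustration only; row_sums() is not called here.
# The data frame `toy` is a hypothetical example.
toy <- data.frame(a = c(1, NA, 3), b = c(4, 5, NA))

# Base default (na.rm = FALSE): any missing value makes the row sum NA.
sums_keep_na <- rowSums(toy)

# With na.rm = TRUE (the default documented for row_sums()),
# missing values are dropped before summing.
sums_drop_na <- rowSums(toy, na.rm = TRUE)

print(sums_keep_na)
print(sums_drop_na)
```

Only the first row is complete, so it is the only one with a non-missing sum under the base default.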

Usage

row_sums(x, ..., na.rm = TRUE, var = "rowsums", append = FALSE)

row_means(x, ..., n, var = "rowmeans", append = FALSE)

Arguments

x

A vector or data frame.

...

Optional, unquoted names of variables that should be selected for further processing. Required if x is a data frame (not a vector) and only selected variables from x should be processed. You may also use operators like : or dplyr's select helpers. See 'Examples' or the package vignette.

na.rm

Logical, TRUE if missing values should be omitted from the calculations.

var

Name of the new variable with the row sums or means.

append

Logical, if TRUE and x is a data frame, x including the new variables as additional columns is returned; if FALSE (the default), only the new variables are returned.

n

May either be

  • a numeric value that indicates the number of valid values per row required to calculate the row mean;

  • or a value between 0 and 1, indicating the proportion of valid values per row required to calculate the row mean (see 'Details').

If a row's number of valid values is less than n, NA will be returned as the row mean value.

Value

For row_sums(), a tibble with one variable: the row sums from x; for row_means(), a tibble with one variable: the row means from x.

Details

n must be a numeric value from 0 to ncol(x). If a row in x has at least n non-missing values, the row mean is returned; otherwise NA is returned for that row. If n is a non-integer value from 0 to 1, it is interpreted as the proportion of non-missing values required per row. E.g., if n = .75, a row must have at least ncol(x) * n non-missing values for the row mean to be calculated. See 'Examples'.
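The rule for n can be sketched in base R. This is not the package's implementation: row_mean_sketch is a hypothetical helper written only to illustrate the threshold described above, and it compares against ncol(x) * n without rounding, which is an assumption; mean_n may round the threshold differently.

```r
# Hypothetical helper, not part of sjmisc: illustrates the rule for `n`.
row_mean_sketch <- function(x, n) {
  x <- as.matrix(x)
  # Assumption: a value strictly between 0 and 1 is a proportion of columns.
  required <- if (n > 0 && n < 1) ncol(x) * n else n
  valid <- rowSums(!is.na(x))          # non-missing values per row
  means <- rowMeans(x, na.rm = TRUE)   # mean of the non-missing values
  means[valid < required] <- NA        # too few valid values: return NA
  unname(means)
}

dat <- data.frame(
  c1 = c(1, 2, NA, 4),
  c2 = c(NA, 2, NA, 5),
  c3 = c(NA, 4, NA, NA),
  c4 = c(2, 3, 7, 8),
  c5 = c(1, 7, 5, 3)
)

# At least 4 valid values per row, across all five columns.
res_count <- row_mean_sketch(dat, n = 4)

# At least 40% valid values, across c1 to c4 (threshold 4 * 0.4 = 1.6).
res_prop <- row_mean_sketch(dat[, c("c1", "c2", "c3", "c4")], n = 0.4)

print(res_count)
print(res_prop)
```

In the first call, rows 1 and 3 have only three and two valid values, so their means are NA. In the second call, only row 3 falls below the threshold, with one valid value.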

Examples

# NOT RUN {
library(sjmisc)
data(efc)
efc %>% row_sums(c82cop1:c90cop9)

library(dplyr)
row_sums(efc, contains("cop"))

dat <- data.frame(
  c1 = c(1,2,NA,4),
  c2 = c(NA,2,NA,5),
  c3 = c(NA,4,NA,NA),
  c4 = c(2,3,7,8),
  c5 = c(1,7,5,3)
)
dat

row_means(dat, n = 4)
row_means(dat, c1:c4, n = 4)
# at least 40% non-missing
row_means(dat, c1:c4, n = .4)

# create sum-score of COPE-Index, and append to data
efc %>%
  select(c82cop1:c90cop9) %>%
  row_sums() %>%
  add_columns(efc)

# }