freqs: Multiple Univariate Frequency Tables

Description

freqs creates a frequency table for a set of variables in a data.frame. Depending on total, the frequencies for all the variables combined can also be returned. The function is most useful for sets of variables with similar unique values (e.g., items from a questionnaire with similar response options).

Usage

freqs(data, vrb.nm, prop = FALSE, useNA = "always", total = "no")

Value

data.frame of frequencies for the variables in data[vrb.nm]. Depending on prop, the frequencies are either counts (FALSE) or proportions (TRUE). Depending on total, the nrow is either 1) length(vrb.nm) (if total = "no"), 2) 1 + length(vrb.nm) (if total = "yes"), or 3) 1 (if total = "only"). The rownames are vrb.nm for each variable in data[vrb.nm] and "_total_" for the total row (if present). The colnames are the unique values present in data[vrb.nm], potentially including "(NA)" depending on useNA.

Arguments

data: data.frame of data.
vrb.nm: character vector of colnames from data specifying the variables.
prop: logical vector of length 1 specifying whether the frequencies should be counts (FALSE) or proportions (TRUE). Note, whether the proportions include missing values depends on the useNA argument.
useNA: character vector of length 1 specifying how missing values should be handled. The three options are 1) "no" = do not include NA frequencies in the return object, 2) "ifany" = include NA frequencies only if there are any missing values (in any variable from data[vrb.nm]), or 3) "always" = include NA frequencies regardless of whether there are any missing values.
total: character vector of length 1 specifying whether the frequencies for the set of variables as a whole should be returned. The name "total" refers to tabulating the frequencies for the variables from data[vrb.nm] together as a set. The three options are 1) "no" = do not include a row for the total frequencies in the return object, 2) "yes" = do include the total frequencies as the first row in the return object, or 3) "only" = only include the total frequencies as a single row in the return object and do not include rows for each of the individual column frequencies in data[vrb.nm].
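
As a rough illustration of what tabulating the variables "together as a set" means, and of how useNA can change the denominator of the proportions, the base R sketch below pools two toy variables with unlist and tabulates them with table. The data.frame dat and all object names are hypothetical, and this is not the internal code of freqs, only an approximation of the idea under those assumptions.

```r
# Hypothetical toy data: two items with partly different observed values
dat <- data.frame(item1 = c(1, 2, 2, NA), item2 = c(2, 3, 3, 3))

# Pool all values from both variables and tabulate them as one set
total_counts <- table(unlist(dat), useNA = "always")

# Proportions with missing values counted in the denominator
total_props <- prop.table(total_counts)

# Proportions with missing values left out of the denominator
props_no_na <- prop.table(table(unlist(dat), useNA = "no"))
```

In this sketch the pooled table covers all 8 cells when missing values are kept, but only the 7 non-missing cells when they are dropped, so the same count yields a different proportion.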

Details

freqs uses plyr::rbind.fill to combine the results from table applied to each variable into a single data.frame. If a value that occurs in some variables from data[vrb.nm] never occurs in a given variable, then that variable's frequency for the value will be 0 in the return object.
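
The zero-filling behavior can be sketched in base R without plyr. The example below is an alternative approach, not the package's implementation: it tabulates each variable against the set of values observed across all variables, so values absent from a variable get a count of 0. The toy data and the helper count_one are hypothetical.

```r
# Hypothetical toy data: the value 3 never occurs in item1, 1 never in item2
dat <- data.frame(item1 = c(1, 2, 2, NA), item2 = c(2, 3, 3, 3))

# Values observed in any variable (sort() drops NA by default)
vals <- sort(unique(unlist(dat)))

# Tabulate one variable against the shared values, keeping a missing-value cell
count_one <- function(x) {
  tab <- table(factor(x, levels = vals), useNA = "always")
  nm <- names(tab)
  nm[is.na(nm)] <- "(NA)"   # label the missing-value cell as freqs does
  setNames(as.integer(tab), nm)
}

# One row per variable, one column per value
counts <- t(sapply(dat, count_one))
```

Here counts has one row per variable and the columns "1", "2", "3", and "(NA)", with 0 wherever a value does not occur in that variable.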

The name for the table element giving the frequency of missing values is "(NA)". This is different from table, where the name is NA_character_. This change allows for the sorting of tables that include missing values, as subsetting in R is not possible with NA_character_ names. In future versions of the package, this might change, as it should be possible to avoid this issue by subsetting with a logical vector or integer indices instead of names. However, it is convenient to be able to subset the return object fully by names.
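
The subsetting problem can be seen in a small base R sketch. The vector x and the object names are hypothetical. The counts are converted to a plain named vector to keep the indexing rules simple.

```r
x <- c("a", "b", NA, "a")
tab <- table(x, useNA = "always")

# Plain named vector of counts; the last name is NA_character_
cnt <- setNames(as.integer(tab), names(tab))

# NA indices never match names, not even missing names, so this yields NA
by_na_name <- cnt[NA_character_]

# After relabeling, the missing-value count can be retrieved by name
names(cnt)[is.na(names(cnt))] <- "(NA)"
by_label <- cnt["(NA)"]
```

Indexing by position or with a logical vector such as is.na(names(tab)) would also work on the original names, which is the alternative mentioned above.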

Examples

vrb_nm <- str2str::inbtw(names(psych::bfi), "A1","O5")
freqs(data = psych::bfi, vrb.nm = vrb_nm) # default
freqs(data = psych::bfi, vrb.nm = vrb_nm, prop = TRUE) # proportions by row
freqs(data = psych::bfi, vrb.nm = vrb_nm, useNA = "no") # without NA counts
freqs(data = psych::bfi, vrb.nm = vrb_nm, total = "yes") # include total counts