freqs creates a frequency table for a set of variables in a data.frame.
Depending on the total argument, frequencies for all the variables together
can also be returned. The function is most useful for sets of variables with
similar unique values (e.g., items from a questionnaire with similar response
options).
freqs(data, vrb.nm, prop = FALSE, useNA = "always", total = "no")
data.frame of frequencies for the variables in data[vrb.nm]. Depending on
prop, the frequencies are either counts (FALSE) or proportions (TRUE).
Depending on total, the nrow is either 1) length(vrb.nm) (if total = "no"),
2) 1 + length(vrb.nm) (if total = "yes"), or 3) 1 (if total = "only"). The
rownames are vrb.nm for each variable in data[vrb.nm] and "_total_" for the
total row (if present). The colnames are the unique values present in
data[vrb.nm], potentially including "(NA)" depending on useNA.
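The difference between counts and proportions can be illustrated in base R. This is only a sketch on a hand-made count matrix, not the code inside freqs; it assumes, as the "proportions by row" comment in the examples suggests, that proportions are computed within each variable's row. The object names cnt and prp are hypothetical.

```r
# Hand-made counts: one row per variable, one column per unique value
cnt <- rbind(item1 = c("1" = 1, "2" = 2, "3" = 1),
             item2 = c("1" = 2, "2" = 2, "3" = 0))

# Divide every count by the sum of its row (assumed analogue of prop = TRUE)
prp <- prop.table(cnt, margin = 1)

# Each row of proportions sums to 1
rowSums(prp)
```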
data: data.frame of data.
vrb.nm: character vector of colnames from data specifying the variables.
prop: logical vector of length 1 specifying whether the frequencies should be
counts (FALSE) or proportions (TRUE). Note that whether the proportions
include missing values depends on the useNA argument.
useNA: character vector of length 1 specifying how missing values should be
handled. The three options are 1) "no" = do not include NA frequencies in the
return object, 2) "ifany" = include NA frequencies only if there are any
missing values (in any variable from data[vrb.nm]), or 3) "always" = include
NA frequencies regardless of whether there are missing values or not.
total: character vector of length 1 specifying whether the frequencies for
the set of variables as a whole should be returned. The name "total" refers
to tabulating the frequencies for the variables from data[vrb.nm] together as
a set. The three options are 1) "no" = do not include a row for the total
frequencies in the return object, 2) "yes" = include the total frequencies as
the first row in the return object, or 3) "only" = include only the total
frequencies as a single row in the return object, without rows for the
individual variables in data[vrb.nm].
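The three useNA values are the same ones accepted by base table, so their effect can be seen on a single vector. The sketch below uses only base R; whether freqs forwards useNA to table unchanged is an assumption, and the toy vectors x and y are hypothetical.

```r
x <- c(1, 2, 2, NA)  # contains a missing value
y <- c(1, 1, 2, 2)   # contains no missing values

tab_no     <- table(x, useNA = "no")      # NA is dropped from the table
tab_ifany  <- table(x, useNA = "ifany")   # NA element present because x has an NA
tab_ifany0 <- table(y, useNA = "ifany")   # no NA element because y has no NA
tab_always <- table(y, useNA = "always")  # NA element present with a count of 0

tab_always
```

Note that for freqs, "ifany" is described as applying across all variables in data[vrb.nm], whereas table applied to one vector only sees that vector.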
freqs uses plyr::rbind.fill to combine the results from table applied to each
variable into a single data.frame. If a value occurs in some variables from
data[vrb.nm] but not in others, its frequency for the variables where it does
not occur is 0 in the return object.
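The combining step can be approximated in base R. The sketch below does not use plyr::rbind.fill; it swaps in a different technique (tabulating every variable against a shared set of factor levels) that yields the same kind of zero-filled result, plus a "_total_" first row as described for total = "yes". The toy data and the names dat, lvl, cnt, tot and out are hypothetical.

```r
# Toy data: item2 never takes the value 3
dat <- data.frame(item1 = c(1, 2, 2, 3), item2 = c(1, 1, 2, 2))
vrb_nm <- c("item1", "item2")

# Union of the values observed in any of the variables
lvl <- sort(unique(unlist(dat[vrb_nm])))

# Tabulate each variable against the shared levels; absent values count as 0
cnt <- t(sapply(dat[vrb_nm], function(x) table(factor(x, levels = lvl))))

# Frequencies for the variables taken together as a set
tot <- as.vector(table(factor(unlist(dat[vrb_nm]), levels = lvl)))

# Total row first, then one row per variable
out <- as.data.frame(rbind("_total_" = tot, cnt))
out
```

Here the row for item2 has a 0 in the column for the value 3, and nrow(out) equals 1 + length(vrb_nm).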
The name for the table element giving the frequency of missing values is
"(NA)". This is different from table, where the name is NA_character_. This
change allows for the sorting of tables that include missing values, as
subsetting by name in R is not possible with NA_character_ names. In future
versions of the package, this might change, as it should be possible to avoid
this issue by subsetting with a logical vector or integer indices instead of
names. However, it is convenient to be able to subset the return object fully
by names.
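The naming issue can be reproduced in base R. This sketch only mimics the "(NA)" label on a plain named vector; how freqs applies the label internally is not shown in this page, and the names x, tab, cnt, by_na, by_label and sorted are hypothetical.

```r
x <- c("a", "b", "b", NA)
tab <- table(x, useNA = "always")

# Plain named vector of counts; the name of the last element is NA_character_
cnt <- setNames(as.vector(tab), names(tab))

# NA indices never match a name, not even a missing one, so this returns NA
by_na <- cnt[NA_character_]

# Relabel the missing-value element with the "(NA)" label
names(cnt)[is.na(names(cnt))] <- "(NA)"

by_label <- cnt["(NA)"]           # the NA count can now be selected by name
sorted <- cnt[order(names(cnt))]  # and all elements can be reordered by name
```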
freq
freqs_by
freq_by
table
vrb_nm <- str2str::inbtw(names(psych::bfi), "A1","O5")
freqs(data = psych::bfi, vrb.nm = vrb_nm) # default
freqs(data = psych::bfi, vrb.nm = vrb_nm, prop = TRUE) # proportions by row
freqs(data = psych::bfi, vrb.nm = vrb_nm, useNA = "no") # without NA counts
freqs(data = psych::bfi, vrb.nm = vrb_nm, total = "yes") # include total counts