identifyLoners: A checkFunction for identifying sparsely represented values (loners)
Description
A checkFunction to be called from check that identifies values that
occur fewer than 6 times in factor, (haven_)labelled, or character variables (that is, loners).
Usage
identifyLoners(v, nMax = 10)
Value
A checkResult with three entries:
$problem (a logical indicating whether loners were found),
$message (a message describing which values in v were loners) and
$problemValues (the problematic values in their original format).
Note that only unique problematic values
are listed and they are presented in alphabetical order.
Arguments
v
A character, (haven_)labelled, or factor variable to check.
nMax
The maximum number of problematic values to report.
Default is 10. Set to Inf to include all problematic values
in the output message, or to 0 for no output.
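The effect of nMax can be illustrated with a small base R sketch. The helper limitValuesSketch is hypothetical and is not part of the package. It only assumes that nMax caps how many values are reported, and does not reproduce how the package builds its message.

```r
# Hypothetical helper (an assumption, not the package's code):
# keep at most nMax of the values that would be reported.
limitValuesSketch <- function(values, nMax = 10) {
  # min() handles nMax = Inf (keep everything);
  # seq_len(0) handles nMax = 0 (keep nothing)
  values[seq_len(min(nMax, length(values)))]
}

problemValues <- c("b", "c", "d")
limitValuesSketch(problemValues, nMax = 2)    # first two values
limitValuesSketch(problemValues, nMax = Inf)  # all values
limitValuesSketch(problemValues, nMax = 0)    # character(0)
```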
Details
For character, (haven_)labelled, and factor variables, identify values that have
only a very low number of observations, as these categories might be
problematic when conducting an analysis. Unused factor levels are
not considered "loners". "Loners" are defined as values with 5 or fewer
observations, reflecting the commonly used rule of thumb for performing
chi-squared tests.
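
The rule described above can be approximated in base R as follows. This is a minimal sketch, not the package's implementation. The name findLonersSketch and its threshold argument are hypothetical, and converting the input with as.character() is an assumption made so that factor and character input are counted the same way.

```r
# Hypothetical helper (an assumption, not the package's code):
# return the unique values observed `threshold` times or fewer.
findLonersSketch <- function(v, threshold = 5) {
  # as.character() drops unused factor levels, so they are never counted;
  # table() ignores NA by default
  counts <- table(as.character(v))
  loners <- names(counts)[as.vector(counts) <= threshold]
  # report unique values in alphabetical order
  sort(loners)
}

# "a" occurs 10 times, "b" 3 times, "c" once; level "d" is unused
v <- factor(c(rep("a", 10), rep("b", 3), "c"),
            levels = c("a", "b", "c", "d"))
findLonersSketch(v)
# [1] "b" "c"
```

In this example the unused level "d" is not reported, which matches the statement that unused factor levels are not considered loners.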