Item missingness (also referred to as item nonresponse; De Leeuw et al. 2003) describes the missingness of single values, e.g. blanks or empty data cells in a data set. It occurs, for example, when a respondent does not provide information for a certain question, a question is overlooked by accident, a programming failure occurs, or a provided answer is missed while entering the data.
Indicator
com_item_missingness(
  study_data,
  meta_data,
  resp_vars = NULL,
  label_col,
  show_causes = TRUE,
  cause_label_df,
  include_sysmiss = TRUE,
  threshold_value,
  suppressWarnings = FALSE,
  assume_consistent_codes = TRUE,
  expand_codes = assume_consistent_codes,
  drop_levels = TRUE,
  expected_observations = c("HIERARCHY", "ALL", "SEGMENT"),
  pretty_print = lifecycle::deprecated()
)
A list with:
SummaryTable: data frame about item missingness per response variable
SummaryData: data frame about item missingness per response variable, formatted for the user
SummaryPlot: ggplot2 heatmap plot, if show_causes was TRUE
ReportSummaryTable: data frame underlying SummaryPlot
study_data: data.frame the data frame that contains the measurements
meta_data: data.frame the data frame that contains metadata attributes of study data
resp_vars: variable list the name of the measurement variables
label_col: variable attribute the name of the column in the metadata with labels of variables
show_causes: logical if TRUE, then the distribution of missing codes is shown
cause_label_df: data.frame missing code table. If missing codes have labels, the respective data frame can be specified here or in the metadata as assignments, see cause_label_df
include_sysmiss: logical Optional, if TRUE, system missingness (NAs) is evaluated in the summary plot
threshold_value: numeric from=0 to=100. A numerical value ranging from 0 to 100
suppressWarnings: logical warn about consistency issues with missing and jump lists
assume_consistent_codes: logical if TRUE, no labels are given, and the same missing/jump code is used for more than one variable, the labels assigned for this code are treated as being the same for all variables.
expand_codes: logical if TRUE, code labels are copied from other variables if the code is the same and the label is set somewhere
drop_levels: logical if TRUE, do not display unused missing codes in the figure legend.
expected_observations: enum HIERARCHY | ALL | SEGMENT. If ALL, all observations are expected to comprise all study segments. If SEGMENT, the PART_VAR is expected to point to a variable with values of 0 and 1, indicating whether the variable was expected to be observed for each data row. If HIERARCHY, this is also checked recursively: if a variable points to such a participation variable, and that other variable also has a PART_VAR entry pointing to a variable, the observation of the initial variable is only expected if both segment variables are 1.
pretty_print: logical deprecated. If you want human-readable output, use SummaryData instead of SummaryTable
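The three modes of expected_observations can be illustrated with a small base R sketch. This is not the package's implementation; the toy data, the `part_var` mapping, and the helper `expected_rows()` are hypothetical names introduced only to show how following PART_VAR references once (SEGMENT) differs from following them recursively (HIERARCHY).

```r
# Hypothetical toy data: part_exam and part_lab are 0/1 participation variables.
study_data <- data.frame(
  part_exam = c(1, 1, 0, 1),
  part_lab  = c(1, 0, 1, 1),
  glucose   = c(5.1, NA, NA, NA)
)

# Hypothetical PART_VAR mapping: glucose -> part_lab -> part_exam.
part_var <- c(glucose = "part_lab", part_lab = "part_exam")

# Returns, per data row, whether an observation of `variable` is expected.
expected_rows <- function(variable, data, part_var,
                          mode = c("HIERARCHY", "ALL", "SEGMENT")) {
  mode <- match.arg(mode)
  expected <- rep(TRUE, nrow(data))
  if (mode == "ALL") {
    return(expected)               # every row is expected
  }
  current <- variable
  visited <- character(0)          # guards against circular references
  while (current %in% names(part_var) && !(current %in% visited)) {
    visited <- c(visited, current)
    current <- part_var[[current]]
    expected <- expected & data[[current]] == 1
    if (mode == "SEGMENT") {
      break                        # only the direct participation variable counts
    }
  }
  expected
}

exp_segment   <- expected_rows("glucose", study_data, part_var, "SEGMENT")
exp_hierarchy <- expected_rows("glucose", study_data, part_var, "HIERARCHY")

# Missing values are only counted among rows where an observation was expected.
n_missing_segment   <- sum(is.na(study_data$glucose) & exp_segment)
n_missing_hierarchy <- sum(is.na(study_data$glucose) & exp_hierarchy)
```

In this toy example the third row is expected under SEGMENT (part_lab is 1) but not under HIERARCHY (part_exam is 0), so the number of values counted as missing differs between the two modes.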
Lists of missing codes and, if applicable, jump codes are selected from the metadata
The no. of system missings (NA) in each variable is calculated
The no. of used missing codes is calculated for each variable
The no. of used jump codes is calculated for each variable
Two result data frames (1: at the level of observations, 2: a summary for each variable) are generated
OPTIONAL: if show_causes is selected, one summary plot for all resp_vars is provided
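The counting steps above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's code: the code values (99980, 99981, 88880), the per-variable code lists, and the helper `count_item_missingness()` are invented for this example, whereas the real function reads missing and jump codes from the metadata.

```r
# Toy study data; 99980/99981 play the role of missing codes and 88880 the
# role of a jump code (all values are invented for this illustration).
study_data <- data.frame(
  age = c(34, NA, 99980, 51, 88880),
  sbp = c(120, 99981, 99981, NA, NA)
)

# Hypothetical per-variable code lists, standing in for the metadata lookup.
missing_codes <- list(age = c(99980, 99981), sbp = c(99980, 99981))
jump_codes    <- list(age = 88880, sbp = 88880)

# Count system missings (NA), used missing codes, and used jump codes.
count_item_missingness <- function(x, miss, jump) {
  c(
    sysmiss       = sum(is.na(x)),
    missing_codes = sum(x %in% miss),  # NA never matches a code
    jump_codes    = sum(x %in% jump),
    n             = length(x)
  )
}

# One summary row per variable.
summary_table <- do.call(rbind, lapply(names(study_data), function(v) {
  counts <- count_item_missingness(
    study_data[[v]], missing_codes[[v]], jump_codes[[v]]
  )
  data.frame(variable = v, t(counts), stringsAsFactors = FALSE)
}))

print(summary_table)
```

Keeping system missings, missing codes, and jump codes in separate columns mirrors the separate counting steps listed above; how these counts are combined into rates or compared with threshold_value is left out here because the source does not specify it.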