This function computes a summary of missing data patterns, i.e., the number and percentage of cases with each missing data pattern, and optionally plots the missing data patterns.
na.pattern(..., data = NULL, order = FALSE, n.pattern = NULL, plot = FALSE,
square = TRUE, rotate = FALSE, fill.col = c("#B61A51B3", "#006CC2B3"),
alpha = 0.6, plot.margin = c(4, 16, 0, 4),
legend.box.margin = c(-8, 6, 6, 6), legend.key.size = 12,
legend.text.size = 9, saveplot = FALSE, file = "NA_Pattern.pdf",
width = NA, height = NA, units = c("in", "cm", "mm", "px"), dpi = 600,
digits = 2, as.na = NULL, write = NULL, append = TRUE, check = TRUE,
output = TRUE)
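A minimal usage sketch of the two call forms in the signature above. It assumes the misty package is installed and that the installed version matches this signature; the data frame dat and its variables x1, x2, x3 are invented for illustration.

```r
# Hypothetical example data; the values are invented for illustration
dat <- data.frame(x1 = c(1, NA, 3, 4, NA, 6),
                  x2 = c(NA, 2, 3, NA, 5, 6),
                  x3 = c(1, 2, 3, 4, 5, NA))

# Run only if misty is installed and exposes the 'data' argument
if (requireNamespace("misty", quietly = TRUE) &&
    "data" %in% names(formals(misty::na.pattern))) {

  # Form 1: pass a data frame directly
  misty::na.pattern(dat)

  # Form 2: name the variables and supply the data frame via 'data'
  misty::na.pattern(x1, x2, data = dat)
}
```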
Returns an object of class misty.object, which is a list with the following entries:
call
function call
type
type of analysis
data
list with data frames, i.e., data for the data frame with variables used in the current analysis, and plotdat for the data frame used for plotting the results
args
specification of function arguments
result
result table
plot
ggplot2 object for plotting the results
pattern
a numeric vector indicating the missing data pattern for each case
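To make the pattern and result entries concrete, the base R sketch below shows one way a per-case pattern index and its frequencies could be derived. It is an illustration only, not the package's implementation, and the numbering of patterns it produces may differ from what na.pattern returns. The data frame dat is invented.

```r
# Hypothetical example data; the values are invented for illustration
dat <- data.frame(x1 = c(1, NA, 3, 4, NA, 6),
                  x2 = c(NA, 2, 3, NA, 5, 6),
                  x3 = c(1, 2, 3, 4, 5, NA))

# Encode each row's missingness as a string such as "0/1/0" (1 = missing)
key <- apply(is.na(dat), 1, function(r) paste(as.integer(r), collapse = "/"))

# Number the distinct patterns in order of first appearance
pattern <- match(key, unique(key))

# Frequency and percentage of cases per pattern
n <- table(pattern)
perc <- round(100 * prop.table(n), digits = 2)

# With misty installed, the returned list can be inspected by entry name
if (requireNamespace("misty", quietly = TRUE)) {
  res <- misty::na.pattern(dat, output = FALSE)
  print(res$result)   # result table
  print(res$pattern)  # missing data pattern for each case
}
```

Here the entry names result and pattern are taken from the list above; everything else in the sketch is an assumption made for illustration.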
...
a matrix or data frame with incomplete data, where missing values are coded as NA. Alternatively, an expression indicating the variable names in data, e.g., na.pattern(x1, x2, x3, data = dat). Note that the operators ., +, -, ~, :, ::, and ! can also be used to select variables, see 'Details' in the df.subset function.
data
a data frame when specifying one or more variables in the argument .... Note that the argument is NULL when specifying a matrix or data frame for the argument ....
order
logical: if TRUE, variables are ordered from left to right in increasing order of missing values.
n.pattern
an integer value indicating the minimum number of cases sharing a missing data pattern for that pattern to be included in the result table and the plot, e.g., specifying n.pattern = 5 excludes missing data patterns with fewer than 5 cases.
plot
logical: if TRUE, the missing data patterns are plotted.
square
logical: if TRUE (default), the plot tiles are squares to mimic the md.pattern function in the package mice.
rotate
logical: if TRUE, the variable name labels are rotated 90 degrees.
fill.col
a character vector with two elements indicating the colors for the "fill" argument. Note that the first color represents missing values and the second color represents observed values.
alpha
a numeric value between 0 and 1 for the alpha argument (default is 0.6).
plot.margin
a numeric vector indicating the plot.margin argument for the theme function.
legend.box.margin
a numeric vector indicating the legend.box.margin argument for the theme function.
legend.key.size
a numeric value indicating the legend.key.size argument (default is unit(12, "pt")) for the theme function.
legend.text.size
a numeric value indicating the legend.text argument (default is element_text(size = 9)) for the theme function.
saveplot
logical: if TRUE, the ggplot is saved.
file
a character string indicating the filename argument (default is "NA_Pattern.pdf") including the file extension for the ggsave function. Note that one of ".eps", ".ps", ".tex", ".pdf" (default), ".jpeg", ".tiff", ".png", ".bmp", ".svg" or ".wmf" needs to be specified as file extension in the file argument.
width
a numeric value indicating the width argument (default is the size of the current graphics device) for the ggsave function.
height
a numeric value indicating the height argument (default is the size of the current graphics device) for the ggsave function.
units
a character string indicating the units argument (default is "in") for the ggsave function.
dpi
a numeric value indicating the dpi argument (default is 600) for the ggsave function.
digits
an integer value indicating the number of decimal places to be used for displaying percentages.
as.na
a numeric vector indicating user-defined missing values, i.e., these values are converted to NA before conducting the analysis.
write
a character string naming a file for writing the output into either a text file with file extension ".txt" (e.g., "Output.txt") or an Excel file with file extension ".xlsx" (e.g., "Output.xlsx"). If the file name does not contain any file extension, an Excel file will be written.
append
logical: if TRUE (default), output will be appended to an existing text file with extension .txt specified in write; if FALSE, the existing text file will be overwritten.
check
logical: if TRUE (default), argument specification is checked.
output
logical: if TRUE (default), output is shown.
Takuya Yanagida takuya.yanagida@univie.ac.at
Enders, C. K. (2010). Applied missing data analysis. Guilford Press.
Graham, J. W. (2009). Missing data analysis: Making it work in the real world. Annual Review of Psychology, 60, 549-576. https://doi.org/10.1146/annurev.psych.58.110405.085530
Oberman, H. (2023). ggmice: Visualizations for 'mice' with 'ggplot2'. R package version 0.1.0. https://doi.org/10.32614/CRAN.package.ggmice
van Buuren, S. (2018). Flexible imputation of missing data (2nd ed.). Chapman & Hall.
write.result, as.na, na.as, na.auxiliary, na.coverage, na.descript, na.indicator, na.prop, na.test