misty (version 0.7.1)

na.pattern: Missing Data Pattern

Description

This function computes a summary of missing data patterns, i.e., the number (and percentage) of cases with each specific missing data pattern, and optionally plots the missing data patterns.
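To make the idea of a missing data pattern concrete, the following base R sketch counts how many cases share each pattern in the built-in airquality data. It is only an illustration and not the implementation used by na.pattern; the object names (miss, key, tab, pattern.summary) are hypothetical.

```r
# Indicator matrix: TRUE where a value is missing
miss <- is.na(airquality)

# One string per case, e.g., "100000" means only the first variable is missing
key <- apply(miss, 1, function(r) paste(as.integer(r), collapse = ""))

# Number and percentage of cases sharing each pattern
tab <- table(key)
pattern.summary <- data.frame(pattern = names(tab),
                              n = as.vector(tab),
                              perc = round(100 * as.vector(tab) / nrow(airquality), 2))
pattern.summary
```

Each row of pattern.summary corresponds to one distinct combination of observed and missing variables.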

Usage

na.pattern(data, ..., order = FALSE, n.pattern = NULL, digits = 2, as.na = NULL,
           plot = FALSE, square = TRUE, rotate = FALSE,
           color = c("#B61A51B3", "#006CC2B3"), tile.alpha = 0.6,
           plot.margin = c(4, 16, 0, 4), legend.box.margin = c(-8, 6, 6, 6),
           legend.key.size = 12, legend.text.size = 9, filename = NULL,
           width = NA, height = NA, units = c("in", "cm", "mm", "px"),
           dpi = 600, write = NULL, append = TRUE, check = TRUE, output = TRUE)

Value

Returns an object of class misty.object, which is a list with the following entries:

call

function call

type

type of analysis

data

data frame with variables used in the analysis

args

specification of function arguments

result

result table

plot

ggplot2 object for plotting the results

pattern

a numeric vector indicating the missing data pattern for each case
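As a brief sketch of how the returned list might be inspected, the code below stores the result and accesses the result and pattern entries described above. It assumes the misty package is installed and is skipped otherwise; the object name res is hypothetical.

```r
# Runs only if the misty package is available
if (requireNamespace("misty", quietly = TRUE)) {
  # Store the result object without printing the output
  res <- misty::na.pattern(airquality, output = FALSE)

  # Result table with the missing data patterns
  print(res$result)

  # Missing data pattern of the first few cases
  print(head(res$pattern))
}
```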

Arguments

data

a data frame with incomplete data, where missing values are coded as NA.

...

an expression indicating the variable names in data, e.g., na.pattern(dat, x1, x2, x3). Note that the operators ., +, -, ~, :, ::, and ! can also be used to select variables; see 'Details' in the df.subset function.

order

logical: if TRUE, variables are ordered from left to right in increasing order of missing values.

n.pattern

an integer value indicating the minimum number of cases sharing a missing data pattern to be included in the result table and the plot, e.g., specifying n.pattern = 5 excludes missing data patterns with fewer than 5 cases.

digits

an integer value indicating the number of decimal places to be used for displaying percentages.

as.na

a numeric vector indicating user-defined missing values, i.e., these values are converted to NA before conducting the analysis.

plot

logical: if TRUE, missing data pattern is plotted.

square

logical: if TRUE (default), the plot tiles are squares to mimic the md.pattern function in the package mice.

rotate

logical: if TRUE, the variable name labels are rotated 90 degrees.

color

a character vector with two elements indicating the colors for the "fill" argument. Note that the first color represents missing values and the second color represents observed values.

tile.alpha

a numeric value between 0 and 1 for the alpha argument (default is 0.6).

plot.margin

a numeric vector indicating the plot.margin argument for the theme function.

legend.box.margin

a numeric vector indicating the legend.box.margin argument for the theme function.

legend.key.size

a numeric value indicating the legend.key.size argument (default is unit(12, "pt")) for the theme function.

legend.text.size

a numeric value indicating the legend.text argument (default is element_text(size = 9)) for the theme function.

filename

a character string indicating the filename argument, including the file extension, for the ggsave function, e.g., "NA_Pattern.pdf" (the Usage section shows NULL as the default). Note that one of ".eps", ".ps", ".tex", ".pdf", ".jpeg", ".tiff", ".png", ".bmp", ".svg" or ".wmf" needs to be specified as the file extension in the filename argument.

width

a numeric value indicating the width argument (default is the size of the current graphics device) for the ggsave function.

height

a numeric value indicating the height argument (default is the size of the current graphics device) for the ggsave function.

units

a character string indicating the units argument (default is "in") for the ggsave function.

dpi

a numeric value indicating the dpi argument (default is 600) for the ggsave function.

write

a character string naming a file for writing the output into either a text file with file extension ".txt" (e.g., "Output.txt") or Excel file with file extension ".xlsx" (e.g., "Output.xlsx"). If the file name does not contain any file extension, an Excel file will be written.

append

logical: if TRUE (default), output will be appended to an existing text file with extension .txt specified in write; if FALSE, the existing text file will be overwritten.

check

logical: if TRUE (default), argument specification is checked.

output

logical: if TRUE (default), output is shown.
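The as.na and n.pattern arguments can be illustrated with a small base R sketch. It mimics the described behavior on a made-up data frame and is not the code used by na.pattern; the data and the object names (dat, key, n.cases, kept) are hypothetical.

```r
# Hypothetical data in which -99 codes a missing value
dat <- data.frame(x1 = c(1, -99, 3, 4, -99, 6),
                  x2 = c(2, 5, NA, 1, 3, 2),
                  x3 = c(7, 8, 9, -99, 4, NA))

# What as.na = -99 describes: convert the user-defined code to NA first
dat[] <- lapply(dat, function(v) replace(v, v %in% -99, NA))

# Count the cases sharing each missing data pattern
key <- apply(is.na(dat), 1, function(r) paste(as.integer(r), collapse = ""))
n.cases <- table(key)

# What n.pattern = 2 describes: keep patterns shared by at least 2 cases
kept <- n.cases[n.cases >= 2]
kept
```

In this toy data, the complete-case pattern and the pattern with only x2 missing each occur once, so they would be dropped at a threshold of 2.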

Author

Takuya Yanagida takuya.yanagida@univie.ac.at

References

Enders, C. K. (2010). Applied missing data analysis. Guilford Press.

Graham, J. W. (2009). Missing data analysis: Making it work in the real world. Annual Review of Psychology, 60, 549-576. https://doi.org/10.1146/annurev.psych.58.110405.085530

Oberman, H. (2023). ggmice: Visualizations for 'mice' with 'ggplot2'. R package version 0.1.0. https://doi.org/10.32614/CRAN.package.ggmice

van Buuren, S. (2018). Flexible imputation of missing data (2nd ed.). Chapman & Hall.

See Also

write.result, as.na, na.as, na.auxiliary, na.coverage, na.descript, na.indicator, na.prop, na.test

Examples

# Example 1: Compute a summary of missing data patterns
dat.pattern <- na.pattern(airquality)

# Example 2a: Compute and plot a summary of missing data patterns
na.pattern(airquality, plot = TRUE)

# Example 2b: Exclude missing data patterns with fewer than 3 cases
na.pattern(airquality, plot = TRUE, n.pattern = 3)

# Example 3: Vector of missing data pattern for each case
dat.pattern$pattern

# Data frame without cases with missing data pattern 2 and 4
airquality[!dat.pattern$pattern %in% c(2, 4), ]

if (FALSE) {
# Example 4a: Write results into a text file
na.pattern(airquality, write = "NA_Pattern.txt")

# Example 4b: Write results into an Excel file
na.pattern(airquality, write = "NA_Pattern.xlsx")
}
