VIM (version 3.0.2)

pbox: Parallel boxplots with information about missing/imputed values

Description

Boxplot of one variable of interest plus information about missing/imputed values in other variables.

Usage

pbox(x, delimiter = NULL, pos = 1, selection = c("none","any","all"),
    col = c("skyblue","red","red4","orange","orange4"), numbers = TRUE, 
    cex.numbers = par("cex"), xlim = NULL, ylim = NULL, main = NULL,
    sub = NULL, xlab = NULL, ylab = NULL, axes = TRUE,
    frame.plot = axes, labels = axes, interactive = TRUE, ...)

TKRpbox(x, pos = 1, ..., delimiter = NULL, hscale = NULL, vscale = 1, TKRpar = list())

Arguments

x
a vector, matrix or data.frame.
delimiter
a character-vector to distinguish between variables and imputation-indices for imputed variables (therefore, x needs to have colnames). If given, it is used to determine the cor
pos
a numeric value giving the index of the variable of interest. Additional variables in x are used for grouping according to missingness/number of imputed missings.
selection
the selection method for grouping according to missingness/number of imputed missings in multiple additional variables. Possible values are "none" (grouping according to missingness/number of imputed missings in every other va
col
a vector of length five giving the colors to be used in the plot. The first color is used for the boxplots of the available data, the second/fourth are used for missing/imputed data, respectively, and the third/fifth color for th
numbers
a logical indicating whether the frequencies of missing/imputed values should be displayed (see Details).
cex.numbers
the character expansion factor to be used for the frequencies of the missing/imputed values.
xlim, ylim
axis limits.
main, sub
main and sub title.
xlab, ylab
axis labels.
axes
a logical indicating whether axes should be drawn on the plot.
frame.plot
a logical indicating whether a box should be drawn around the plot.
labels
either a logical indicating whether labels should be plotted below each box, or a character vector giving the labels.
interactive
a logical indicating whether variables can be switched interactively (see Details).
...
for pbox, further arguments and graphical parameters to be passed to boxplot and other functions. For TKRpbox, further arguments to be passed to pb
hscale
horizontal scale factor for plot to be embedded in a Tcl/Tk window (see Details). The default value depends on the number of boxes to be drawn.
vscale
vertical scale factor for the plot to be embedded in a Tcl/Tk window (see Details).
TKRpar
a list of graphical parameters to be set for the plot to be embedded in a Tcl/Tk window (see Details and par).
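As a rough illustration of how delimiter is meant to be used, the sketch below imputes a small toy data set with kNN and then passes the suffix of the resulting indicator columns to pbox. The data frame toy and its columns are hypothetical, and the sketch assumes kNN is called with its default settings, which append "_imp" indicator columns to the imputed data.

```r
library(VIM)  # assumes the VIM package is installed

set.seed(42)
# Hypothetical toy data: two numeric variables, one with missing values
toy <- data.frame(a = rnorm(40), b = rnorm(40))
toy$b[sample(40, 8)] <- NA

# kNN() fills in the missing values and, by default, adds logical
# indicator columns whose names end in "_imp"
imp <- kNN(toy)
names(imp)

# Boxplots of 'a' (pos = 1); the "_imp" columns mark which values of the
# other variable were imputed. interactive = FALSE avoids waiting for clicks.
pbox(imp, pos = 1, delimiter = "_imp", interactive = FALSE)
```

Because x must have column names for delimiter to work, a plain vector or an unnamed matrix would not be suitable input here.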

Value

Details

This plot consists of several boxplots. First, a standard boxplot of the variable of interest is produced. Second, boxplots grouped by observed and missing/imputed values according to selection are produced for the variable of interest.

Additionally, the frequencies of the missing/imputed values can be represented by numbers. If so, the first line corresponds to the observed values of the variable of interest and their distribution in the different groups, the second line to the missing/imputed values.

If interactive=TRUE, clicking in the left margin of the plot switches to the previous variable and clicking in the right margin switches to the next variable. Clicking anywhere else on the graphics device quits the interactive session.

TKRpbox behaves like pbox with selection="none", but uses tkrplot to embed the plot in a Tcl/Tk window. This is useful for drawing a large number of parallel boxes, because scrollbars allow moving from one part of the plot to another.
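The grouping described above can be tried on a small example. The following is a minimal sketch on hypothetical toy data (the data frame toy is not part of VIM): it draws the boxplots for the first variable, grouped by whether any of the other variables is missing, and switches off the interactive mode so that the call returns without waiting for mouse clicks.

```r
library(VIM)  # assumes the VIM package is installed

set.seed(1)
# Hypothetical toy data: three numeric variables, two with missing values
toy <- data.frame(a = rnorm(50), b = rnorm(50), c = rnorm(50))
toy$b[sample(50, 10)] <- NA
toy$c[sample(50, 5)] <- NA

# Variable of interest is 'a' (pos = 1); selection = "any" groups the
# observations by missingness in any of the remaining variables.
# numbers = TRUE prints the frequencies described in Details.
pbox(toy, pos = 1, selection = "any", numbers = TRUE, interactive = FALSE)
```

With selection = "none", the same call would instead produce one pair of boxes for each of the other variables.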

References

M. Templ, A. Alfons, P. Filzmoser (2012) Exploring incomplete data using visualization tools. Journal of Advances in Data Analysis and Classification, Online first. DOI: 10.1007/s11634-011-0102-y.

See Also

parcoordMiss

Examples

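library(VIM)  # attach the package so that pbox() and kNN() are found
data(chorizonDL, package = "VIM")

## for missing values
pbox(log(chorizonDL[, c(4,5,8,10,11,16:17,19,25,29,37,38,40)]))

## for imputed values: kNN() appends indicator columns ending in "_imp"
pbox(kNN(log(chorizonDL[, c(4,8,10,11,17,19,25,29,37,38,40)])),
     delimiter = "_imp")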