Boxplot of one variable of interest plus information about missing/imputed values in other variables.
pbox(
  x,
  delimiter = NULL,
  pos = 1,
  selection = c("none", "any", "all"),
  col = c("skyblue", "red", "red4", "orange", "orange4"),
  numbers = TRUE,
  cex.numbers = par("cex"),
  xlim = NULL,
  ylim = NULL,
  main = NULL,
  sub = NULL,
  xlab = NULL,
  ylab = NULL,
  axes = TRUE,
  frame.plot = axes,
  labels = axes,
  interactive = TRUE,
  ...
)
Value: a list as returned by graphics::boxplot().
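As a rough illustration of what such a list looks like, the sketch below calls graphics::boxplot() directly on made-up toy data (the vectors values and groups are hypothetical and not part of VIM). It only shows the structure of a boxplot() result; which groups pbox() puts into it depends on the arguments described here.

```r
# Toy data: one numeric variable split into two groups.
values <- c(1, 2, 3, 4, 5, 6, 7, 8, 20)
groups <- c("a", "a", "a", "a", "b", "b", "b", "b", "b")

# plot = FALSE returns the summary list without drawing anything.
b <- graphics::boxplot(values ~ groups, plot = FALSE)

names(b)   # components of the returned list
b$stats    # one column of box statistics per group
b$n        # number of non-missing observations per group
b$out      # values flagged as outliers
```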
x: a vector, matrix or data.frame.
delimiter: a character-vector to distinguish between variables and imputation-indices for imputed variables (therefore, x needs to have colnames()). If given, it is used to determine the corresponding imputation-index for any imputed variable (a logical-vector indicating which values of the variable have been imputed). If such imputation-indices are found, they are used for highlighting and the colors are adjusted according to the given colors for imputed variables (see col).
pos: a numeric value giving the index of the variable of interest. Additional variables in x are used for grouping according to missingness/number of imputed missings.
selection: the selection method for grouping according to missingness/number of imputed missings in multiple additional variables. Possible values are "none" (grouping according to missingness/number of imputed missings in every other variable that contains missing/imputed values), "any" (grouping according to missingness/number of imputed missings in any of the additional variables) and "all" (grouping according to missingness/number of imputed missings in all of the additional variables).
col: a vector of length five giving the colors to be used in the plot. The first color is used for the boxplots of the available data, the second/fourth are used for missing/imputed data, respectively, and the third/fifth color for the frequencies of missing/imputed values in both variables (see ‘Details’). If only one color is supplied, it is used for the boxplots for missing/imputed data, whereas the boxplots for the available data are transparent. If two colors are supplied, the second one is recycled.
numbers: a logical indicating whether the frequencies of missing/imputed values should be displayed (see ‘Details’).
cex.numbers: the character expansion factor to be used for the frequencies of the missing/imputed values.
xlim, ylim: axis limits.
main, sub: main and sub title.
xlab, ylab: axis labels.
axes: a logical indicating whether axes should be drawn on the plot.
frame.plot: a logical indicating whether a box should be drawn around the plot.
labels: either a logical indicating whether labels should be plotted below each box, or a character vector giving the labels.
interactive: a logical indicating whether variables can be switched interactively (see ‘Details’).
...: for pbox, further arguments and graphical parameters to be passed to graphics::boxplot() and other functions. For TKRpbox, further arguments to be passed to pbox.
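To make the role of the delimiter argument more concrete, here is a minimal base R sketch of how a delimiter such as "_imp" can separate variables from their imputation-indices by column name. The column names are hypothetical, and matching the delimiter as a suffix is an assumption made for this illustration; it is not taken from the pbox() source.

```r
# Hypothetical column names: two variables plus their imputation-indices.
cn <- c("Ca", "Mg", "Ca_imp", "Mg_imp")
delimiter <- "_imp"

# Assumption for this sketch: a name ending in the delimiter marks an index.
is_index <- endsWith(cn, delimiter)
variables <- cn[!is_index]
indices <- cn[is_index]

# Strip the delimiter to find the variable each index belongs to.
belongs_to <- substr(indices, 1, nchar(indices) - nchar(delimiter))
data.frame(index = indices, variable = belongs_to)
```

With names of this shape, each logical index column can be paired with the variable whose imputed values it flags.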
Andreas Alfons, Matthias Templ, modifications by Bernd Prantner
This plot consists of several boxplots. First, a standard boxplot of the variable of interest is produced. Second, boxplots grouped by observed and missing/imputed values according to selection are produced for the variable of interest.
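The grouping idea can be sketched in base R, without VIM. The toy data frame d and the group labels below are hypothetical, and the code reflects one reading of the "any" and "all" selection methods (missing in at least one versus in every additional variable); it is not the package's implementation.

```r
# Toy data: y is the variable of interest, a and b are additional variables.
d <- data.frame(
  y = c(1.2, 2.5, 3.1, 4.8, 5.0, 6.3),
  a = c(NA, 1, NA, 1, 1, NA),
  b = c(NA, NA, 1, 1, 1, NA)
)

# Missingness indicators for the additional variables only.
miss <- is.na(d[, c("a", "b")])

# "any": missing in at least one additional variable.
grp_any <- ifelse(rowSums(miss) > 0, "missing", "observed")
# "all": missing in every additional variable.
grp_all <- ifelse(rowSums(miss) == ncol(miss), "missing", "observed")

# Group sizes of the boxplots of y, computed without drawing.
boxplot(d$y ~ grp_any, plot = FALSE)$n
boxplot(d$y ~ grp_all, plot = FALSE)$n
```

Under this reading, "any" puts more observations into the missing group than "all" whenever some rows are missing in only part of the additional variables.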
Additionally, the frequencies of the missing/imputed values can be represented by numbers. If so, the first line corresponds to the observed values of the variable of interest and their distribution in the different groups, the second line to the missing/imputed values.
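The two lines of frequencies can be thought of as a small cross-table. The following sketch builds such a table in base R for a hypothetical data frame with one additional variable; the names d, y_status and a_group are illustrative only, and the exact layout drawn by pbox() may differ.

```r
# Toy data: y is the variable of interest, a is one additional variable.
d <- data.frame(
  y = c(1.2, NA, 3.1, 4.8, NA, 6.3, 7.7),
  a = c(NA, NA, 1, 1, 1, NA, 1)
)

y_status <- factor(ifelse(is.na(d$y), "missing", "observed"),
                   levels = c("observed", "missing"))
a_group <- factor(ifelse(is.na(d$a), "missing in a", "observed in a"),
                  levels = c("observed in a", "missing in a"))

# Row 1: observed values of y per group; row 2: missing values of y.
freq <- table(y_status, a_group)
freq
```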
If interactive=TRUE, clicking in the left margin of the plot results in switching to the previous variable and clicking in the right margin results in switching to the next variable. Clicking anywhere else on the graphics device quits the interactive session.
M. Templ, A. Alfons, P. Filzmoser (2012) Exploring incomplete data using visualization tools. Journal of Advances in Data Analysis and Classification, Online first. DOI: 10.1007/s11634-011-0102-y.
See also: parcoordMiss()
Other plotting functions: aggr(), barMiss(), histMiss(), marginmatrix(), marginplot(), matrixplot(), mosaicMiss(), pairsVIM(), parcoordMiss(), scattJitt(), scattMiss(), scattmatrixMiss(), spineMiss()
library(VIM)  # provides pbox() and kNN()
data(chorizonDL, package = "VIM")
## for missing values
pbox(log(chorizonDL[, c(4,5,8,10,11,16:17,19,25,29,37,38,40)]))
## for imputed values
pbox(kNN(log(chorizonDL[, c(4,8,10,11,17,19,25,29,37,38,40)])),
     delimiter = "_imp")