Create a bivariate jitter plot.
scattJitt(
x,
delimiter = NULL,
col = c("skyblue", "red", "red4", "orange", "orange4"),
alpha = NULL,
cex = par("cex"),
col.line = "lightgrey",
lty = "dashed",
lwd = par("lwd"),
numbers = TRUE,
cex.numbers = par("cex"),
main = NULL,
sub = NULL,
xlab = NULL,
ylab = NULL,
axes = TRUE,
frame.plot = axes,
labels = c("observed", "missing", "imputed"),
...
)
a data.frame
or matrix
with two columns.
a character-vector to distinguish between variables and
imputation-indices for imputed variables (therefore, x
needs to have
colnames()
). If given, it is used to determine the corresponding
imputation-index for any imputed variable (a logical-vector indicating which
values of the variable have been imputed). If such imputation-indices are
found, they are used for highlighting and the colors are adjusted according
to the given colors for imputed variables (see col
).
a vector of length five giving the colors to be used in the plot. The first color will be used for complete observations, the second/fourth color for missing/imputed values in only one variable, and the third/fifth color for missing/imputed values in both variables. If only one color is supplied, it is used for all. Else if two colors are supplied, the second one is recycled.
a numeric value between 0 and 1 giving the level of
transparency of the colors, or NULL
. This can be used to prevent
overplotting.
the character expansion factor for the plot characters.
the color for the lines dividing the plot region.
the line type for the lines dividing the plot region (see
graphics::par()
).
the line width for the lines dividing the plot region.
a logical indicating whether the frequencies of observed and missing/imputed values should be displayed (see ‘Details’).
the character expansion factor to be used for the frequencies of the observed and missing/imputed values.
main and sub title.
axis labels.
a logical indicating whether both axes should be drawn on the
plot. Use graphical parameter "xaxt"
or "yaxt"
to suppress
just one of the axes.
a logical indicating whether a box should be drawn around the plot.
a vector of length three giving the axis labels for the regions for observed, missing and imputed values (see ‘Details’).
further graphical parameters to be passed down (see
graphics::par()
).
Matthias Templ, modifications by Andreas Alfons and Bernd Prantner
The amount of observed and missing/imputed values is visualized by jittered points. Thereby the plot region is divided into up to four regions according to the existence of missing/imputed values in one or both variables. In addition, the amount of observed and missing/imputed values can be represented by a number.
M. Templ, A. Alfons, P. Filzmoser (2012) Exploring incomplete data using visualization tools. Journal of Advances in Data Analysis and Classification, Online first. DOI: 10.1007/s11634-011-0102-y.
Other plotting functions:
aggr()
,
barMiss()
,
histMiss()
,
marginmatrix()
,
marginplot()
,
matrixplot()
,
mosaicMiss()
,
pairsVIM()
,
parcoordMiss()
,
pbox()
,
scattMiss()
,
scattmatrixMiss()
,
spineMiss()
data(tao, package = "VIM")
## for missing values
scattJitt(tao[, c("Air.Temp", "Humidity")])
## for imputed values
scattJitt(kNN(tao[, c("Air.Temp", "Humidity")]), delimiter = "_imp")
Run the code above in your browser using DataLab