DescTools (version 0.99.19)

PlotBag: PlotBag, a Bivariate Boxplot

Description

compute.bagplot() computes an object describing a bagplot of a bivariate data set. plot.bagplot() plots a bagplot object. PlotBag() computes and plots a bagplot.

Usage

PlotBag(x, y, factor = 3, na.rm = FALSE, approx.limit = 300, show.outlier = TRUE, show.whiskers = TRUE, show.looppoints = TRUE, show.bagpoints = TRUE, show.loophull = TRUE, show.baghull = TRUE, create.plot = TRUE, add = FALSE, pch = 16, cex = 0.4, dkmethod = 2, precision = 1, verbose = FALSE, debug.plots = "no", col.loophull = "#aaccff", col.looppoints = "#3355ff", col.baghull = "#7799ff", col.bagpoints = "#000088", transparency = FALSE, ...)
PlotBagPairs(dm, trim = 0.0, main, numeric.only = TRUE, factor = 3, approx.limit = 300, pch = 16, cex = 0.8, precision = 1, col.loophull = "#aaccff", col.looppoints = "#3355ff", col.baghull = "#7799ff", col.bagpoints = "#000088", ...)
compute.bagplot(x, y, factor = 3, na.rm = FALSE, approx.limit = 300, dkmethod = 2, precision = 1, verbose = FALSE, debug.plots = "no")
# S3 method for class 'bagplot'
plot(x, show.outlier = TRUE, show.whiskers = TRUE, show.looppoints = TRUE, show.bagpoints = TRUE, show.loophull = TRUE, show.baghull = TRUE, add = FALSE, pch = 16, cex = 0.4, verbose = FALSE, col.loophull = "#aaccff", col.looppoints = "#3355ff", col.baghull = "#7799ff", col.bagpoints = "#000088", transparency = FALSE, ...)

Arguments

x
x values of a data set; in plot.bagplot: an object of class bagplot computed by compute.bagplot
y
y values of the data set
factor
factor by which the bag is enlarged to obtain the fence, which in turn defines the loop; default: 3
na.rm
if TRUE, NA values are removed; otherwise they are replaced by the median
approx.limit
if the number of data points exceeds approx.limit, a sample is used to compute some of the quantities; default: 300
show.outlier
if TRUE, outliers are shown
show.whiskers
if TRUE whiskers are shown
show.looppoints
if TRUE, loop points are plotted
show.bagpoints
if TRUE, bag points are plotted
show.loophull
if TRUE the loop is plotted
show.baghull
if TRUE the bag is plotted
create.plot
if FALSE no plot is created
add
if TRUE the bagplot is added to an existing plot
pch
sets the plotting character
cex
sets the character size
dkmethod
1 or 2; there are two methods of approximating the bag. Method 1 is very rough (based only on observations)
precision
precision of approximation, default: 1
verbose
automatic commenting of calculations
debug.plots
if enabled, additional plots describing intermediate results are constructed; default: "no"
col.loophull
color of loop hull
col.looppoints
color of the points of the loop
col.baghull
color of bag hull
col.bagpoints
color of the points of the bag
transparency
see the Details section
dm
x
trim
x
main
x
numeric.only
x
...
additional graphical parameters
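
The na.rm and approx.limit arguments can be pictured with a small base R sketch. The helper prepare_xy below is hypothetical and not part of DescTools. It assumes that each NA is replaced by the median of its own coordinate and that the sample drawn for large data sets has exactly approx.limit rows; the package may differ in both respects.

```r
# Hypothetical helper (not part of DescTools): mimics the documented
# behaviour of na.rm and approx.limit on a two-column data matrix.
prepare_xy <- function(x, y, na.rm = FALSE, approx.limit = 300) {
  xy <- cbind(x = x, y = y)
  if (na.rm) {
    # Drop every row that contains an NA.
    xy <- xy[stats::complete.cases(xy), , drop = FALSE]
  } else {
    # Assumption: each NA is replaced by the median of its own column.
    for (j in seq_len(ncol(xy))) {
      xy[is.na(xy[, j]), j] <- stats::median(xy[, j], na.rm = TRUE)
    }
  }
  # Assumption: large data sets are reduced to approx.limit sampled rows.
  if (nrow(xy) > approx.limit) {
    xy <- xy[sample(nrow(xy), approx.limit), , drop = FALSE]
  }
  xy
}

set.seed(1)
kept    <- prepare_xy(c(1, NA, 3, 10), c(2, 4, NA, 8), na.rm = TRUE)
filled  <- prepare_xy(c(1, NA, 3, 10), c(2, 4, NA, 8), na.rm = FALSE)
sampled <- prepare_xy(rnorm(1000), rnorm(1000))
```

With na.rm = TRUE only the two complete rows remain; with na.rm = FALSE all four rows are kept and the gaps are filled.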

Value

compute.bagplot returns an object of class bagplot that can be plotted by plot.bagplot(). An object of class bagplot is a list with the following elements:
center
a two-dimensional vector with the coordinates of the center
hull.center
a two-column matrix whose rows are the coordinates of the corners of the center region
hull.bag, hull.loop
the coordinates of the hull of the bag and of the hull of the loop
pxy.bag
the coordinates of the points of the bag
pxy.outer
a two-column matrix of the points that are within the fence
pxy.outlier
the outliers
hdepths
a vector with the depths of the data points
is.one.dim
TRUE if the data set is (nearly) one-dimensional; the dimensionality is decided by analysing the result of prcomp
prdata
the result of prcomp
xy
the data used for the bagplot; for very large data sets (more data points than approx.limit), a subset of the data is used to construct the bagplot
xydata
the input data structured as a two-column matrix
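
To give a feel for how a prcomp result can reveal a (nearly) one-dimensional data set, here is a minimal base R sketch. The helper is_nearly_one_dim and its tolerance are assumptions made for illustration; the rule actually used to set is.one.dim is not stated here and may differ.

```r
# Hypothetical helper: flags data as (nearly) one-dimensional when the
# second principal component carries almost no spread.
# The tolerance 1e-8 is an assumption, not the package's actual rule.
is_nearly_one_dim <- function(x, y, tol = 1e-8) {
  pr <- stats::prcomp(cbind(x, y))
  pr$sdev[2] / pr$sdev[1] < tol
}

# Points on a straight line: the second component has (numerically) zero spread.
line_flag <- is_nearly_one_dim(1:100, 1:100)

# A two-dimensional point cloud: both components have comparable spread.
set.seed(1)
cloud_flag <- is_nearly_one_dim(rnorm(100), rnorm(100))
```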

Details

A bagplot is a bivariate generalization of the well-known boxplot. It was proposed by Rousseeuw, Ruts, and Tukey. In the bivariate case, the box of the boxplot becomes a convex polygon, the bag of the bagplot. The bag contains 50 percent of all points. The fence separates points within the fence from points outside it and is computed by enlarging the bag. The loop is defined as the convex hull containing all points inside the fence. If all points lie on a straight line, you get a classical boxplot. PlotBag() plots bagplots that are very similar to the ones described in Rousseeuw et al.

Remarks: The two-dimensional median is approximated. For large data sets the error will be very small. On the other hand, it is not very wise to make a (graphical) summary of, e.g., 10 bivariate data points.
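
The relation between bag, fence, and loop can be sketched in base R. The example assumes that a center and a convex bag polygon are already known (a made-up square here) and that the fence is obtained by scaling the bag vertices away from the center by factor. The helper inside_convex and all data are hypothetical; this is not the code PlotBag() runs.

```r
# Assumed inputs: a center and a convex bag polygon whose vertices are
# listed counter-clockwise. Both are made up for illustration.
center <- c(0, 0)
bag    <- rbind(c(-1, -1), c(1, -1), c(1, 1), c(-1, 1))
factor <- 3

# Fence: move each bag vertex away from the center by `factor`.
fence <- sweep(sweep(bag, 2, center) * factor, 2, center, "+")

# TRUE if point p lies inside (or on the border of) a convex polygon
# with counter-clockwise vertices: p must be left of every edge.
inside_convex <- function(p, poly) {
  nxt <- poly[c(2:nrow(poly), 1), , drop = FALSE]
  cross <- (nxt[, 1] - poly[, 1]) * (p[2] - poly[, 2]) -
           (nxt[, 2] - poly[, 2]) * (p[1] - poly[, 1])
  all(cross >= 0)
}

pts <- rbind(c(0.5, 0.5), c(2, -2.5), c(-2.9, 1), c(4, 0), c(0, -3.5))
in_fence <- apply(pts, 1, inside_convex, poly = fence)

# Points outside the fence are outliers; the loop is the convex hull
# of the points inside the fence.
outliers  <- pts[!in_fence, , drop = FALSE]
inner     <- pts[in_fence, , drop = FALSE]
loop_hull <- inner[grDevices::chull(inner), , drop = FALSE]
```

Here the fence is the square from -3 to 3 in both coordinates, so the last two points fall outside it and the loop is spanned by the remaining three.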

If you want to plot multiple (overlapping) bagplots, you may want them to be semi-transparent. For this you can use the transparency flag. If transparency == TRUE, the alpha layer is set to '99' (hex). This makes the bagplots appear semi-transparent, but ONLY if the output device is PDF and was opened using: pdf(file="filename.pdf", version="1.4"). For this reason, the default is transparency == FALSE. This feature, as well as the arguments for specifying different colors, was proposed by Wouter Meuleman.
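
What an alpha layer of '99' (hex) means for the default colors can be checked in base R. The sketch below simply appends the two hex digits to each color string, which is an assumption about how the flag is realised, and uses col2rgb() to read back the alpha channel.

```r
# Default colors taken from the Usage section.
cols <- c(loophull = "#aaccff", looppoints = "#3355ff",
          baghull = "#7799ff", bagpoints = "#000088")

# Append the hex alpha value "99" (153 of 255, i.e. 60 percent opaque).
cols_alpha <- paste0(cols, "99")
names(cols_alpha) <- names(cols)

# col2rgb() reports the alpha channel that R parses from these strings.
alpha_values <- grDevices::col2rgb(cols_alpha, alpha = TRUE)["alpha", ]
```

Colors built this way can be passed to the col.* arguments or to any base graphics call on a device that supports transparency.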

References

P. J. Rousseeuw, I. Ruts, J. W. Tukey (1999): The bagplot: a bivariate boxplot. The American Statistician, vol. 53, no. 4, pp. 382-387.

See Also

boxplot

Examples

  # example: 100 random points and one outlier
  dat <- cbind(rnorm(100) + 100, rnorm(100) + 300)
  dat <- rbind(dat, c(105,295))
  PlotBag(dat,factor=2.5,create.plot=TRUE,approx.limit=300,
     show.outlier=TRUE,show.looppoints=TRUE,
     show.bagpoints=TRUE,dkmethod=2,
     show.whiskers=TRUE,show.loophull=TRUE,
     show.baghull=TRUE,verbose=FALSE)
  # example of Rousseeuw et al., see R-package rpart
  cardata <- structure(as.integer( c(2560,2345,1845,2260,2440,
   2285, 2275, 2350, 2295, 1900, 2390, 2075, 2330, 3320, 2885,
   3310, 2695, 2170, 2710, 2775, 2840, 2485, 2670, 2640, 2655,
   3065, 2750, 2920, 2780, 2745, 3110, 2920, 2645, 2575, 2935,
   2920, 2985, 3265, 2880, 2975, 3450, 3145, 3190, 3610, 2885,
   3480, 3200, 2765, 3220, 3480, 3325, 3855, 3850, 3195, 3735,
   3665, 3735, 3415, 3185, 3690, 97, 114, 81, 91, 113, 97, 97,
   98, 109, 73, 97, 89, 109, 305, 153, 302, 133, 97, 125, 146,
   107, 109, 121, 151, 133, 181, 141, 132, 133, 122, 181, 146,
   151, 116, 135, 122, 141, 163, 151, 153, 202, 180, 182, 232,
   143, 180, 180, 151, 189, 180, 231, 305, 302, 151, 202, 182,
   181, 143, 146, 146)), .Dim = as.integer(c(60, 2)),
   .Dimnames = list(NULL, c("Weight", "Disp.")))
  PlotBag(cardata,factor=3,show.baghull=TRUE,
    show.loophull=TRUE,precision=1, dkmethod=2)
  title("car data Chambers/Hastie 1992")
  # points of y=x*x
  PlotBag(x=1:30,y=(1:30)^2,verbose=FALSE,dkmethod=2)
  # one dimensional subspace
  PlotBag(x=1:100,y=1:100)