lessR (version 2.2)

bx: Color Boxplot

Description

Uses the standard R boxplot function, boxplot, to display a boxplot in color. Also displays the relevant statistics, such as the hinges, median, and IQR.
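
As a rough illustration of the statistics mentioned above, the following base R sketch computes the hinges, median, and IQR directly. It relies only on fivenum, IQR, and boxplot.stats from base R rather than on bx itself, and it does not attempt to reproduce the layout of the bx text output.

```r
# Base R sketch: the statistics that underlie a boxplot.
set.seed(1)                      # reproducible simulated data
y <- rnorm(100, mean = 50, sd = 10)
y[1] <- 90                       # force at least one outlier

five <- fivenum(y)               # min, lower hinge, median, upper hinge, max
names(five) <- c("min", "lower.hinge", "median", "upper.hinge", "max")

hinge.spread <- five[["upper.hinge"]] - five[["lower.hinge"]]
iqr <- IQR(y)                    # quantile-based; can differ slightly from the hinge spread

out <- boxplot.stats(y)$out      # points more than 1.5 * hinge spread beyond the hinges

print(five)
cat("IQR:", iqr, " hinge spread:", hinge.spread, "\n")
cat("outliers:", out, "\n")
```

The hinges from fivenum and the quartiles used by IQR follow slightly different definitions, which is why both values are printed.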

If the object provided is a data frame, then a boxplot is drawn for each numeric variable in the data frame and the results are written to a PDF file in the current working directory. The name of this file and its path are specified in the output.
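
How bx selects the numeric variables and names its output file is not documented here. The general idea can be sketched in base R as follows. The file name boxplots.pdf, the temporary directory, and the one-page-per-variable layout are all assumptions made for this illustration.

```r
set.seed(5)
mydata <- data.frame(X = rnorm(100), Y = rnorm(100), Z = rnorm(100),
                     C = rep(c("A", "B"), 50))

# keep only the names of the numeric columns
is.num <- vapply(mydata, is.numeric, logical(1))
num.vars <- names(mydata)[is.num]

# write one boxplot per page to a PDF file (temporary directory used here
# instead of the working directory, to avoid leaving files behind)
pdf.file <- file.path(tempdir(), "boxplots.pdf")   # hypothetical file name
pdf(pdf.file)
for (v in num.vars) {
  boxplot(mydata[[v]], horizontal = TRUE, col = "lightsteelblue",
          xlab = v, main = v)
}
invisible(dev.off())
cat("boxplots written to", pdf.file, "\n")
```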

Usage

bx(x=NULL, dframe=mydata, ...)

## S3 method for class 'data.frame': bx(x, ...)

## S3 method for class 'default': bx(x, col.box="lightsteelblue", col.pts=NULL, col.bg="ghostwhite", col.grid="grey85", horiz=TRUE, dotplot=FALSE, mag.axis=.85, xlab=NULL, main=NULL, digits.d=10, ...)

color.boxplot(...)

Arguments

x
Variable for which to construct the boxplot. Can be a data frame. If no variable is specified and dframe is not set, then the data frame mydata is assumed.
dframe
Optional data frame that contains the variable of interest, default is mydata.
col.box
Color of the box.
col.pts
Color of any points that designate outliers. By default this is the same color as the box.
col.bg
Color of the plot background.
col.grid
Color of the grid lines.
horiz
Orientation of the boxplot. Set FALSE for vertical.
dotplot
If TRUE, then place a dot plot (i.e., stripchart) over the box plot.
mag.axis
Scale magnification factor, which by default displays the axis values smaller than the axis labels.
xlab
Label for the value axis, which defaults to the variable's name.
main
Title of graph.
digits.d
Number of decimal digits displayed in the listing of the summary statistics.
...
Other graphics parameter values as processed by boxplot and par, including ylim to set the limits of the value axis, and lwd.
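
The dotplot option described above places a stripchart over the boxplot. How bx does this internally is not shown in this documentation. One way to get a similar display in base R, assuming only the standard boxplot and stripchart functions, is sketched here.

```r
set.seed(2)
y <- rnorm(100, mean = 50, sd = 10)

# horizontal boxplot; outlier symbols are suppressed so that outlying points
# are not drawn twice, and ylim keeps the full range of the data on the value axis
b <- boxplot(y, horizontal = TRUE, col = "lightsteelblue", outline = FALSE,
             ylim = range(y))

# overlay the individual values; jitter spreads out overlapping points
stripchart(y, method = "jitter", add = TRUE, pch = 16, col = "plum")
```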

Details

Unlike the standard R boxplot function, boxplot, the default here is a horizontal boxplot. Also, bx does not currently process a formula, so use the standard boxplot function for a formula in which a boxplot is displayed for a variable at each level of a second, usually categorical, variable.
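
For the formula case, a minimal base R sketch follows. It assumes a data frame with a numeric variable Y and a grouping variable C; both names are chosen only for this illustration.

```r
set.seed(3)
mydata <- data.frame(Y = rnorm(100), C = rep(c("A", "B"), 50))

# one boxplot of Y for each level of C; horizontal = TRUE mimics the bx default
b <- boxplot(Y ~ C, data = mydata, horizontal = TRUE, col = "lightsteelblue")
```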

If the variable is in a data frame, the data frame is assumed to be named mydata. If the data frame has a different name, then specify that name with the dframe option. Regardless of its name, the data frame need not be attached to reference the variable directly by its name, that is, there is no need for the mydata$name notation. If no variable is specified, then all numeric variables in the entire data frame are analyzed and the results are written to a PDF file.
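
This documentation does not say how bx resolves a bare variable name inside the data frame. One common base R idiom for this kind of lookup is sketched below. The helper get_var is hypothetical and is not part of lessR.

```r
# Hypothetical helper, not part of lessR: look up a bare variable name in a data frame.
get_var <- function(x, dframe = mydata) {
  # substitute() captures the unevaluated name; eval() looks it up in dframe first,
  # then falls back to the calling environment
  eval(substitute(x), envir = dframe, enclos = parent.frame())
}

set.seed(4)
mydata <- data.frame(X = rnorm(100), Y = rnorm(100))

x.values <- get_var(X)        # no mydata$X and no attach(mydata) needed
b <- boxplot(x.values, horizontal = TRUE, xlab = "X")
```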

Other graphic parameters are available to format the display, such as main for the title, and other parameters found in boxplot and par.

A labels data frame named mylabels, obtained from the rad function, can list the label for some or all of the variables in the data frame that contains the data for the analysis. If this labels data frame exists, then the corresponding variable label is listed as the title of the resulting plot, unless a specific label is listed with the main option. The variable label is also listed in the text output, next to the variable name.

To minimize white space around the boxplot, re-size the graphics window before or after creating the boxplot.

See Also

boxplot, par.

Examples

# simulate data and get at least one outlier
y <- rnorm(100,50,10)
y[1] <- 90


# -----------------------------
# boxplot for a single variable
# -----------------------------

# standard horizontal boxplot with all defaults
bx(y)

# vertical boxplot with plum color
bx(y, horiz=FALSE, col.box="plum")

# boxplot with outliers more strongly highlighted
bx(y, col.pts="red", xlab="My Variable")


# -----------------------------------------------
# boxplots for data frames and multiple variables
# -----------------------------------------------

# create data frame, mydata, to mimic reading data with rad function
# mydata contains both numeric and non-numeric data
mydata <- data.frame(rnorm(100), rnorm(100), rnorm(100), rep(c("A","B"),50))
names(mydata) <- c("X","Y","Z","C")
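# remove any same-named objects from the workspace so that X, Y, Z and C
#  are found only in mydata; warnings for objects that do not exist are suppressed
suppressWarnings(rm(X, Y, Z, C))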

# boxplot for variable X from data frame, referred to directly
bx(X)

# boxplot with superimposed dot plot (stripchart)
bx(X, dotplot=TRUE)

# boxplots for all numeric variables in data frame called mydata
bx()

# boxplots for all numeric variables in data frame called mydata
#  with specified options
bx(col.box="palegreen1", col.pts="plum")

# Use the subset function to specify a variable list
mysub <- subset(mydata, select=c(X,Y))
bx(dframe=mysub)
