rgr (version 1.0.4)

tbplot.by.var: Plot Vertical Tukey Boxplots for Variables

Description

Plots a series of vertical Tukey boxplots, one for each variable in the data. Optionally the y-axis may be scaled logarithmically. A variety of other plot options are available; see Details and Note below.

Usage

tbplot.by.var(xmat, log = FALSE, logx = FALSE, notch = FALSE, 
	xlab = "Measured Variables", ylab = "Reported Values", 
	main = "", label = NULL, plot.order = NULL, xpos = NA, 
	las = 1, cex = 1, adj = 0.5, colr = 8, ...)

Arguments

xmat
the data matrix or data frame containing the data.
log
if it is required to display the data with logarithmic (y-axis) scaling, set log = TRUE.
logx
if the positions of the Tukey boxplot fences are to be computed on the basis of log-transformed data, set logx = TRUE. For general usage, if log = TRUE then set logx = TRUE.
notch
determines whether the boxplots are notched, such that the notches indicate the 95% confidence intervals for the medians. The default is not to notch the boxplots; to have notches, set notch = TRUE.
xlab
a title for the x-axis, by default xlab = "Measured Variables".
ylab
a title for the y-axis, by default ylab = "Reported Values".
main
a main title may be added optionally above the display by setting main, e.g., main = "Kola Project, 1995".
label
provides an alternate set of labels for the boxplots along the x-axis. By default the character strings defining the factors (variables) are used. For example, label = c("Alt1", "Alt2", "Alt3").
plot.order
provides an alternate order for the boxplots. By default the boxplots are plotted in alphabetical order of the factor variables. For example, plot.order = c(2, 1, 3) will plot the 2nd alphabetically ordered factor in the 1st position, the 1st in the 2nd position, and the 3rd in the 3rd position.
xpos
the locations along the x-axis at which the individual vertical boxplots are plotted. By default this is set to NA, which causes equally spaced positions to be used, i.e., boxplot 1 plots at value 1 on the x-axis, boxplot 2 at value 2, etc.
las
controls whether the x-axis labels are written parallel to the x-axis, the default las = 1, or are written down from the x-axis by setting las = 2. See also, Details below.
cex
controls the size of the font used for the factor labels plotted along the x-axis. By default this is 1; however, if the labels are long it is sometimes necessary to use a smaller font, for example cex = 0.8 results in a font 80% of the default size.
adj
controls the justification of the x-axis labels. By default they are centred, adj = 0.5. To left justify them when the labels are written downwards, set adj = 0.
colr
by default the boxes are infilled in grey, colr = 8. If no infill is required, set colr = 0. See display.lty for the range of available colours.
...
further arguments to be passed to methods.
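
The effect of logx on the fence positions can be illustrated with a minimal base R sketch. It is not the rgr implementation: the helper tukey.fences and the sample vector x are hypothetical, and the fences are assumed to lie 1.5 times the interquartile range beyond the quartiles as computed by quantile(), whereas rgr may use hinges or a different quartile definition.

```r
# Hypothetical helper, not part of rgr: Tukey fences computed on raw or
# log10-transformed data. Assumes fences = quartiles -/+ 1.5 * IQR.
tukey.fences <- function(x, logx = FALSE) {
  z <- if (logx) log10(x) else x
  q <- quantile(z, c(0.25, 0.75), names = FALSE)
  f <- c(lower = q[1] - 1.5 * diff(q), upper = q[2] + 1.5 * diff(q))
  # Back-transform so the fences are on the original measurement scale
  if (logx) 10^f else f
}

# Illustrative, made-up, positively skewed data
x <- c(1, 2, 3, 4, 5, 8, 12, 20, 50, 400)

raw.fences <- tukey.fences(x)               # fences from untransformed data
log.fences <- tukey.fences(x, logx = TRUE)  # fences from log10 data

print(raw.fences)
print(log.fences)
```

For skewed data such as these, the lower fence computed on the raw values is negative and so cannot be drawn on a logarithmic axis, while the back-transformed log fences are always positive. This is consistent with the advice to set logx = TRUE whenever log = TRUE.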

Details

There are two ways to provide data to this function. Firstly, if all the variables in a data frame are to be displayed, and there are no factor variables, the data frame name can be entered for xmat. However, if there are factor variables, or only a subset of the variables is to be displayed, the data are entered via the cbind construct; see Examples below.

Long variable names can lead to display problems. Changing the las parameter from its default of las = 1, which plots subset labels parallel to the axis, to las = 2, which plots them perpendicular to the axis, can help. It may also help to use label and split the character string into two lines, e.g., by changing the string "Specific Conductivity" that was supplied to replace the variable name SC to "Specific\nConductivity". If this, or setting las = 2, causes a conflict with the x-axis title, if one is needed, the title can be moved down a line by using xlab = "\nPhysical soil properties". In both cases the \n forces the following text to be placed on the next lower line.

If there are more than 7 labels (variables) and no alternate labels are provided, las is set to 2; otherwise some variable names may fail to be displayed.

The notches in the boxplots indicate the 95% confidence intervals for the medians. When subset population sizes are small, the notches can extend beyond the upper and lower limits of the boxes, which indicate the middle 50% of the data. The confidence intervals are estimated using the binomial theorem. It can be argued that for small populations a normal approximation would be better. However, it was decided to remain with a non-parametric estimate, despite the fact that the calculation of the Tukey fence values involves normality assumptions.
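
A binomial, non-parametric confidence interval for a median can be sketched in base R as follows. This is the textbook order-statistic construction, shown only to illustrate the idea; it is an assumption that rgr computes its notches in a comparable way, and the helper median_ci is hypothetical.

```r
# Hypothetical helper, not part of rgr: distribution-free confidence
# interval for the median based on order statistics. The number of
# observations below the true median is Binomial(n, 0.5).
median_ci <- function(x, conf = 0.95) {
  x <- sort(x[!is.na(x)])
  n <- length(x)
  alpha <- 1 - conf
  # Smallest rank d with P(B <= d) >= alpha / 2, so P(B <= d - 1) < alpha / 2
  d <- qbinom(alpha / 2, n, 0.5)
  # With very few observations d can be 0; fall back to the data range,
  # in which case the achieved coverage may be below the requested level
  if (d < 1) d <- 1
  coverage <- 1 - 2 * pbinom(d - 1, n, 0.5)
  list(lower = x[d], upper = x[n - d + 1], coverage = coverage)
}

x <- 1:25          # illustrative data
ci <- median_ci(x)
print(ci)
```

As the number of observations falls, the rank d approaches 1 and the interval approaches the full range of the data. This is why notches built from such an interval can extend beyond the box for small subsets.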

See Also

tbplot, var2fact, ltdl.fix.df

Examples

## Load the rgr package and make test data kola.c available
library(rgr)
data(kola.c)
attach(kola.c)

## Display a simple Tukey boxplot for measured variables
tbplot.by.var(cbind(Co,Cu,Ni))

## Display a more appropriately labelled and scaled Tukey boxplot
tbplot.by.var(cbind(Co,Cu,Ni), log = TRUE, logx = TRUE,
	ylab = "Levels (mg/kg) in <2 mm C-horizon soil")

## Detach test data kola.c
detach(kola.c)

## Make test data ms.data1 available
data(ms.data1)

## Display variables in a data frame
tbplot.by.var(ms.data1, log=TRUE, logx = TRUE)
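
The labelling advice in Details could be applied along the following lines. This is a sketch only, using the arguments listed under Usage: the label text, axis titles, and the name element.labels are illustrative choices, and the call is guarded so that it runs only when rgr is installed.

```r
# Illustrative two-line labels; "\n" moves the following text to the next line
element.labels <- c("Cobalt\n(Co)", "Copper\n(Cu)", "Nickel\n(Ni)")

if (requireNamespace("rgr", quietly = TRUE)) {
  data(kola.c, package = "rgr")
  # Build the matrix with explicit column names instead of attaching kola.c
  xmat <- cbind(Co = kola.c$Co, Cu = kola.c$Cu, Ni = kola.c$Ni)
  # The leading "\n" in xlab moves the axis title down a line so that it
  # does not collide with the two-line labels
  rgr::tbplot.by.var(xmat, log = TRUE, logx = TRUE, notch = TRUE,
    label = element.labels, xlab = "\nElements",
    ylab = "Levels (mg/kg) in <2 mm C-horizon soil")
}
```

Because Co, Cu and Ni are already in alphabetical order, the labels are supplied in that same order.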
