rgr (version 1.1.15)

inset: An EDA Graphical and Statistical Summary

Description

Plots a three-panel graphical distributional summary for a data set, comprising a histogram and a cumulative normal percentage probability (CPP) plot, with a table of selected percentiles of the data and summary statistics placed between them. Optionally, the EDA graphics may be plotted with base 10 logarithmic scaling.

Usage

inset(xx, xlab = deparse(substitute(xx)), log = FALSE, xlim = NULL, 
	nclass = NULL, colr = NULL, ifnright = TRUE, table.cex = 0.7, ...)

Arguments

xx

name of the variable to be plotted.

xlab

by default the character string for xx is used for the x-axis plot titles. An alternate title can be displayed with xlab = "text string", see Examples.

log

to display the data with logarithmic (x-axis) scaling, set log = TRUE.

xlim

default limits of the x-axis are determined in the function. However, when the function is used stand-alone, the limits may be user-defined by setting xlim, see Note below.

nclass

the default procedure for preparing the histogram depends on sample size. Where N <= 500 the Scott (1979) rule is used, and where N > 500 the Freedman-Diaconis (1981) rule is used; both rules are resistant to the presence of outliers, and usually provide informative histograms. Alternatively, the user may define the histogram binning by setting nclass, i.e. nclass = "scott", nclass = "fd" or nclass = "sturges"; the last of these is designed for normal distributions (Scott, 1992). See Venables and Ripley (2001) for details.
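
The sample-size-dependent default described above can be sketched with the bin-count helpers in base R's grDevices package. This is only an illustration of the documented rule, not the code used inside inset; the helper name choose_nclass and the test data are hypothetical.

```r
# Hypothetical helper: picks a bin-count rule by sample size, mirroring the
# documented default (Scott for N <= 500, Freedman-Diaconis for N > 500).
choose_nclass <- function(x, rule = NULL) {
  x <- x[!is.na(x)]
  if (is.null(rule)) {
    rule <- if (length(x) <= 500) "scott" else "fd"
  }
  switch(rule,
    scott   = grDevices::nclass.scott(x),
    fd      = grDevices::nclass.FD(x),
    sturges = grDevices::nclass.Sturges(x),
    stop("unknown rule: ", rule)
  )
}

set.seed(1)
small <- rlnorm(200)   # N <= 500, so Scott's rule is selected
large <- rlnorm(2000)  # N > 500, so Freedman-Diaconis is selected

n_small   <- choose_nclass(small)
n_large   <- choose_nclass(large)
n_sturges <- choose_nclass(small, rule = "sturges")  # explicit override
```

The returned values are suggested numbers of classes; how inset converts them into the final breaks is not shown here.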

colr

by default the histogram is infilled in grey, colr = 8. If no infill is required, set colr = 0. See function display.lty for the range of available colours.

ifnright

controls where the sample size is plotted in the histogram display; by default this is in the upper right corner of the plot. If the data distribution is such that the upper left corner would be preferable, set ifnright = FALSE. If neither option generates an acceptable plot, setting ifnright = NULL suppresses the display of the data set size.

table.cex

controls the size of the text in the central panel, the summary statistics table. The default is table.cex = 0.7, which has been found to be optimal. If only parts of the columns are displayed rather than the entire table, see Note below.

...

further arguments to be passed to methods. For example, by default individual data points in the CPP plot are marked by a plus sign, pch = 3; if a cross or open circle is desired, set pch = 4 or pch = 1, respectively. See display.marks for all available symbols. Adding ifqs = TRUE results in horizontal and vertical dotted lines being plotted in the CPP plot at the three central quartiles and their values, respectively.
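
To show what the plotting symbol and the quartile guide lines refer to, the following base R sketch draws a cumulative normal probability plot by hand. It is an approximation under stated assumptions and not the code behind inset or cnpplt: the data are made up, and the plotting positions come from ppoints, which may differ from what the package uses.

```r
# Sketch of a cumulative normal probability plot with quartile guide lines.
set.seed(42)
x <- sort(rlnorm(100, meanlog = 3, sdlog = 0.5))  # made-up data

# Plotting positions expressed as standard normal quantiles
z <- qnorm(ppoints(length(x)))

# The three central quartiles: their data values and their positions
# on the probability axis
q_vals <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)
q_z    <- qnorm(c(0.25, 0.5, 0.75))

# Label the probability axis in percent
pct <- c(1, 5, 25, 50, 75, 95, 99)
plot(x, z, pch = 4, yaxt = "n",
     xlab = "x", ylab = "Cumulative probability (%)")
axis(2, at = qnorm(pct / 100), labels = pct)

# Dotted lines: horizontal at the quartile probabilities,
# vertical at the corresponding data values
abline(h = q_z, v = q_vals, lty = 3)
```

Here pch = 4 draws crosses; changing it to 1 or 3 gives open circles or plus signs, as described above.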

Details

A histogram is displayed on the left, and a cumulative normal percentage probability plot on the right. Between the two is a table of simple summary statistics, computed by gx.stats, including minimum, maximum and percentile values, robust estimates of standard deviation, and the mean, standard deviation and coefficient of variation. The plots may be displayed with logarithmic axes; however, the summary statistics are computed on the untransformed data, not with a logarithmic transform.
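
The kinds of statistics listed above can be approximated with base R, as in the sketch below. It does not reproduce the output of gx.stats: the data are made up, the choice of percentiles is arbitrary, and the two robust estimates of standard deviation shown (one from the MAD, one from the interquartile range) are assumptions about what such estimates might look like.

```r
set.seed(7)
x <- rlnorm(300, meanlog = 2, sdlog = 0.8)  # made-up, positively skewed data

stats <- list(
  n      = length(x),
  range  = range(x),                        # minimum and maximum
  pctl   = quantile(x, probs = c(0.02, 0.05, 0.25, 0.5, 0.75, 0.95, 0.98)),
  mad_sd = mad(x),                          # MAD scaled to estimate the SD
  iqr_sd = IQR(x) / 1.349,                  # IQR-based estimate of the SD
  mean   = mean(x),
  sd     = sd(x),
  cv_pct = 100 * sd(x) / mean(x)            # coefficient of variation, in %
)

# Log scaling would only change the plot axes; every value here is
# computed on the data in their original units.
str(stats)
```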

References

Venables, W.N. and Ripley, B.D., 2001. Modern Applied Statistics with S-Plus, 3rd Edition, Springer, 501 p. See p. 119 for a description of histogram bin selection computations.

See Also

gx.hist, cnpplt, gx.stats, inset.exporter, ltdl.fix.df, remove.na

Examples

# NOT RUN {
## Make test data available
data(kola.o)
attach(kola.o)

## Generates an initial display
inset(Cu)

## Provides a more appropriate display for publication
inset(Cu, xlab = "Cu (mg/kg) in <2 mm O-horizon soil", log = TRUE)

## NOTE: The example statistics table may not display correctly

## Detach test data
detach(kola.o)
# }
