robustbase (version 0.95-1)

adjboxStats: Statistics for Skewness-adjusted Boxplots

Description

Computes the “statistics” for producing boxplots adjusted for skewed distributions as proposed in Hubert and Vandervieren (2008), see adjbox.

Usage

adjboxStats(x, coef = 1.5, a = -4, b = 3, do.conf = TRUE, do.out = TRUE,
            ...)

Value

A list with the components

stats

a vector of length 5, containing the extreme of the lower whisker, the lower hinge, the median, the upper hinge and the extreme of the upper whisker.

n

the number of observations

conf

the lower and upper extremes of the ‘notch’ (if(do.conf)). See boxplot.stats.

fence

length 2 vector of interval boundaries which define the non-outliers, and hence the whiskers of the plot.

out

the values of any data points which lie beyond the fence, and hence beyond the extremes of the whiskers.
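How fence, out and the whisker extremes in stats relate can be sketched in base R. This is an illustration under stated assumptions, not the package's code: split_by_fence() is a hypothetical helper, the fence is supplied by hand rather than computed, and points lying exactly on a fence boundary are treated as non-outliers.

```r
# Hypothetical helper (not part of robustbase): split data by a given fence.
split_by_fence <- function(x, fence) {
  x <- x[!is.na(x)]
  is_out <- x < fence[1] | x > fence[2]      # strictly beyond the fence
  list(out      = x[is_out],                 # potential outliers
       whiskers = range(x[!is_out]))         # most extreme non-outliers
}

res <- split_by_fence(c(1, 2, 3, 4, 100), fence = c(-2, 6))
res$out       # the single point beyond the fence
res$whiskers  # whisker extremes stay inside the fence
```

The whiskers end at actual data values inside the fence, not at the fence boundaries themselves.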

Arguments

x

a numeric vector for which adjusted boxplot statistics are computed.

coef

number determining how far ‘whiskers’ extend out from the box, see boxplot.stats.

a, b

scaling factors multiplied by the medcouple mc() to determine outlier boundaries; see the references.

do.conf, do.out

logicals; if FALSE, the conf or out component respectively will be empty in the result.

...

further optional arguments to be passed to mc(), such as doReflect.

Author

R Core Development Team (boxplot.stats); adapted by Tobias Verbeke and Martin Maechler.

Details

Given the quartiles \(Q_1\), \(Q_3\), the interquartile range \(\Delta Q := Q_3 - Q_1\), the medcouple \(M :=\) mc(x), and \(c =\) coef, the “fence” is defined for \(M \ge 0\) as
$$[Q_1 - c\, e^{a \cdot M} \Delta Q,\; Q_3 + c\, e^{b \cdot M} \Delta Q],$$
and for \(M < 0\) as
$$[Q_1 - c\, e^{-b \cdot M} \Delta Q,\; Q_3 + c\, e^{-a \cdot M} \Delta Q].$$
All observations in x outside the fence, the “potential outliers”, are returned in out.

Note that a typo in robustbase versions up to 0.7-8, affecting the (rare, left-skewed) case where mc(x) < 0, led to a “fence” not wide enough in the upper part, and hence fewer outliers there.
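The fence definition in this section can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: adj_fence() is a hypothetical helper, and the quartiles and the medcouple value M are passed in directly rather than computed from data with mc(). The reflection check assumes that the medcouple changes sign when the data are negated.

```r
# Hypothetical helper (not part of robustbase): fence from Q1, Q3 and a
# medcouple value M that is supplied directly instead of computed by mc().
adj_fence <- function(q1, q3, M, coef = 1.5, a = -4, b = 3) {
  dq <- q3 - q1                               # interquartile range, Delta Q
  if (M >= 0)
    c(q1 - coef * exp(a * M) * dq, q3 + coef * exp(b * M) * dq)
  else
    c(q1 - coef * exp(-b * M) * dq, q3 + coef * exp(-a * M) * dq)
}

adj_fence(1, 3, M = 0)      # M = 0: reduces to the usual 1.5 * IQR rule
adj_fence(1, 3, M = 0.2)    # right skew: lower fence moves in, upper moves out

# Reflecting the data (x -> -x) swaps and negates the quartiles and flips
# the sign of M; the fence is then the negated, reversed original fence.
f_pos <- adj_fence(1, 3, M = 0.2)
f_neg <- adj_fence(-3, -1, M = -0.2)
all.equal(f_pos, rev(-f_neg))
```

With the default a = -4 and b = 3, a positive medcouple shortens the fence on the lower side and lengthens it on the upper side; a negative medcouple does the opposite.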

See Also

adjbox(), the main function that uses this one, which also gives the references; see also boxplot.stats.

Examples

library(robustbase)
data(condroz)
adjboxStats(ccA <- condroz[,"Ca"])
adjboxStats(ccA, doReflect = TRUE) # small difference in fence

## Test reflection invariance [was not ok, up to and including robustbase_0.7-8]
a1 <- adjboxStats( ccA, doReflect = TRUE)
a2 <- adjboxStats(-ccA, doReflect = TRUE)

nm1 <- c("stats", "conf", "fence")
stopifnot(all.equal(       a1[nm1],
                    lapply(a2[nm1], function(u) rev(-u))),
          all.equal(a1[["out"]], -a2[["out"]]))
