nclass: Compute the Number of Classes for a Histogram

Description

Compute the number of classes for a histogram.

Usage

nclass.Sturges(x)
nclass.scott(x)
nclass.FD(x)

Arguments

x: a data vector.

Value

The suggested number of classes.

Details

nclass.Sturges uses Sturges' formula, implicitly basing bin sizes on the range of the data.
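As a rough illustration, Sturges' formula can be written out by hand. This is a minimal sketch, assuming the usual statement of the formula, ceiling(log2(n) + 1); the helper name sturges_by_hand is hypothetical and is not part of any package.

```r
# Hand-rolled sketch of Sturges' formula.
# sturges_by_hand is a hypothetical helper, not part of grDevices.
sturges_by_hand <- function(x) {
  # The count depends only on the sample size n = length(x)
  ceiling(log2(length(x)) + 1)
}

set.seed(1)
x <- stats::rnorm(1111)

sturges_by_hand(x)            # 12, since log2(1111) lies between 10 and 11
grDevices::nclass.Sturges(x)  # built-in, for comparison
```

Because only length(x) enters the computation, any two vectors of the same length get the same suggested number of classes.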

nclass.scott uses Scott's choice for a normal distribution, based on the estimate of the standard deviation of the data; if that estimate is zero, it returns 1.
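Scott's choice can be sketched in the same way. The sketch below assumes the common normal-reference bin width 3.5 * s * n^(-1/3), where s is the sample standard deviation, and divides the data range by that width. The helper scott_by_hand is hypothetical; the built-in function may differ in detail.

```r
# Sketch of Scott's normal-reference rule.
# scott_by_hand is a hypothetical helper, not part of grDevices.
scott_by_hand <- function(x) {
  # Assumed bin width: 3.5 * sample standard deviation * n^(-1/3)
  h <- 3.5 * stats::sd(x) * length(x)^(-1/3)
  # Zero spread means all values coincide, so a single class is enough
  if (h > 0) max(1, ceiling(diff(range(x)) / h)) else 1
}

set.seed(1)
x <- stats::rnorm(1111)

scott_by_hand(x)             # hand computation
grDevices::nclass.scott(x)   # built-in, for comparison
scott_by_hand(rep(1, 11))    # constant data gives 1
```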

nclass.FD uses the Freedman-Diaconis choice based on the inter-quartile range, computed as IQR(signif(x, 5)). If that is zero, it uses increasingly extreme symmetric quantiles, up to c(1,511)/512; if that difference is still zero, it reverts to Scott's choice.
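The main case of the Freedman-Diaconis choice, where the inter-quartile range is non-zero, can be sketched as follows. This assumes the usual bin width 2 * IQR * n^(-1/3) and deliberately omits the fallback to more extreme quantiles. The helper fd_by_hand is hypothetical.

```r
# Sketch of the Freedman-Diaconis rule for the non-degenerate case only.
# fd_by_hand is a hypothetical helper, not part of grDevices.
fd_by_hand <- function(x) {
  # Assumed bin width: 2 * IQR * n^(-1/3), with x rounded to
  # 5 significant digits before taking the IQR
  h <- 2 * stats::IQR(signif(x, 5)) * length(x)^(-1/3)
  # The zero-IQR fallback is not reproduced in this sketch
  stopifnot(h > 0)
  ceiling(diff(range(x)) / h)
}

set.seed(1)
x <- stats::rnorm(1111)

fd_by_hand(x)            # hand computation
grDevices::nclass.FD(x)  # built-in, for comparison
```

Since the IQR is less sensitive to outliers than the standard deviation, this rule tends to give narrower bins, and hence more classes, than Scott's choice on heavy-tailed data.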

References

Venables, W. N. and Ripley, B. D. (2002). Modern Applied Statistics with S-PLUS. Springer, page 112.

Freedman, D. and Diaconis, P. (1981). On the histogram as a density estimator: \(L_2\) theory. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 57, 453–476. doi:10.1007/BF01025868.

Scott, D. W. (1979). On optimal and data-based histograms. Biometrika, 66, 605–610. doi:10.2307/2335182.

Scott, D. W. (1992). Multivariate Density Estimation: Theory, Practice, and Visualization. Wiley.

Sturges, H. A. (1926). The choice of a class interval. Journal of the American Statistical Association, 21, 65–66. doi:10.1080/01621459.1926.10502161.

Examples

set.seed(1)
x <- stats::rnorm(1111)
nclass.Sturges(x)

## Compare them:
NC <- function(x) c(Sturges = nclass.Sturges(x),
      Scott = nclass.scott(x), FD = nclass.FD(x))
NC(x)
onePt <- rep(1, 11)
NC(onePt) # no longer gives NaN