var: Variance and Standard Deviation

Description

Generic functions for the variance and standard deviation, and methods for individual and grouped data.

The default methods for individual data are the functions from the stats package.

Usage

var(x, ...)
# S3 method for default
var(x, y = NULL, na.rm = FALSE, use, ...)
# S3 method for grouped.data
var(x, ...)
sd(x, ...)
# S3 method for default
sd(x, na.rm = FALSE, ...)
# S3 method for grouped.data
sd(x, ...)

Value

A named vector of variances or standard deviations.

Arguments

x: a vector or matrix of individual data, or an object of class "grouped.data".
y: see stats::var.
na.rm: see stats::var.
use: see stats::var.
...: further arguments passed to or from other methods.

Author

Vincent Goulet vincent.goulet@act.ulaval.ca. Variance and standard deviation methods for grouped data contributed by Walter Garcia-Fontes walter.garcia@upf.edu.

Details

This page documents variance and standard deviation computations for grouped data. For individual data, see var and sd from the stats package.

For grouped data with group boundaries $c_0, c_1, \dots, c_r$ and group frequencies $n_1, \dots, n_r$, var computes the sample variance $$\frac{1}{n - 1} \sum_{j = 1}^r n_j (a_j - m_1)^2,$$ where $a_j = (c_{j - 1} + c_j)/2$ is the midpoint of the $j$th interval, $m_1$ is the sample mean (or sample first moment) of the data, and $n = \sum_{j = 1}^r n_j$. The sample standard deviation is the square root of the sample variance.
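
As a small worked illustration of the formula above, consider hypothetical grouped data with boundaries $(c_0, \dots, c_4) = (0, 2, 4, 6, 8)$ and frequencies $(n_1, \dots, n_4) = (1, 5, 3, 2)$, the same values used in the second example in the Examples section. The midpoints are $(a_1, \dots, a_4) = (1, 3, 5, 7)$ and $n = 11$, so $$m_1 = \frac{1 \cdot 1 + 5 \cdot 3 + 3 \cdot 5 + 2 \cdot 7}{11} = \frac{45}{11} \approx 4.0909$$ and $$\sum_{j = 1}^4 n_j (a_j - m_1)^2 = \sum_{j = 1}^4 n_j a_j^2 - n m_1^2 = 219 - \frac{2025}{11} = \frac{384}{11}.$$ The sample variance is therefore $$\frac{1}{10} \cdot \frac{384}{11} = \frac{192}{55} \approx 3.4909,$$ and the sample standard deviation is $\sqrt{192/55} \approx 1.8684$.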

The sample variance for grouped data differs from the variance computed from the empirical raw moments with emm in two respects. First, it takes into account the degrees of freedom, dividing by $n - 1$ rather than $n$. Second, it applies Sheppard's correction factor to compensate for the overestimation of the true variation in the data. For groups of equal width $k$, Sheppard's correction factor is equal to $-k^2/12$.
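
The relation between the two computations can be sketched as follows. This assumes, as the emm documentation describes, that emm computes grouped-data moments by treating the observations as uniformly distributed within each group. The symbols $m_2$ (second raw moment from emm) and $s^2$ (value returned by var) are notation introduced here for illustration. Under that assumption, a group of width $k$ contributes $a_j^2 + k^2/12$ to the second raw moment, so for groups of equal width $$m_2 - m_1^2 = \frac{1}{n} \sum_{j = 1}^r n_j (a_j - m_1)^2 + \frac{k^2}{12},$$ and hence $$\frac{n - 1}{n}\, s^2 = (m_2 - m_1^2) - \frac{k^2}{12}.$$ For example, with boundaries $0, 2, 4, 6, 8$ and frequencies $1, 5, 3, 2$ (so $k = 2$ and $n = 11$), one finds $m_1 = 45/11$, $m_2 = 668/33$ and $s^2 = 192/55$, and both sides agree: $$\frac{10}{11} \cdot \frac{192}{55} = \frac{384}{121} \approx 3.1736 \quad \text{and} \quad \frac{668}{33} - \frac{2025}{121} - \frac{4}{12} = \frac{384}{121}.$$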

References

Klugman, S. A., Panjer, H. H. and Willmot, G. E. (1998), Loss Models, From Data to Decisions, Wiley.

Heumann, C., Schomaker, M. and Shalabh (2016), Introduction to Statistics and Data Analysis, Springer.

Examples

library(actuar)  # provides gdental, grouped.data() and emm()
data(gdental)
var(gdental)
sd(gdental)

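## Illustration of Sheppard's correction factor
## (groups of equal width k = 2, so the correction is k^2/12 = 4/12)
cj <- c(0, 2, 4, 6, 8)  # group boundaries
nj <- c(1, 5, 3, 2)     # group frequencies
gd <- grouped.data(Group = cj, Frequency = nj)
## The two expressions below should give the same value
(sum(nj) - 1)/sum(nj) * var(gd)
(emm(gd, 2) - emm(gd)^2) - 4/12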
