This page documents variance and standard deviation computations for
grouped data. For individual data, see var and sd from the stats
package.
For grouped data with group boundaries \(c_0, c_1, \dots,
c_r\) and group frequencies \(n_1, \dots,
n_r\), var
computes the sample variance
$$\frac{1}{n - 1} \sum_{j = 1}^r n_j (a_j - m_1)^2,$$
where
\(a_j = (c_{j - 1} + c_j)/2\)
is the midpoint of the \(j\)th interval,
\(m_1\) is the sample mean (or sample first moment) of the data,
and
\(n = \sum_{j = 1}^r n_j\).
The sample standard deviation is the square root of the sample
variance.
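
As a minimal sketch, the formula above can be evaluated directly in base R. The boundaries `cj` and frequencies `nj` below are hypothetical values chosen for illustration, and the code does not call the package's own methods.

```r
# Hypothetical grouped data: boundaries c_0, ..., c_r and frequencies n_1, ..., n_r
cj <- c(0, 10, 20, 30, 40)
nj <- c(3, 7, 6, 4)

n  <- sum(nj)                              # total number of observations
aj <- (head(cj, -1) + tail(cj, -1)) / 2    # midpoints a_j of the intervals
m1 <- sum(nj * aj) / n                     # sample mean based on the midpoints

grouped_var <- sum(nj * (aj - m1)^2) / (n - 1)  # sample variance
grouped_sd  <- sqrt(grouped_var)                # sample standard deviation
```

With these illustrative numbers the midpoints are 5, 15, 25 and 35, the mean is 20.5, and the variance is 1895/19.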
The sample variance for grouped data differs from the variance
computed from the empirical raw moments with emm in two respects.
First, it takes into account the degrees of freedom. Second,
it applies Sheppard's correction factor to compensate for the
overestimation of the true variation in the data. For groups of equal
width \(k\), Sheppard's correction factor is equal to \(-k^2/12\).
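
The two differences can be checked numerically with a short base R sketch. It assumes that the raw moments are obtained by treating observations as uniformly spread within each group; that assumption, the data, and all variable names are illustrative and not taken from the package source.

```r
# Hypothetical grouped data with equal group width k = 10
cj <- c(0, 10, 20, 30, 40)
nj <- c(3, 7, 6, 4)
n  <- sum(nj)
lo <- head(cj, -1)                         # lower boundaries c_{j-1}
hi <- tail(cj, -1)                         # upper boundaries c_j
k  <- 10                                   # common group width

# Raw moments under the (assumed) uniform-within-group model
raw1 <- sum(nj * (hi^2 - lo^2) / (2 * (hi - lo))) / n
raw2 <- sum(nj * (hi^3 - lo^3) / (3 * (hi - lo))) / n
var_raw <- raw2 - raw1^2                   # variance from raw moments

# Apply Sheppard's correction, then the degrees-of-freedom adjustment
var_adj <- (var_raw - k^2 / 12) * n / (n - 1)

# Midpoint-based sample variance, for comparison
aj <- (lo + hi) / 2
var_mid <- sum(nj * (aj - raw1)^2) / (n - 1)
```

Under these assumptions `var_adj` and `var_mid` agree, which suggests that subtracting \(k^2/12\) and rescaling by \(n/(n - 1)\) is what separates the two variances when all groups have the same width.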