Let \(\underline{z} = z_1, z_2, \ldots, z_n\) denote a vector of \(n\)
observations from a normal distribution with parameters mean=0 and sd=1.
That is, \(\underline{z}\) denotes a vector of
\(n\) observations from a standard normal distribution. Let
\(z_{(r)}\) denote the \(r\)'th order statistic of \(\underline{z}\),
for \(r = 1, 2, \ldots, n\). The probability density function of
\(z_{(r)}\) is given by:
$$f_{r,n}(t) = \frac{n!}{(r-1)!(n-r)!} [\Phi(t)]^{r-1} [1 - \Phi(t)]^{n-r} \phi(t) \;\;\;\;\;\; (1)$$
where \(\Phi\) and \(\phi\) denote the cumulative distribution function and
probability density function of the standard normal distribution, respectively
(Johnson et al., 1994, p.93). Thus, the expected value of \(z_{(r)}\) is given by:
$$E(r, n) = E[z_{(r)}] = \int_{-\infty}^{\infty} t f_{r,n}(t) dt \;\;\;\;\;\; (2)$$
It can be shown that if \(n\) is odd, then
$$E[(n+1)/2, n] = 0 \;\;\;\;\;\; (3)$$
Also, for all values of \(n\),
$$E(r, n) = -E(n-r+1, n) \;\;\;\;\;\; (4)$$
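As an illustration of Equations (1) through (4), the following sketch evaluates the density and its expected value numerically using only base R. The helper names `ord_stat_density` and `ord_stat_mean` are hypothetical and are not part of the package; adaptive quadrature via `integrate()` is used here purely for checking and is not one of the methods described in this section.

```r
# Density of the r-th order statistic of n standard normal draws, Equation (1).
# The normalizing constant is computed on the log scale with lgamma(),
# using lgamma(k + 1) = log(k!).
ord_stat_density <- function(t, r, n) {
  log_const <- lgamma(n + 1) - lgamma(r) - lgamma(n - r + 1)
  exp(log_const +
        (r - 1) * pnorm(t, log.p = TRUE) +
        (n - r) * pnorm(t, lower.tail = FALSE, log.p = TRUE) +
        dnorm(t, log = TRUE))
}

# Expected value of the r-th order statistic, Equation (2),
# approximated by adaptive quadrature over the whole real line.
ord_stat_mean <- function(r, n) {
  integrate(function(t) t * ord_stat_density(t, r, n),
            lower = -Inf, upper = Inf)$value
}

e_median <- ord_stat_mean(3, 5)  # Equation (3): should be close to 0
e_low    <- ord_stat_mean(1, 5)
e_high   <- ord_stat_mean(5, 5)  # Equation (4): e_low should be close to -e_high
```

For \(n = 2\) the expected value of the smaller observation is known in closed form, \(-1/\sqrt{\pi}\), which gives a convenient check on the quadrature.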
The function evNormOrdStatsScalar computes the value of \(E(r,n)\) for
user-specified values of \(r\) and \(n\).
The function evNormOrdStats computes the values of \(E(r,n)\) for all
values of \(r\) (i.e., for \(r = 1, 2, \ldots, n\))
for a user-specified value of \(n\).
Exact Method Based on Royston's Approximation to the Integral (method="royston")
When method="royston", the integral in Equation (2) above is approximated by
computing the value of the integrand between the values of lower and -lower
using increments of inc, then summing these values and
multiplying by inc. In particular, the integrand is restructured as:
$$t \; f_{r,n}(t) = t \; \exp\{\log(n!) - \log[(r-1)!] - \log[(n-r)!] + (r-1)\log[\Phi(t)] + (n-r)\log[1 - \Phi(t)] + \log[\phi(t)]\} \;\;\; (5)$$
By default, as per Royston (1982), the integrand is evaluated between -9 and 9 in
increments of 0.025. The approximation is computed this way for values of
\(r\) between \(1\) and \([n/2]\), where \([x]\) denotes the floor of \(x\).
If \(r > [n/2]\), then the approximation is computed for \(E(n-r+1, n)\) and
Equation (4) is used.
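The summation described above can be sketched in base R as follows. This is a minimal illustration written from the description in this section, not the package's implementation; the function name `ev_royston_sketch` is hypothetical, and its arguments `lower` and `inc` simply mirror the argument names mentioned above.

```r
# Riemann-sum approximation to E(r, n): evaluate the integrand on a grid from
# lower to -lower in steps of inc, sum the values, and multiply by inc.
ev_royston_sketch <- function(r, n, lower = -9, inc = 0.025) {
  # Equation (3): the middle order statistic of an odd-sized sample has mean 0.
  if (n %% 2 == 1 && r == (n + 1) / 2) return(0)
  # Equation (4): for r above floor(n/2), compute the mirrored case and negate.
  if (r > floor(n / 2)) {
    return(-ev_royston_sketch(n - r + 1, n, lower = lower, inc = inc))
  }
  t <- seq(lower, -lower, by = inc)
  # Log of the density, as in the restructured integrand; lgamma(k + 1) = log(k!).
  log_density <- lgamma(n + 1) - lgamma(r) - lgamma(n - r + 1) +
    (r - 1) * pnorm(t, log.p = TRUE) +
    (n - r) * pnorm(t, lower.tail = FALSE, log.p = TRUE) +
    dnorm(t, log = TRUE)
  sum(t * exp(log_density)) * inc
}

e_min_of_two <- ev_royston_sketch(1, 2)  # known closed form: -1 / sqrt(pi)
```

Working on the log scale keeps the factorial terms from overflowing for large \(n\), which is the practical reason for restructuring the integrand before exponentiating.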
Note that Equation (1) in Royston (1982) differs from Equations (1) and (2) above
because Royston's paper is based on the \(r^{th}\) largest value,
not the \(r^{th}\) order statistic.
Royston (1982) states that this algorithm “is accurate to at least seven decimal
places on a 36-bit machine,” that it has been validated up to a sample size
of \(n=2000\), and that the accuracy for \(n > 2000\) may be improved by
reducing the value of the argument inc. Note that making
inc smaller will increase the computation time.
Approximation Based on Blom's Method (method="blom")
When method="blom", the following approximation to \(E(r,n)\),
proposed by Blom (1958, pp. 68-75), is used:
$$E(r, n) \approx \Phi^{-1}\left(\frac{r - \alpha}{n - 2\alpha + 1}\right) \;\;\;\;\;\; (6)$$
By default, \(\alpha = 3/8 = 0.375\). This approximation is quite accurate.
For example, for \(n \ge 2\), the approximation is accurate to the first decimal place,
and for \(n \ge 9\) it is accurate to the second decimal place.
Harter (1961) discusses appropriate values of \(\alpha\) for various sample sizes
\(n\) and values of \(r\).
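Blom's formula translates directly into a single call to `qnorm()`. The sketch below is an illustration only; the function name `blom_sketch` is hypothetical, and `alpha` defaults to the value 3/8 stated above.

```r
# Blom's approximation to E(r, n) for r = 1, ..., n:
# the standard normal quantile of (r - alpha) / (n - 2 * alpha + 1).
blom_sketch <- function(n, alpha = 3 / 8) {
  r <- seq_len(n)
  qnorm((r - alpha) / (n - 2 * alpha + 1))
}

blom_scores <- blom_sketch(5)
# The plotting positions for r and n - r + 1 sum to 1, so the scores are
# antisymmetric about the middle, matching Equation (4).
```

Because the plotting positions are symmetric about 1/2, the approximation reproduces Equations (3) and (4) exactly regardless of the value of \(\alpha\).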
Approximation Based on Monte Carlo Simulation (method="mc")
When method="mc", Monte Carlo simulation is used to estimate the expected value
of the \(r^{th}\) order statistic. That is, \(N =\) nmc trials are run in which,
for each trial, a random sample of \(n\) standard normal observations is
generated and the \(r^{th}\) order statistic is computed. Then, the average value
of this order statistic over all \(N\) trials is computed, along with a
confidence interval for the expected value, assuming an approximately
normal distribution for the mean of the order statistic (the confidence interval
is computed by supplying the simulated values of the \(r^{th}\) order statistic
to the function enorm).
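The simulation described above can be sketched in base R as follows. This is a rough illustration, not the package's code: the function name `mc_ord_stat_sketch`, the argument `conf_level`, and the default of 2000 trials are assumptions made for the example, and an ordinary t-based interval for the mean is swapped in for the call to enorm.

```r
# Monte Carlo estimate of E(r, n) with a confidence interval for the mean.
mc_ord_stat_sketch <- function(r, n, nmc = 2000, conf_level = 0.95) {
  # One trial: draw n standard normals, sort them, keep the r-th smallest.
  sims <- replicate(nmc, sort(rnorm(n))[r])
  est <- mean(sims)
  # Standard error of the mean over the nmc trials.
  se <- sd(sims) / sqrt(nmc)
  # Assumed here: a two-sided t interval with nmc - 1 degrees of freedom.
  half_width <- qt(1 - (1 - conf_level) / 2, df = nmc - 1) * se
  list(estimate = est, conf_int = c(est - half_width, est + half_width))
}

set.seed(1)  # arbitrary seed so the run is repeatable
mc_result <- mc_ord_stat_sketch(r = 1, n = 5)
```

The width of the interval shrinks in proportion to \(1/\sqrt{N}\), so each additional decimal place of precision costs roughly a hundredfold increase in the number of trials.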
NOTE: This method has not been optimized for large sample sizes \(n\)
(i.e., large values of the argument n) and/or a large number of
Monte Carlo trials \(N\) (i.e., large values of the argument nmc) and
may take a long time to execute in these cases.