If x
contains any missing (NA
), undefined (NaN
) or
infinite (Inf
, -Inf
) values, they will be removed prior to
performing the estimation.
Let \(\underline{x} = (x_1, x_2, \ldots, x_n)\) be a vector of
\(n\) observations from a generalized extreme value distribution with
parameters location=
\(\eta\), scale=
\(\theta\), and
shape=
\(\kappa\).
Estimation
Maximum Likelihood Estimation (method="mle"
)
The log likelihood function is given by:
$$L(\eta, \theta, \kappa) = -n \, log(\theta) - (1 - \kappa) \sum^n_{i=1} y_i - \sum^n_{i=1} e^{y_i}$$
where
$$y_i = -\frac{1}{\kappa} log[\frac{1 - \kappa(x_i - \eta)}{\theta}]$$
(see, for example, Jenkinson, 1969; Prescott and Walden, 1980; Prescott and Walden,
1983; Hosking, 1985; MacLeod, 1989). The maximum likelihood estimators (MLE's) of
\(\eta\), \(\theta\), and \(\kappa\) are those values that maximize the
likelihood function, subject to the following constraints:
$$\theta > 0$$
$$\kappa \le 1$$
$$x_i < \eta + \frac{\theta}{\kappa} \; if \kappa > 0$$
$$x_i > \eta + \frac{\theta}{\kappa} \; if \kappa < 0$$
Although in theory the value of \(\kappa\) may lie anywhere in the interval
\((-\infty, \infty)\) (see GEVD), the constraint \(\kappa \le 1\) is
imposed because when \(\kappa > 1\) the likelihood can be made infinite and
thus the MLE does not exist (Castillo and Hadi, 1994). Hence, this method of
estimation is not valid when the true value of \(\kappa\) is larger than 1.
Hosking (1985) and Hosking et al. (1985) note that in practice the value of
\(\kappa\) tends to lie in the interval \(-1/2 < \kappa < 1/2\).
The value of \(-L\) is minimized using the R function nlminb
.
Prescott and Walden (1983) give formulas for the gradient and Hessian. Only
the gradient is supplied in the call to nlminb
. The values of
the PWME (see below) are used as the starting values. If the starting value of
\(\kappa\) is less than 0.001 in absolute value, it is reset to
sign(k) * 0.001
, as suggested by Hosking (1985).
Probability-Weighted Moments Estimation (method="pwme"
)
The idea of probability-weighted moments was introduced by Greenwood et al. (1979).
Landwehr et al. (1979) derived probability-weighted moment estimators (PWME's) for
the parameters of the Type I (Gumbel) extreme value distribution.
Hosking et al. (1985) extended these results to the generalized extreme value
distribution. See the abstract for Hosking et al. (1985)
for details on how these estimators are computed.
Two-Stage Order Statistics Estimation (method="tsoe"
)
The two-stage order statistics estimator (TSOE) was introduced by
Castillo and Hadi (1994) as an alternative to the MLE and PWME. Unlike the
MLE and PWME, the TSOE of \(\kappa\) exists for all combinations of sample
values and possible values of \(\kappa\). See the
abstract for Castillo and Hadi (1994) for details
on how these estimators are computed. In the second stage,
Castillo and Hadi (1984) suggest using either the median or the least median of
squares as the robust function. The function egevd
allows three options
for the robust function: median (tsoe.method="med"
; see the R help file for
median
), least median of squares (tsoe.method="lms"
;
see the help file for lmsreg
in the package MASS),
and least trimmed squares (tsoe.method="lts"
; see the help file for
ltsreg
in the package MASS).
Confidence Intervals
When ci=TRUE
, an approximate \((1-\alpha)\)100% confidence intervals
for \(\eta\) can be constructed assuming the distribution of the estimator of
\(\eta\) is approximately normally distributed. A two-sided confidence
interval is constructed as:
$$[\hat{\eta} - t(n-1, 1-\alpha/2) \hat{\sigma}_{\hat{\eta}}, \, \hat{\eta} + t(n-1, 1-\alpha/2) \hat{\sigma}_{\hat{\eta}}]$$
where \(t(\nu, p)\) is the \(p\)'th quantile of Student's t-distribution with
\(\nu\) degrees of freedom, and the quantity
$$\hat{\sigma}_{\hat{\eta}}$$
denotes the estimated asymptotic standard deviation of the estimator of \(\eta\).
Similarly, a two-sided confidence interval for \(\theta\) is constructed as:
$$[\hat{\theta} - t(n-1, 1-\alpha/2) \hat{\sigma}_{\hat{\theta}}, \, \hat{\theta} + t(n-1, 1-\alpha/2) \hat{\sigma}_{\hat{\theta}}]$$
and a two-sided confidence interval for \(\kappa\) is constructed as:
$$[\hat{\kappa} - t(n-1, 1-\alpha/2) \hat{\sigma}_{\hat{\kappa}}, \, \hat{\kappa} + t(n-1, 1-\alpha/2) \hat{\sigma}_{\hat{\kappa}}]$$
One-sided confidence intervals for \(\eta\), \(\theta\), and \(\kappa\) are
computed in a similar fashion.
Maximum Likelihood Estimator (method="mle"
)
Prescott and Walden (1980) derive the elements of the Fisher information matrix
(the expected information). The inverse of this matrix, evaluated at the values
of the MLE, is the estimated asymptotic variance-covariance matrix of the MLE.
This method is used to estimate the standard deviations of the estimated
distribution parameters when information="expected"
. The necessary
regularity conditions hold for \(\kappa < 1/2\). Thus, this method of
constructing confidence intervals is not valid when the true value of
\(\kappa\) is greater than or equal to 1/2.
Prescott and Walden (1983) derive expressions for the observed information matrix
(i.e., the Hessian). This matrix is used to compute the estimated asymptotic
variance-covariance matrix of the MLE when information="observed"
.
In computer simulations, Prescott and Walden (1983) found that the
variance-covariance matrix based on the observed information gave slightly more
accurate estimates of the variance of MLE of \(\kappa\) compared to the
estimated variance based on the expected information.
Probability-Weighted Moments Estimator (method="pwme"
)
Hosking et al. (1985) show that these estimators are asymptotically multivariate
normal and derive the asymptotic variance-covariance matrix. See the
abstract for Hosking et al. (1985) for details on how
this matrix is computed.
Two-Stage Order Statistics Estimator (method="tsoe"
)
Currently there is no built-in method in EnvStats for computing confidence
intervals when
method="tsoe"
. Castillo and Hadi (1994) suggest
using the bootstrap or jackknife method.