Produces the Hill, AltHill, SmooHill and AltSmooHill plots,
including confidence intervals.
For an ordered iid sequence \(X_{(1)}\ge X_{(2)}\ge\cdots\ge X_{(n)} > 0\)
the Hill (1975) estimator using \(k\) order statistics is given by
$$H_{k,n}=\frac{1}{k}\sum_{i=1}^{k} \log\left(\frac{X_{(i)}}{X_{(k+1)}}\right)$$
which is the pseudo-likelihood estimator of the reciprocal of the tail index \(\xi=1/\alpha>0\)
for regularly varying tails (e.g. Pareto distribution). The Hill estimator
is defined on orders \(k>2\), as when \(k=1\) the estimator is \(H_{1,n}=0\). The
function will calculate the Hill estimator for \(k\ge 1\).
The simple Hill plot is shown for hill.type="Hill".
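As an illustration of the formula above, the following base R sketch computes \(H_{k,n}\) for every order \(k=1,\ldots,n-1\) in one pass. The helper name hill_estimates is hypothetical and is not part of any package; the simulated Pareto sample is only there to give the code something to run on.

```r
# Hypothetical helper (illustration only): Hill estimates for k = 1, ..., n - 1
hill_estimates <- function(x) {
  xs <- sort(x, decreasing = TRUE)   # X_(1) >= X_(2) >= ... >= X_(n)
  n <- length(xs)
  logx <- log(xs)
  k <- seq_len(n - 1)
  # mean of the k largest log-values minus the log of the (k+1)-th largest
  cumsum(logx[k]) / k - logx[k + 1]
}

set.seed(1)
x <- 1 / runif(500)^0.5   # simulated Pareto sample, shape xi = 0.5 (alpha = 2)
H <- hill_estimates(x)
H[c(10, 50, 200)]         # estimates at three different orders
```

Using a cumulative sum avoids recomputing the mean of the log-ratios for each \(k\), which matters when the estimator is wanted at every order.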
Once a sufficiently low order statistic is reached the Hill estimator will
be constant, up to sample uncertainty, for regularly varying tails. The Hill
plot is a plot of \(H_{k,n}\) against the order \(k\). Symmetric asymptotic
normal confidence intervals assuming Pareto tails are provided.
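One common way to build such an interval uses the standard result that, for exact Pareto tails, \(\sqrt{k}(H_{k,n}-\xi)\) is approximately normal with variance \(\xi^2\). The sketch below plugs the estimate into that variance. It is an assumption that this matches the intervals drawn by the function, and the helper name hill_ci and its level argument are hypothetical.

```r
# Hypothetical helper (illustration only): symmetric normal-approximation
# interval for a Hill estimate H based on k order statistics
hill_ci <- function(H, k, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)   # two-sided normal quantile
  se <- H / sqrt(k)                 # plug-in standard error, assuming Pareto tails
  cbind(lower = H - z * se, estimate = H, upper = H + z * se)
}

hill_ci(H = 0.5, k = 100)
```

The interval narrows at rate \(1/\sqrt{k}\), which is why the bands on a Hill plot are widest at the smallest orders.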
These so-called Hill's horror plots can be difficult to interpret. A smooth
form of the Hill estimator was suggested by Resnick and Starica (1997):
$$\mathrm{smooH}_{k,n}=\frac{1}{(r-1)k}\sum_{j=k+1}^{rk} H_{j,n}$$
giving the smooHill plot, which is shown for hill.type="SmooHill". The smoothing
factor is r=2 by default.
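The smoothing step is a moving average of the Hill estimates over orders \(k+1,\ldots,rk\). A minimal base R sketch, assuming an integer smoothing factor and a vector H whose j-th element is \(H_{j,n}\), could look like this. The helper name smoo_hill and the input values are made up for illustration.

```r
# Hypothetical helper (illustration only): smoothed Hill estimates from a
# vector H, where H[j] holds H_{j,n}; r is assumed to be an integer >= 2
smoo_hill <- function(H, r = 2) {
  kmax <- floor(length(H) / r)   # largest k for which H[r * k] is available
  cs <- c(0, cumsum(H))          # cs[j + 1] = H[1] + ... + H[j]
  k <- seq_len(kmax)
  # average of H[k + 1], ..., H[r * k]
  (cs[r * k + 1] - cs[k + 1]) / ((r - 1) * k)
}

H <- c(0.9, 0.7, 0.6, 0.55, 0.5, 0.52, 0.48, 0.5)   # made-up Hill estimates
smoo_hill(H, r = 2)
```

Because each smoothed value needs estimates up to order \(rk\), the smoothed series is shorter than the original by roughly a factor of \(r\).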
It has also been suggested to plot the order on a log scale, by plotting
the points \((\theta, H_{\lceil n^\theta\rceil, n})\) for
\(0\le \theta \le 1\). This gives the so-called AltHill and AltSmooHill
plots. The alternative x-axis scale is chosen by x.theta=TRUE.
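To make the alternative scale concrete, the sketch below maps a grid of \(\theta\) values to the orders \(k=\lceil n^\theta\rceil\) at which the estimator would be read off. The helper name alt_orders and the grid spacing are hypothetical choices, and dropping orders above \(n-1\) is an assumption made here because \(H_{k,n}\) needs \(X_{(k+1)}\).

```r
# Hypothetical helper (illustration only): orders used on the theta scale
alt_orders <- function(n, theta = seq(0, 1, by = 0.01)) {
  k <- ceiling(n^theta)
  keep <- k <= n - 1   # H_{k,n} needs X_(k+1), so drop orders above n - 1
  data.frame(theta = theta[keep], k = k[keep])
}

head(alt_orders(1000))
```

Equal steps in \(\theta\) correspond to multiplicative steps in \(k\), so the small orders, where the estimator changes fastest, take up more of the axis.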
The Hill estimator estimates the GPD shape \(\xi>0\), which is the reciprocal of the
tail index \(\alpha=1/\xi>0\). The shape is plotted by default using
y.alpha=FALSE and the tail index is plotted when y.alpha=TRUE.
A pre-chosen threshold (or more than one) can be given in
try.thresh. The estimated parameter (\(\xi\) or \(\alpha\)) at
each threshold is plotted as a horizontal solid line for all higher thresholds.
The threshold should be set as low as possible, so a dashed line is shown
below the pre-chosen threshold. If the Hill estimator is similar to the
dashed line then a lower threshold may be chosen.
If no order statistic (or threshold) limits are provided (orderlim =
tlim = NULL) then the lowest order statistic is set to \(X_{(3)}\) and the
highest to the largest possible value \(X_{(n-1)}\). However, the Hill estimator is always
output for all \(k=1, \ldots, n-1\), and for \(k=1, \ldots, \lfloor n/r \rfloor\) for the
smooHill estimator.
The missing (NA and NaN) and non-finite values are ignored.
Non-positive data are ignored.
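The same filtering can be reproduced before inspecting the data by hand. This is a minimal sketch with a made-up input vector; how the function performs the filtering internally is not stated here.

```r
x <- c(2.5, NA, -1, 0, Inf, NaN, 7.1, 3.3)   # made-up input
# is.finite() is FALSE for NA, NaN, Inf and -Inf, so one condition covers
# both missing and non-finite values; x > 0 removes non-positive data
x_clean <- x[is.finite(x) & x > 0]
x_clean
```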
The lower x-axis is the order \(k\) or \(\theta\), chosen by the option
x.theta=FALSE and x.theta=TRUE respectively. The upper axis
is for the corresponding threshold.