smoothtail-package: Smooth Estimation of GPD Shape Parameter

Description

Given independent and identically distributed observations $X_1 < \ldots < X_n$ from a Generalized Pareto distribution (GPD) with shape parameter $\gamma \in [-1,0]$, this package offers three methods to compute estimates of $\gamma$. The estimates are based on the principle of replacing the order statistics $X_{(1)}, \ldots, X_{(n)}$ of the sample by quantiles $\hat X_{(1)}, \ldots, \hat X_{(n)}$ of the distribution function $\hat F_n$ based on the log--concave density estimator $\hat f_n$. This procedure is justified by the fact that the GPD density is log--concave for $\gamma \in [-1,0]$.
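
The log--concavity of the GPD density can be checked directly. As a sketch, assume the GPD with location 0 and scale $\sigma > 0$ has density $f_{\gamma,\sigma}(x) = \sigma^{-1} (1 + \gamma x / \sigma)^{-1/\gamma - 1}$ on its support for $\gamma \neq 0$. This parametrization is an assumption made here for illustration; it is consistent with the endpoint $-1/\gamma$ used in the Examples for $\sigma = 1$. Then

$$\log f_{\gamma,\sigma}(x) = -\log \sigma - \Bigl(\frac{1}{\gamma} + 1\Bigr) \log\Bigl(1 + \frac{\gamma x}{\sigma}\Bigr), \qquad \frac{d^2}{dx^2} \log f_{\gamma,\sigma}(x) = \frac{\gamma (1 + \gamma)}{(\sigma + \gamma x)^2}.$$

The second derivative is nonpositive exactly when $\gamma (1 + \gamma) \le 0$, that is, when $\gamma \in [-1, 0]$. In the limiting case $\gamma = 0$ (the exponential distribution) the log-density is linear and hence also concave.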

Author

Kaspar Rufibach (maintainer), kaspar.rufibach@gmail.com,
http://www.kasparrufibach.ch

Samuel Mueller, samuel.muller@mq.edu.au

Kaspar Rufibach acknowledges support by the Swiss National Science Foundation SNF, http://www.snf.ch

Details

Package:	smoothtail
Type:	Package
Version:	2.0.5
Date:	2016-07-12
License:	GPL (>=2)

Use this package to estimate the shape parameter $\gamma$ of a Generalized Pareto Distribution (GPD). In extreme value theory, $\gamma$ is called the tail index. We offer three new estimators, all based on the fact that the density function of the GPD is log--concave if $\gamma \in [-1,0]$; see Mueller and Rufibach (2009). The functions for estimation of the tail index are:

pickands
falk
falkMVUE
generalizedPick

This package depends on the package logcondens for estimation of a log--concave density: all the above functions take as first argument a dlc object as generated by logConDens in logcondens.
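
To illustrate the kind of quantity these functions compute, here is a minimal base-R sketch of the classical Pickands estimator evaluated on raw order statistics. The helper `pickands_raw` is hypothetical and is not a smoothtail function; the index convention `r = floor(k / 4)` is an assumption, and the package's own functions (which also report a variant based on quantiles of $\hat F_n$) may differ in detail.

```r
# Classical Pickands estimator from raw order statistics (illustrative only).
# pickands_raw() is a hypothetical helper, not part of smoothtail.
pickands_raw <- function(x, k) {
  x <- sort(x)
  n <- length(x)
  r <- floor(k / 4)                 # assumed index convention
  stopifnot(r >= 1, 4 * r <= n)     # all three order statistics must exist
  num <- x[n - r + 1] - x[n - 2 * r + 1]
  den <- x[n - 2 * r + 1] - x[n - 4 * r + 1]
  log(num / den) / log(2)
}

# Sanity check on exact GPD quantiles (scale 1, assumed parametrization):
# the estimator then recovers gam up to floating-point error.
gam <- -0.75
n <- 20
p <- (1:n) / (n + 1)
x_exact <- ((1 - p)^(-gam) - 1) / gam
pickands_raw(x_exact, k = 8)
```

On a random sample the estimate fluctuates with `k`, which is why the Examples section plots the estimators against the number of order statistics used.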

Additionally, functions for density, distribution function, quantile function and random number generation for a GPD with location parameter 0, shape parameter $\gamma$ and scale parameter $\sigma$ are provided:

dgpd
pgpd
qgpd
rgpd.
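
As a rough illustration of what such functions compute, the following base-R sketch writes out a GPD distribution function and quantile function with location 0. The helpers `gpd_cdf` and `gpd_quantile` are hypothetical names, and the parametrization $1 - (1 + \gamma x / \sigma)^{-1/\gamma}$ is an assumption that matches the endpoint $-1/\gamma$ used in the Examples; the package's own `pgpd` and `qgpd` should be preferred in practice.

```r
# Hypothetical helpers, not the package's pgpd/qgpd implementations.
# Assumed parametrization: F(x) = 1 - (1 + gam * x / sigma)^(-1 / gam), x >= 0.
gpd_cdf <- function(q, gam, sigma = 1) {
  q <- pmax(q, 0)                              # F(q) = 0 below the location 0
  if (gam == 0) return(1 - exp(-q / sigma))    # exponential limit
  # pmax(., 0) makes F equal to 1 beyond the finite endpoint when gam < 0
  1 - pmax(1 + gam * q / sigma, 0)^(-1 / gam)
}

gpd_quantile <- function(p, gam, sigma = 1) {
  if (gam == 0) return(-sigma * log(1 - p))
  sigma * ((1 - p)^(-gam) - 1) / gam
}

# Round trip: the quantile function inverts the distribution function
p <- c(0.1, 0.5, 0.9)
gpd_cdf(gpd_quantile(p, gam = -0.75), gam = -0.75)
```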

Let us briefly clarify what we mean by log--concave density estimation. Suppose we are given an ordered sample $Y_1 < \ldots < Y_n$ of i.i.d. random variables having density function $f$, where $f = \exp \varphi$ for a concave function $\varphi : R \to [-\infty, \infty)$. Following the development in Duembgen and Rufibach (2009), it is then possible to obtain an estimator $\hat f_n = \exp \hat \varphi_n$ of $f$ via the maximizer $\hat \varphi_n$ of

$$L(\varphi) = \sum_{i=1}^n \varphi(Y_i) - \int \exp \varphi (t) d t$$

over all concave functions $\varphi$. It turns out that $\hat \varphi_n$ is piecewise linear, with knots only at (some of the) observation points. Therefore, the infinite-dimensional optimization problem of finding the function $\hat \varphi_n$ reduces to the finite-dimensional problem of finding the vector $(\hat \varphi_n(Y_1),\ldots,\hat \varphi_n(Y_n))$. How to solve this problem is described in Rufibach (2006, 2007) and, in a more general setting, in Duembgen, Huesler, and Rufibach (2010). The distribution function based on $\hat f_n$ is defined as

$$\hat F_n(x) = \int_{Y_1}^x \hat f_n(t) d t$$

for $x$ a real number. The definition of $\hat F_n$ is justified by the fact that $\hat F_n(Y_1) = 0$.
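
Because $\hat \varphi_n$ is piecewise linear, the integral defining $\hat F_n$ has a closed form on each interval between observation points. The base-R sketch below shows this calculation for given values of a piecewise linear log-density at sorted points. The helper `cdf_from_logdensity` is hypothetical and is not how logcondens is implemented; in practice the `Fhat` component of the dlc object returned by logConDens provides these values.

```r
# cdf_from_logdensity() is a hypothetical helper, not part of logcondens.
# y:   strictly increasing points Y_1 < ... < Y_n
# phi: values of a piecewise linear log-density at those points
cdf_from_logdensity <- function(y, phi) {
  stopifnot(length(y) == length(phi), !is.unsorted(y, strictly = TRUE))
  dy   <- diff(y)
  dphi <- diff(phi)
  lo   <- exp(phi[-length(phi)])       # density at the left end of each interval
  flat <- abs(dphi) < 1e-12            # intervals where the log-density is constant
  # integral of exp(linear function) over each interval
  piece <- ifelse(flat, dy * lo, dy * lo * expm1(dphi) / dphi)
  c(0, cumsum(piece))                  # F(Y_1) = 0, then cumulative sums
}

# Uniform density on [0, 2]: the log-density is constant at log(1/2)
cdf_from_logdensity(c(0, 1, 2), rep(log(0.5), 3))
```

The first returned value is 0 by construction, matching $\hat F_n(Y_1) = 0$, and the last value equals 1 whenever the supplied log-density integrates to one over $[Y_1, Y_n]$.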

References

Duembgen, L. and Rufibach, K. (2009) Maximum likelihood estimation of a log--concave density and its distribution function: basic properties and uniform consistency. Bernoulli, 15(1), 40--68.

Duembgen, L., Huesler, A. and Rufibach, K. (2010) Active set and EM algorithms for log-concave densities based on complete and censored data. Technical report 61, IMSV, Univ. of Bern, available at http://arxiv.org/abs/0707.4643.

Mueller, S. and Rufibach, K. (2009) Smooth tail index estimation. J. Stat. Comput. Simul., 79, 1155--1167.

Mueller, S. and Rufibach, K. (2008) On the max--domain of attraction of distributions with log--concave densities. Statist. Probab. Lett., 78, 1440--1444.

Rufibach, K. (2006) Log-concave Density Estimation and Bump Hunting for i.i.d. Observations. PhD Thesis, University of Bern, Switzerland and Georg-August University of Goettingen, Germany. Available at https://biblio.unibe.ch/download/eldiss/06rufibach_k.pdf.

Rufibach, K. (2007) Computing maximum likelihood estimators of a log-concave density function. J. Stat. Comput. Simul., 77, 561--574.

Examples

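# load the packages (logConDens comes from logcondens)
library(logcondens)
library(smoothtail)

# generate ordered random sample from GPD
set.seed(1977)
n <- 20
gam <- -0.75
x <- sort(rgpd(n, gam))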

# compute known endpoint
omega <- -1 / gam

# estimate log-concave density, i.e. generate dlc object
est <- logConDens(x, smoothed = FALSE, print = FALSE, gam = NULL, xs = NULL)

# plot distribution functions
s <- seq(0.01, max(x), by = 0.01)
plot(0, 0, type = 'n', ylim = c(0, 1), xlim = range(c(x, s))); rug(x)
lines(s, pgpd(s, gam), type = 'l', col = 2)
lines(x, 1:n / n, type = 's', col = 3)
lines(x, est$Fhat, type = 'l', col = 4)
legend(1, 0.4, c('true', 'empirical', 'estimated'), col = c(2 : 4), lty = 1)

# compute tail index estimators for all sensible indices k
falk.logcon <- falk(est)
falkMVUE.logcon <- falkMVUE(est, omega)
pick.logcon <- pickands(est)
genPick.logcon <- generalizedPick(est, c = 0.75, gam0 = -1/3)

# plot smoothed and unsmoothed estimators versus number of order statistics
plot(0, 0, type = 'n', xlim = c(0,n), ylim = c(-1, 0.2))
lines(1:n, pick.logcon[, 2], col = 1)
lines(1:n, pick.logcon[, 3], col = 1, lty = 2)
lines(1:n, falk.logcon[, 2], col = 2)
lines(1:n, falk.logcon[, 3], col = 2, lty = 2)
lines(1:n, falkMVUE.logcon[, 2], col = 3)
lines(1:n, falkMVUE.logcon[, 3], col = 3, lty = 2)
lines(1:n, genPick.logcon[, 2], col = 4)
lines(1:n, genPick.logcon[, 3], col = 4, lty = 2)
abline(h = gam, lty = 3)
legend(11, 0.2, c("Pickands", "Falk", "Falk MVUE", "Generalized Pickands"), 
    lty = 1, col = 1:4)
