dfs_pick: Pickands frontier estimator

Description

This function implements the Pickands-type estimator developed by Daouia, Florens and Simar (2010).

Usage

dfs_pick(xtab, ytab, x, k, rho, ci=TRUE)

Value

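Returns a numeric vector with the same length as x. When ci=TRUE, the examples below index the result as a three-column matrix, with the frontier estimate in column 1 and the confidence bounds in columns 2 and 3.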

Arguments

xtab: a numeric vector containing the observed inputs $x_1,\ldots,x_n$.
ytab: a numeric vector of the same length as xtab containing the observed outputs $y_1,\ldots,y_n$.
x: a numeric vector of evaluation points at which the estimator is to be computed.
k: a scalar or a numeric vector of the same length as x, giving the threshold(s) $k$ at which the Pickands estimator is computed.
rho: a scalar or a numeric vector of the same length as x, giving the value(s) of the tail index $\rho_x$.
ci: a logical; if TRUE, the confidence interval is also computed.

Author

Abdelaati Daouia and Thibault Laurent (converted from Leopold Simar's Matlab code).

Details

Building on the ideas of Dekkers and de Haan (1989), Daouia et al. (2010) propose to estimate the frontier point $\varphi(x)$ by
$$\hat\varphi_{pick}(x) = \frac{z^x_{(n-k+1)} - z^x_{(n-2k+1)}}{2^{1/\rho_x} - 1} + z^x_{(n-k+1)}$$
from the transformed data $\{z^{x}_i, \,i=1,\ldots,n\}$ described in dfs_momt, where $\rho_x>0$ is the same tail index as in dfs_momt.

If $\rho_x$ is known (typically equal to 2 if the joint density of the data is believed to have sudden jumps at the frontier), one can use the estimator $\hat\varphi_{pick}(x)$ in conjunction with the function kopt_momt_pick, which implements an automatic data-driven method for selecting the threshold $k$.

If $\rho_x$ is unknown, one can instead use the following two-step estimator. First, estimate $\rho_x$ with the function rho_momt_pick, which returns the Pickands estimator $\hat\rho_x$ under the option method="pickands" and the moment estimator $\tilde\rho_x$ under the option method="moment". Second, use the estimator $\hat\varphi_{pick}(x)$ as if $\rho_x$ were known, substituting the estimated value $\hat\rho_x$ or $\tilde\rho_x$ for $\rho_x$.

The pointwise $95\%$ confidence interval of the frontier function, obtained from the asymptotic normality of $\hat\varphi_{pick}(x)$, is given by
$$[\hat\varphi_{pick}(x) \pm 1.96 \sqrt{v(\rho_x) / (2 k)} ( z^x_{(n-k+1)} - z^x_{(n-2k+1)})]$$
where $v(\rho_x) = \rho_x^{-2} 2^{-2/\rho_x}/(2^{-1/\rho_x} - 1)^4$.

Finally, to select the threshold $k=k_n(x)$, one can use the automatic data-driven method of Daouia et al. (2010) implemented in the function kopt_momt_pick (option method="pickands").
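
The two displayed formulas can be evaluated by hand for a single evaluation point, which may help when checking the output of dfs_pick. The sketch below is only illustrative and is not the package's implementation: pick_sketch is a hypothetical helper, and it assumes that the transformed data are $z^x_i = y_i 1\{x_i \le x\}$ (the transformation is documented in dfs_momt and is not restated on this page) and that $k$ and $\rho_x$ are supplied by the user.

```r
# Hypothetical helper, not part of the package: evaluates the displayed
# Pickands-type estimate and its 95% interval at one point x0.
pick_sketch <- function(xtab, ytab, x0, k, rho) {
  n <- length(ytab)
  stopifnot(length(xtab) == n, k >= 1, 2 * k <= n, rho > 0)

  # Assumed transformation (see dfs_momt): z_i = y_i * 1{x_i <= x0}
  z <- sort(ytab * (xtab <= x0))

  z_k  <- z[n - k + 1]        # order statistic z_(n-k+1)
  z_2k <- z[n - 2 * k + 1]    # order statistic z_(n-2k+1)
  spacing <- z_k - z_2k

  # Point estimate of the frontier at x0
  est <- spacing / (2^(1 / rho) - 1) + z_k

  # Asymptotic variance factor v(rho) and half-width of the 95% interval
  v <- rho^(-2) * 2^(-2 / rho) / (2^(-1 / rho) - 1)^4
  half <- 1.96 * sqrt(v / (2 * k)) * spacing

  c(estimate = est, lower = est - half, upper = est + half)
}

# Simulated data lying below the curve sqrt(x); k and rho chosen by hand
set.seed(1)
xs <- runif(200)
ys <- sqrt(xs) * runif(200)
pick_sketch(xs, ys, x0 = 0.5, k = 20, rho = 2)
```

As a hand check of the arithmetic: with outputs 1, 2, ..., 10, all inputs below the evaluation point, k = 2 and rho = 1, the two order statistics are 9 and 7, so the estimate is 2/(2 - 1) + 9 = 11; here v(1) = 4 and the half-width of the interval is 1.96 x 1 x 2 = 3.92.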

References

Daouia, A., Florens, J.P. and Simar, L. (2010). Frontier Estimation and Extreme Value Theory, Bernoulli, 16, 1039-1063.

Dekkers, A.L.M., Einmahl, J.H.J. and de Haan, L. (1989). A moment estimator for the index of an extreme-value distribution, Annals of Statistics, 17, 1833-1855.

Examples

library(npbr)
data("post")
x.post<- seq(post$xinput[100],max(post$xinput), 
 length.out=100) 
# 1. When rho[x] is known and equal to 2, we set:
rho<-2
# To determine the sample fraction k=k[n](x) 
# in hat(varphi[pick])(x).
best_kn.1<-kopt_momt_pick(post$xinput, post$yprod, 
 x.post, method="pickands", rho=rho)
# To compute the frontier estimates and confidence intervals:  
res.pick.1<-dfs_pick(post$xinput, post$yprod, x.post, 
 rho=rho, k=best_kn.1)
# Representation
plot(yprod~xinput, data=post, xlab="Quantity of labor", 
 ylab="Volume of delivered mail")
lines(x.post, res.pick.1[,1], lty=1, col="cyan")  
lines(x.post, res.pick.1[,2], lty=3, col="magenta")  
lines(x.post, res.pick.1[,3], lty=3, col="magenta")  

if (FALSE) {
# 2. rho[x] is unknown and estimated by 
# the Pickands estimator hat(rho[x])
rho_pick<-rho_momt_pick(post$xinput, post$yprod, 
 x.post, method="pickands")
best_kn.2<-kopt_momt_pick(post$xinput, post$yprod,
  x.post, method="pickands", rho=rho_pick)
res.pick.2<-dfs_pick(post$xinput, post$yprod, x.post, 
 rho=rho_pick, k=best_kn.2)  
# 3. rho[x] is unknown independent of x and estimated
# by the (trimmed) mean of hat(rho[x])
rho_trimmean<-mean(rho_pick, trim=0.00)
best_kn.3<-kopt_momt_pick(post$xinput, post$yprod,
  x.post, rho=rho_trimmean, method="pickands")   
res.pick.3<-dfs_pick(post$xinput, post$yprod, x.post, 
 rho=rho_trimmean, k=best_kn.3)  

# Representation 
plot(yprod~xinput, data=post, col="grey", xlab="Quantity of labor", 
 ylab="Volume of delivered mail")
lines(x.post, res.pick.2[,1], lty=1, lwd=2, col="cyan")  
lines(x.post, res.pick.2[,2], lty=3, lwd=4, col="magenta")  
lines(x.post, res.pick.2[,3], lty=3, lwd=4, col="magenta")  
plot(yprod~xinput, data=post, col="grey", xlab="Quantity of labor", 
 ylab="Volume of delivered mail")
lines(x.post, res.pick.3[,1], lty=1, lwd=2, col="cyan")  
lines(x.post, res.pick.3[,2], lty=3, lwd=4, col="magenta")  
lines(x.post, res.pick.3[,3], lty=3, lwd=4, col="magenta") 
}
