An implementation of the procedure proposed in Caeiro & Gomes (2014) for selecting the optimal sample fraction in tail index estimation.
Himp(data, B = 1000, epsilon = 0.955)
data: vector of sample data
B: number of bootstrap replications
epsilon: determines the first resampling size n1 via n1 = n^epsilon. Default is epsilon = 0.955.
an estimate of the second order parameter rho
the optimal number of upper order statistics, i.e. the number of exceedances or data points in the tail
the corresponding threshold
the corresponding tail index
This procedure improves on the one introduced in Hall (1990): it overcomes that method's restrictive assumptions by estimating the necessary parameters. The bootstrap procedure simulates the asymptotic mean squared error (AMSE) criterion of the Hill estimator using an auxiliary statistic. Minimizing this statistic gives a consistent estimator of the sample fraction k/n, with k the optimal number of upper order statistics. This number, denoted k0 here, is the number of extreme values or, equivalently, the number of exceedances in a peaks-over-threshold (POT) model such as the generalized Pareto distribution (GPD). k0 can then be associated with the unknown threshold u of the GPD by choosing u as the (n-k0)th order statistic of the sample sorted in ascending order, i.e. the (k0+1)th largest observation. For more information see the references.
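To make the link between k0, the threshold u and the Hill estimator concrete, here is a minimal sketch in base R. It is not the package's implementation and does no bootstrap: hill_at_k is a hypothetical helper, the value k = 500 is picked by hand rather than selected by Himp, and rounding n^epsilon down with floor() is an assumption. The sketch reports both gamma and alpha = 1/gamma, since the convention used for the returned tail index is not stated here.

```r
# Hypothetical helper (not part of the package): Hill estimate and the
# matching threshold for a given number k of upper order statistics.
hill_at_k <- function(data, k) {
  x <- sort(data)                      # ascending order statistics
  n <- length(x)
  stopifnot(k >= 1, k < n, x[n - k] > 0)
  u <- x[n - k]                        # threshold: the (n - k)th order statistic
  # Hill estimator: mean log-excess of the k largest values over u
  gamma <- mean(log(x[(n - k + 1):n])) - log(u)
  list(k = k, threshold = u, gamma = gamma, alpha = 1 / gamma)
}

set.seed(1)
x <- runif(5000)^(-1 / 2)              # exact Pareto sample with alpha = 2

res <- hill_at_k(x, k = 500)           # k chosen by hand for illustration
print(res)

# First resampling size for the default epsilon (floor() is an assumption)
n1 <- floor(length(x)^0.955)
print(n1)
```

For an exact Pareto sample the estimate is close to the true value for most choices of k. On real data the estimate depends strongly on k, which is why a data-driven choice of k0 is needed.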
Hall, P. (1990). Using the Bootstrap to Estimate Mean Squared Error and Select Smoothing Parameter in Nonparametric Problems. Journal of Multivariate Analysis, 32, 177–203.
Caeiro, F. and Gomes, M.I. (2014). On the bootstrap methodology for the estimation of the tail sample fraction. Proceedings of COMPSTAT, 545–552.
# NOT RUN {
data(danish)
Himp(danish)
# }