egp3RangeFit: Estimate the EGP3 distribution power parameter over a range of thresholds

Description

Estimate the power parameter of the extended generalized Pareto distribution (EGP3) over a range of thresholds, using maximum (penalized) likelihood.

Usage

egp3RangeFit(data, umin = quantile(data, .05), umax = quantile(data, .95),
  nint = 10, penalty = "gaussian", priorParameters = NULL, alpha = 0.05)
# S3 method for egp3RangeFit
print(x, ...)
# S3 method for egp3RangeFit
plot(x, xlab = "Threshold", ylab = "kappa", main = NULL, addNexcesses = TRUE, log. = "", ...)
# S3 method for egp3RangeFit
ggplot(data, mapping, xlab = "Threshold", ylab = expression(kappa),
  main = NULL, fill = "orange", col = "blue", addNexcesses = TRUE, textsize = 4, ..., environment)

Arguments

data: The data vector to be modelled.
umin: The minimum threshold above which to estimate the parameters.
umax: The maximum threshold above which to estimate the parameters.
nint: The number of thresholds at which to perform the estimation.
penalty: The type of penalty to be used in the maximum penalized likelihood estimation. Should be either "gaussian" or "none". Defaults to "gaussian".
priorParameters: Parameters to be used for the penalty function. See the help for evm for more information.
alpha: 100(1 - alpha)% confidence intervals will be plotted with the point estimates. Defaults to alpha = 0.05.
x: Argument to the print and plot methods: an object of class egp3RangeFit.
xlab: Label for the x-axis.
ylab: Label for the y-axis.
main: The main title.
textsize: Size of text for annotation showing number of threshold excesses.
addNexcesses: Annotate top axis with numbers of threshold excesses arising with the corresponding values of threshold on the bottom axis.
log.: Argument passed through to plot. Can take values "x" for plotting the x-axis on the log scale, "y" for plotting the y-axis on the log scale, "xy" for both, or "" (the default) for neither.
mapping, fill, col, environment: Arguments to ggplot method.
...: Arguments to plot.
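
As a rough illustration of how umin, umax and nint interact, the sketch below builds a grid of thresholds and counts the excesses above each one, which is the kind of count that addNexcesses annotates on the plot. It assumes the nint thresholds are equally spaced between umin and umax, which this page does not state. The simulated data and the names x, thresholds and n_excess are purely illustrative and are not part of the package.

```r
# Illustrative data only: 1000 simulated exponential values
set.seed(1)
x <- rexp(1000)

# Defaults documented above: 5% and 95% sample quantiles, 10 thresholds
umin <- unname(quantile(x, 0.05))
umax <- unname(quantile(x, 0.95))
nint <- 10

# ASSUMPTION: thresholds are equally spaced between umin and umax
thresholds <- seq(umin, umax, length.out = nint)

# Number of observations strictly above each threshold
n_excess <- vapply(thresholds, function(u) sum(x > u), integer(1))

data.frame(threshold = thresholds, n_excess = n_excess)
```

The number of excesses falls as the threshold rises, so estimates at the highest thresholds rest on the least data.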

Author

Harry Southworth

Details

Papastathopoulos and Tawn (2013) present three extended versions of the generalized Pareto distribution. Using the egp3 texmex family object, the power parameter of the EGP3 distribution is estimated on the log scale and a confidence interval is calculated on that scale. The estimate and interval are then transformed back to the scale of the power parameter and returned to the user.
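
The back-transformation described above can be sketched as follows. This is a minimal illustration, assuming a Wald-type (normal approximation) interval on the log scale; the page does not say which interval is used. The numeric values and variable names are hypothetical and are not output from any fit.

```r
# Hypothetical values, for illustration only (not output from any fit)
log_kappa_hat <- 0.20   # estimate of log(kappa)
se_log_kappa  <- 0.15   # its standard error
alpha <- 0.05

# ASSUMPTION: a Wald-type (normal approximation) interval on the log scale
z <- qnorm(1 - alpha / 2)
log_ci <- log_kappa_hat + c(-1, 1) * z * se_log_kappa

# Back-transform the estimate and interval to the scale of kappa
kappa_hat <- exp(log_kappa_hat)
kappa_ci  <- exp(log_ci)

c(lower = kappa_ci[1], estimate = kappa_hat, upper = kappa_ci[2])
```

Because the interval is built on the log scale, its back-transformed limits are always positive and are not symmetric about the point estimate.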

When the power parameter, kappa, is equal to 1, the EGP3 distribution is identical to the generalized Pareto distribution. Therefore, the plot of the estimated parameter over a range of thresholds provides a diagnostic for threshold selection: the lowest threshold at which the confidence interval for kappa includes 1 is suggested as the threshold for generalized Pareto modelling.
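
The selection rule can be written down in a few lines. The sketch below works on plain numeric vectors rather than on the fitted object, because this page does not document the object's internal structure. The helper name lowest_threshold_covering_one and the values in th, lo and hi are hypothetical and made up for illustration.

```r
# Hypothetical helper: return the lowest threshold whose interval for kappa
# contains 1, or NA if no interval does
lowest_threshold_covering_one <- function(th, lo, hi) {
  stopifnot(length(th) == length(lo), length(th) == length(hi))
  covers <- lo <= 1 & hi >= 1
  if (!any(covers)) return(NA_real_)
  min(th[covers])
}

# Made-up values for illustration only (not results from real data)
th <- c(5, 10, 15, 20, 25)        # thresholds
lo <- c(1.8, 1.3, 0.9, 0.7, 0.5)  # lower confidence limits for kappa
hi <- c(2.9, 2.2, 1.7, 1.6, 1.9)  # upper confidence limits for kappa

lowest_threshold_covering_one(th, lo, hi)
```

A rule like this only automates reading the plot; it is still worth checking that the intervals at higher thresholds also include 1.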

If lower thresholds are used and the EGP3 distribution itself is used for modelling, some care should be taken to ensure the model provides a reasonable degree of fit to the data. Limited experience suggests that such models seldom fit well and the main value of the EGP3 distribution is as a diagnostic for threshold selection as described here.

Note this function does not extend to assessing model fit when there are covariates included in the model.

References

I. Papastathopoulos and J. A. Tawn, Extended generalized Pareto models for tail estimation, Journal of Statistical Planning and Inference, 143, 131-143, 2013.

Examples

# Note: this example takes some time to run
library(texmex)  # provides egp3RangeFit and the rain data
erf <- egp3RangeFit(rain)
plot(erf)
ggplot(erf)
