ggplot.migpd: Fit multiple independent generalized Pareto models

Description

Fit multiple independent generalized Pareto models as the first step of conditional multivariate extreme values modelling following the approach of Heffernan and Tawn, 2004.

Usage

# S3 method for migpd
ggplot(
  data,
  mapping = NULL,
  main = c("Probability plot", "Quantile plot", "Return level plot",
    "Histogram and density"),
  xlab = rep(NULL, 4),
  nsim = 1000,
  alpha = 0.05,
  ...,
  environment
)
migpd(
  data,
  mth,
  mqu,
  penalty = "gaussian",
  maxit = 10000,
  trace = 0,
  verbose = FALSE,
  priorParameters = NULL,
  cov = "observed",
  family = gpd
)
# S3 method for migpd
plot(
  x,
  main = c("Probability plot", "Quantile plot", "Return level plot",
    "Histogram and density"),
  xlab = rep(NULL, 4),
  nsim = 1000,
  alpha = 0.05,
  ...
)

Value

An object of class "migpd". There are coef, print, plot, ggplot and summary functions available.

Arguments

data: A matrix or data.frame, each column of which is to be modelled.
mapping, environment: Further arguments to ggplot method.
main: Character vector of length four: titles for plots produced by plot and ggplot methods.
xlab: As main but for x-axes labels.
nsim: Number of simulations on which to base tolerance envelopes in plot and ggplot methods.
alpha: Significance level for tolerance and confidence intervals in plot and ggplot methods.
...: Further arguments to be passed to methods.
mth: Marginal thresholds. Thresholds above which to fit the models. Only one of mth and mqu should be supplied. Length one (in which case a common threshold is used) or length equal to the number of columns of data (in which case values correspond to thresholds for each of the columns respectively).
mqu: Marginal quantiles. Quantiles above which to fit the models. Only one of mth and mqu should be supplied. Length as for mth above.
penalty: How the likelihood should be penalized. Defaults to "gaussian". See documentation for evm.
maxit: The maximum number of iterations to be used by the optimizer.
trace: Whether or not to tell the user how the optimizer is getting on. The argument is passed into optim -- see the help for that function.
verbose: Controls whether or not the function prints to screen every time it fits a model. Defaults to FALSE.
priorParameters: Only used if penalty = 'gaussian'. A named list, each element of which contains two components: the first should be a vector of length 2 corresponding to the location of the Gaussian distribution; the second should be 2x2 matrix corresponding to the covariance matrix of the distribution. The names should match the names of the columns of data. If not provided, it defaults to independent priors being centred at zero, with variance 10000 for log(sigma) and 0.25 for xi. See the details section.
cov: String, passed through to evm: how to estimate the covariance. Defaults to cov = "observed".
family: An object of class "texmexFamily". Should be either family = gpd or family = cgpd and defaults to the first of those.
x: Object of class migpd as returned by function migpd.

Author

Harry Southworth

Details

The parameters in the generalized Pareto distribution are estimated for each column of the data in turn, independently of all other columns. Note, covariate modelling of GPD parameters is not supported.

Maximum likelihood estimation often fails with generalized Pareto distributions because of the likelihood becoming flat (see, for example, Hosking et al, 1985). Therefore the function allows penalized likelihood estimation, which is the same as maximum a posteriori estimation from a Bayesian point of view.

By default quadratic penalization is used, corresponding to using a Gaussian prior. If no genuine prior information is available, the following argument can be used. If xi = -1, the generalized Pareto distribution corresponds to the uniform distribution, and if xi is 1 or greater, the expectation is infinite. Thefore, xi is likely to fall in the region (-1, 1). A Gaussian distribution centred at zero and with standard deviation 0.5 will have little mass outside of (-1, 1) and so will often be a reasonable prior for xi. For log(sigma) a Gaussian distribution, centred at zero and with standard deviation 100 will often be vague. If a Gaussian penalty is specified but no parameters are given, the function will assume such indpendent priors.

Note that internally the function works with log(sigma), not sigma. The reasons are that quadratic penalization makes more sense for phi=log(sigma) than for sigma (because the distribution of log(sigma) will be more nearly symmetric), and because it was found to stabilize computations.

The associated coef, print and summary functions exponentiate the log(sigma) parameter to return results on the expected scale. If you are accessesing the parameters directly, however, take care to be sure what scale the results are on.

Threshold selection can be carried out with the help of functions mrl and gpdRangeFit.

References

J. E. Heffernan and J. A. Tawn, A conditional approach for multivariate extreme values, Journal of the Royal Statistical society B, 66, 497 -- 546, 2004

J. R. M. Hosking and J. R. Wallis, Parameter and quantile estimation for the generalized Pareto distribution, Technometrics, 29, 339 -- 349, 1987

Examples

Run this code


# \donttest{
mygpd <- migpd(winter, mqu=.7, penalty = "none")
mygpd
summary(mygpd)
plot(mygpd)
g <- ggplot(mygpd)
# }

Run the code above in your browser using DataLab