Fit multiple independent generalized Pareto models as the first step of conditional multivariate extreme values modelling following the approach of Heffernan and Tawn, 2004.
# S3 method for migpd
ggplot(
data,
mapping = NULL,
main = c("Probability plot", "Quantile plot", "Return level plot",
"Histogram and density"),
xlab = rep(NULL, 4),
nsim = 1000,
alpha = 0.05,
...,
environment
)
migpd(
data,
mth,
mqu,
penalty = "gaussian",
maxit = 10000,
trace = 0,
verbose = FALSE,
priorParameters = NULL,
cov = "observed",
family = gpd
)
# S3 method for migpd
plot(
x,
main = c("Probability plot", "Quantile plot", "Return level plot",
"Histogram and density"),
xlab = rep(NULL, 4),
nsim = 1000,
alpha = 0.05,
...
)
An object of class "migpd". There are coef, print, plot, ggplot and summary functions available.
A matrix or data.frame, each column of which is to be modelled.
Further arguments to ggplot method.
Character vector of length four: titles for plots produced by plot and ggplot methods.
As main, but for x-axis labels.
Number of simulations on which to base tolerance envelopes in plot and ggplot methods.
Significance level for tolerance and confidence intervals in plot and ggplot methods.
Further arguments to be passed to methods.
Marginal thresholds. Thresholds above which to fit the models. Only one of mth and mqu should be supplied. Length one (in which case a common threshold is used) or length equal to the number of columns of data (in which case values correspond to thresholds for each of the columns respectively).
Marginal quantiles. Quantiles above which to fit the models. Only one of mth and mqu should be supplied. Length as for mth above.
How the likelihood should be penalized. Defaults to "gaussian". See documentation for evm.
The maximum number of iterations to be used by the optimizer.
Whether or not to report the optimizer's progress to the user. The argument is passed to optim; see the help for that function.
Controls whether or not the function prints to screen every time it fits a model. Defaults to FALSE.
Only used if penalty = 'gaussian'. A named list, each element of which contains two components: the first should be a vector of length 2 corresponding to the location of the Gaussian distribution; the second should be a 2x2 matrix corresponding to the covariance matrix of the distribution. The names should match the names of the columns of data. If not provided, it defaults to independent priors centred at zero, with variance 10000 for log(sigma) and 0.25 for xi. See the details section.
String, passed through to evm: how to estimate the covariance. Defaults to cov = "observed".
An object of class "texmexFamily". Should be either family = gpd or family = cgpd and defaults to the first of those.
Object of class migpd as returned by function migpd.
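The relationship between the mqu and mth arguments described above can be illustrated in base R. The sketch below reflects only the general idea (each column's threshold is that column's empirical quantile) and is not a copy of migpd's internals; the data frame dat and its columns a and b are hypothetical.

```r
set.seed(42)
# Hypothetical data: two columns, each to be modelled separately
dat <- data.frame(a = rexp(500), b = rexp(500, rate = 0.5))

# A single quantile level is recycled across all columns
mqu <- rep(0.7, length.out = ncol(dat))

# Per-column thresholds implied by the quantile levels
mth <- mapply(function(x, p) quantile(x, probs = p, names = FALSE), dat, mqu)

# Exceedances over each column's own threshold
exceed <- Map(function(x, u) x[x > u] - u, dat, mth)
n_exceed <- sapply(exceed, length)

print(mth)
print(n_exceed)
```

Supplying a vector of length two for mqu instead of a single value would give each column its own quantile level.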
Harry Southworth
The parameters of the generalized Pareto distribution are estimated for each column of the data in turn, independently of all other columns. Note that covariate modelling of GPD parameters is not supported.
Maximum likelihood estimation often fails with generalized Pareto distributions because the likelihood becomes flat (see, for example, Hosking and Wallis, 1987). Therefore the function allows penalized likelihood estimation, which is equivalent to maximum a posteriori estimation from a Bayesian point of view.
By default, quadratic penalization is used, corresponding to a Gaussian prior. If no genuine prior information is available, the following argument can be used. If xi = -1, the generalized Pareto distribution corresponds to the uniform distribution, and if xi is 1 or greater, the expectation is infinite. Therefore, xi is likely to fall in the region (-1, 1). A Gaussian distribution centred at zero with standard deviation 0.5 will have little mass outside (-1, 1) and so will often be a reasonable prior for xi. For log(sigma), a Gaussian distribution centred at zero with standard deviation 100 will often be vague. If a Gaussian penalty is specified but no parameters are given, the function will assume such independent priors.
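The claim about the prior for xi can be checked numerically, and the same numbers can be arranged in the list structure that priorParameters expects. This is a minimal sketch in base R; the column names a and b are hypothetical and would need to match the columns of the data actually being modelled.

```r
# Prior mass that a Gaussian with mean 0 and sd 0.5 places on (-1, 1)
mass_xi <- pnorm(1, mean = 0, sd = 0.5) - pnorm(-1, mean = 0, sd = 0.5)
print(mass_xi)  # roughly 0.95

# Prior for one margin: mean vector and covariance matrix for
# (log(sigma), xi), with variances 100^2 and 0.5^2 and no correlation
default_prior <- list(c(0, 0), diag(c(100^2, 0.5^2)))

# Hypothetical column names; these must match the columns of the data
cols <- c("a", "b")
pp <- setNames(rep(list(default_prior), length(cols)), cols)
str(pp)
```

A list such as pp could then be supplied as priorParameters = pp together with penalty = "gaussian"; changing the diagonal entries tightens or loosens the prior for an individual margin.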
Note that internally the function works with log(sigma), not sigma. The reasons are that quadratic penalization makes more sense for phi=log(sigma) than for sigma (because the distribution of log(sigma) will be more nearly symmetric), and because it was found to stabilize computations.
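To make the idea of penalized likelihood on the (log(sigma), xi) scale concrete, here is a minimal base R sketch. It is not the texmex implementation: the function pen_nll, the simulated data, and the use of optim with its default method are all assumptions made for illustration. The penalty is the negative log density of a Gaussian prior, up to a constant.

```r
set.seed(1)
# Simulate GPD exceedances by inverting the distribution function
sigma_true <- 2
xi_true <- 0.1
u <- runif(200)
y <- sigma_true / xi_true * ((1 - u)^(-xi_true) - 1)

# Penalized negative log-likelihood in theta = (log(sigma), xi)
pen_nll <- function(theta, y, mu = c(0, 0), Sigma = diag(c(100^2, 0.25))) {
  phi <- theta[1]
  xi <- theta[2]
  z <- 1 + xi * y / exp(phi)
  if (any(z <= 0)) return(1e10)  # outside the support of the GPD
  nll <- if (abs(xi) < 1e-8) {
    length(y) * phi + sum(y) / exp(phi)  # exponential limit as xi -> 0
  } else {
    length(y) * phi + (1 / xi + 1) * sum(log(z))
  }
  # Quadratic penalty, i.e. a Gaussian prior on theta
  nll + 0.5 * mahalanobis(theta, mu, Sigma)
}

fit <- optim(c(log(mean(y)), 0.1), pen_nll, y = y)

# Back-transform log(sigma) to report sigma on its natural scale
est <- c(sigma = exp(fit$par[1]), xi = fit$par[2])
print(est)
```

Optimizing over log(sigma) keeps sigma positive without constraints, and the final exponentiation mirrors the need to be clear about which scale a reported parameter is on.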
The associated coef, print and summary functions exponentiate the log(sigma) parameter to return results on the expected scale. If you are accessing the parameters directly, however, take care to be sure what scale the results are on.
Threshold selection can be carried out with the help of functions mrl and gpdRangeFit.
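A mean residual life plot is built from the mean excess over a range of candidate thresholds; when exceedances follow a generalized Pareto distribution, the mean excess is linear in the threshold. The base R sketch below computes that quantity directly for hypothetical simulated data. It illustrates the concept only and does not reproduce the output or interface of mrl.

```r
set.seed(7)
# Hypothetical data; the exponential is the GPD with xi = 0,
# so its mean excess should be roughly constant in the threshold
x <- rexp(2000)

# Candidate thresholds at a grid of empirical quantiles
us <- quantile(x, probs = seq(0.5, 0.95, by = 0.05), names = FALSE)

# Mean excess above each threshold
mean_excess <- sapply(us, function(u) mean(x[x > u] - u))

mrl_table <- data.frame(threshold = us, mean_excess = mean_excess)
print(mrl_table)
```

In practice one looks for the lowest threshold above which the mean excess appears approximately linear, bearing in mind that estimates at high thresholds rest on few observations.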
J. E. Heffernan and J. A. Tawn, A conditional approach for multivariate extreme values, Journal of the Royal Statistical Society B, 66, 497-546, 2004
J. R. M. Hosking and J. R. Wallis, Parameter and quantile estimation for the generalized Pareto distribution, Technometrics, 29, 339-349, 1987
mex, mexDependence, bootmex, predict.mex, gpdRangeFit, mrl
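library(texmex)  # provides migpd and the winter data set
# Fit a GPD to each column above its 70% quantile, without penalization
mygpd <- migpd(winter, mqu = 0.7, penalty = "none")
mygpd
summary(mygpd)
plot(mygpd)
g <- ggplot(mygpd)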