Fit the conditional multivariate extreme value model of Heffernan and Tawn
# S3 method for mex
ggplot(
data = NULL,
mapping,
ptcol = "blue",
col = "cornflowerblue",
fill = "orange",
plot. = TRUE,
quantiles = seq(0.1, by = 0.2, len = 5),
...,
environment
)mex(
data,
which,
mth,
mqu,
dqu,
cov = "numeric",
family = gpd,
margins = "laplace",
constrain = TRUE,
v = 10,
penalty = "gaussian",
maxit = 10000,
trace = 0,
verbose = FALSE,
priorParameters = NULL
)
mexAll(data, mqu, dqu)
# S3 method for mexList
print(x, ...)
# S3 method for mex
plot(x, quantiles = seq(0.1, by = 0.2, len = 5), col = "grey", ...)
# S3 method for predict.mex
plot(x, pch = c(1, 3, 20), col = c(2, 8, 3), cex = c(1, 1, 1), ask = TRUE, ...)
# S3 method for predict.mex
ggplot(
data = NULL,
mapping,
xlab,
ylab,
main,
ptcol = c("grey", "dark blue", "orange"),
col = "dark blue",
fill = "orange",
shape = 16:18,
size = rep(1, 3),
plot. = TRUE,
...,
environment
)
# S3 method for mex
predict(
object,
which,
pqu = 0.99,
nsim = 1000,
trace = 10,
smoothZdistribution = FALSE,
...
)
# S3 method for predict.mex
summary(object, mth, probs = c(0.05, 0.5, 0.95), ...)
A call to mex
returns an list of class mex
containing
the following three items:
An object of class
migpd
.
An object of class
mexDependence
.
This matches the original function call.
There are plot
, summary
, coef
and predict
methods for this class.
A call to predict.mex
does the importance sampling for prediction,
and returns a list of class "predict.mex"
for which there are print
and plot methods available. The summary method for this class of object is
intended to be used following a call to the predict method, to estimate
quantiles or probabilities of threshold excesses for the fitted conditional
distributions given the conditioning variable above the threshold for
prediction. See examples below.
There are print
, summary
and plot
methods available for
the class "predict.mex".
A numeric matrix or data.frame, the columns of which are to be modelled.
In plot
method for objects of class mex
, the colour
for points on scatterplots of residuals and original data respectively. In
plot
method for objects of class predict.mex
, the colours of
points for observed, and simulated data (conditioning variable not the
largest) and simulated data (conditioning variable is the largest)
respectively.
A vector of quantiles taking values between 0 and 1 specifying the quantiles of the conditional distributions which will be plotted.
Further arguments to be passed to methods.
The variable on which to condition. This can be either scalar, indicating the column number of the conditioning variable, or character, giving the column name of the conditioning variable.
Marginal thresholds. In mex
, the threshold above which to
fit generalized Pareto distributions. If this is a vector of length 1, the
same threshold will be used for each variable. Otherwise, it should be a
vector whose length is equal to the number of columns in data
.
In summary.predict.mex
, the thresholds over which to simulate data
from the fitted multivariate model. If not supplied, it is taken to be the
thresholds that were used to fit the dependence model on the scale of the
original data.
Marginal quantiles As an alternative to specifying the marginal
GPD fitting thresholds via mth
, you can specify the quantile (a
probability) above which to fit generalized Pareto distributions. If this is
a vector of length 1, the same quantile will be used for each variable.
Otherwise, it should be a vector whose length is equal to the number of
columns in data
.
Dependence quantile. Used to specify the quantile at which to
threshold the conditioning variable data when estimating the dependence
parameters. For example dqu=0.7
will result in the data with the
highest 30% of values of the conditioning variable being used to estimate
the dependence parameters. The same threshold will be used for each
dependent variable. If not supplied then the default is to set
dqu=mqu[which]
the quantile corresponding to the threshold used to
fit the marginal model to the tail of the conditioning variable. Note that
there is no requirement for the quantiles used for marginal fitting
(mqu
) and dependence fitting (dqu
) to be the same, or for them
to be ordered in any way.
String, passed through to evm
: how to estimate the covariance.
Defaults to cov = "observed"
.
An object of class "texmexFamily". Should be either
family = gpd
or family = cgpd
and defaults to the first of those.
See documentation for mexDependence
.
See documentation for mexDependence
.
See documentation for mexDependence
.
How to penalize the likelihood when estimating the marginal
generalized Pareto distributions. Defaults to ``gaussian''. See the help
file for evm
for more information.
The maximum number of iterations to be used by the optimizer.
defaults to maxit = 10000
.
Passed internally to optim
. Whether or not to
inform the user of the progress of the optimizer. Defaults to 0, indicating
no trace.
Whether or not to keep the user informed of progress.
Defaults to verbose = FALSE
.
Parameters of prior/penalty used for estimation of
the GPD parameters. This is only used if penalty = "gaussian"
. It
is a named list, each element of which contains two components: the first
component should be a vector of length 2 corresponding to the location of
the Gaussian distribution; the second a 2x2 matrix corresponding to the
covariance matrix of the distribution. The names should match the names of
the columns of data
. If not provided, the default priors are
independent normal, centred at zero, with variance 10000 for phi=log(sigma)
and 0.25 for xi. See the details section.
Object of class mex
or summary.mex
as returned by these functions respectively.
Plotting characters: colours and symbol expansion. The
observed and simulated data are plotted using different symbols, controlled
by these arguments and col
, each of which should be of length 2.
Whether or not to ask before changing the plot. Defaults to
ask = TRUE
.
Further arguments to plot and ggplot methods.
Prediction quantile. Argument to predict
method. The
quantile of the conditioning variable above which it will be simulated for
importance sampling based prediction. Defaults to pqu = .99
.
Argument to predict
method. The number of simulated
observations to be generated for prediction.
In predict.mex
, whether or not to sample
from the smoothed distribution of the underlying residuals. Defaults to
FALSE
, in which case no smoothing is carried out. If TRUE
then each margin of the underlying multivariate residual is smoothed
independently, by using kernel smoothing with a normal kernel, and bandwith
chosen using the bw.nrd
function. This can be useful for
removing "stripeyness" in importance samples which have few values in the
conditional tails.
In summary
method for objects of class
predict.mex
: the quantiles of the conditional distribution(s) to
calculate. Defaults to 5%, 50% and 95%.
Harry Southworth, Janet E. Heffernan
The function mex
works as follows. First, Generalized Pareto
distributions (GPD) are fitted to the upper tails of each of the marginal
distributions of the data: the GPD parameters are estimated for each column
of the data in turn, independently of all other columns. Then, the
conditional multivariate approach of Heffernan and Tawn is used to model the
dependence between variables. The returned object is of class "mex".
This function is a wrapper for calls to migpd
and
mexDependence
, which estimate parameters of the marginal and
dependence components of the Heffernan and Tawn model respectively. See
documentation of these functions for details of modelling issues including
the use of penalties / priors, threshold choice and checking for convergence
of parameter estimates.
The plot
method produces diagnostic plots for the fitted dependence
model described by Heffernan and Tawn, 2004. The plots are best viewed by
using the plotting area split by par(mfcol=c(.,.))
rather than
mfrow
, see examples below. Three diagnostic plots are produced for
each dependent variable:
1) Scatterplots of the residuals Z from the fitted model of Heffernan and
Tawn (2004) are plotted against the quantile of the conditioning variable,
with a lowess curve showing the local mean of these points. 2) The absolute
value of Z-mean(Z)
is also plotted, again with the lowess curve
showing the local mean of these points. Any trend in the location or
scatter of these variables with the conditioning variable indicates a
violation of the model assumption that the residuals Z are indepenendent of
the conditioning variable. This can be indicative of the dependence
threshold used being too low. 3) The final plots show the original data (on
the original scale) and the fitted quantiles (specified by quantiles
)
of the conditional distribution of each dependent variable given the
conditioning variable. A model that fits well will have good agreement
between the distribution of the raw data (shown by the scatter plot) and the
fitted quantiles. Note that the raw data are a sample from the joint
distribution, whereas the quantiles are those of the estimated conditional
distribution given the value of the conditioning variable, and while these
two distributions should move into the same part of the sample space as the
conditioning variable becomes more extreme, they are not the same thing!
The predict
method for mex
works as follows. The returned
object has class "predict.mex". Simulated values of the dependent variables
are created, given that the conditioning variable is above its 100pqu
quantile. If predict.mex
is passed an object object
of class
"mex"
then the simulated values are based only on the point estimate
of the dependence model parameters, and the original data. If
predict.mex
is passed an object object
of class
"bootmex"
then the returned value additionally contains simulated
replicate data sets corresponding to the bootstrap model parameter
estimates. In both cases, the simulated values based on the original data
and point estimates appear in component object$data$simulated
. The
simulated data from the bootstrap estimates appear in
object$replicates
.
The plot
method for class "predict.mex"
displays both the
original data and the simulated data generated above the threshold for
prediction; it shows the threshold for prediction (vertical line) and also
the curve joining equal quantiles of the marginal distributions -- this is
for reference: variables that are perfectly dependent will lie exactly on
this curve. Original data are shown with one plotting character and
simulated data with another; colours of simulated point distinguish those
points which have the conditioning variable as the largest (on a quantile
scale) or not the largest.
The function mexAll
fits a collection of GPD and conditional
dependence models, the same fitted GPD being used for all of the dependence
model fits. This can be used in turn to generate Monte Carlo samples from
the entire sample space usign the collected dependence models.
J. E. Heffernan and J. A. Tawn, A conditional approach for multivariate extreme values, Journal of the Royal Statistical Society B, 66, 497 - 546, 2004
migpd
, mexDependence
,
bootmex
, mexMonteCarlo
w <- mex(winter, mqu=.7, dqu=0.7, which="O3")
w
par(mfcol=c(3, 2))
plot(w)
par(mfcol=c(2,2))
p <- predict(w)
summary(p)
summary(p,probs=c(0.01,0.2,0.5,0.8,0.99))
summary(p,probs=0.5,mth=c(40,50,150,25,50))
p
plot(p)
Run the code above in your browser using DataLab