fitmcgpd: Fitting Markov Chain Models to Peaks Over a Threshold

Description

Fitting a Markov chain to cluster exceedances using a bivariate extreme value distribution and a censored maximum likelihood procedure.

Usage

fitmcgpd(data, threshold, model = "log", start, ...,
         std.err.type = "observed", corr = FALSE, warn.inf = TRUE,
         method = "BFGS")

Value

The function returns an object of class c("mcpot", "uvpot", "pot"). As usual, several features can be extracted using the fitted (or fitted.values), deviance, logLik and AIC functions. The object contains the following components:

fitted.values: The maximum likelihood estimates of the Markov chain including estimated parameters of the bivariate extreme value distribution.
std.err: A vector containing the standard errors - only present when the observed information matrix is not singular.
var.cov: The asymptotic variance-covariance matrix - only present when the observed information matrix is not singular.
deviance: The deviance.
corr: The correlation matrix.
convergence, counts, message: Information taken from the optim function.
threshold: The threshold.
pat: The proportion of observations above the threshold.
nat: The number of observations above the threshold.
data: The observations.
exceed: The exceedances.
call: The call of the current function.
model: The model for the bivariate extreme value distribution.
chi: The chi statistic of Coles (1999). A value near 1 (resp. 0) indicates perfect dependence (resp. independence).
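
As a rough illustration of how chi relates to the dependence parameter, the sketch below assumes the usual bivariate extreme value identity chi = 2 - 2 A(1/2), where A is the Pickands dependence function. For the logistic model this reduces to chi = 2 - 2^alpha. The helper name chi_logistic is hypothetical and is not part of the package; the value stored in the chi component may be computed differently.

```r
# Hypothetical helper (not part of POT): chi for the logistic model,
# assuming chi = 2 - 2 * A(1/2) with
# A(w) = ((1 - w)^(1/alpha) + w^(1/alpha))^alpha, so that A(1/2) = 2^(alpha - 1).
chi_logistic <- function(alpha) {
  stopifnot(all(alpha > 0 & alpha <= 1))
  2 - 2^alpha
}

chi_logistic(1)     # alpha = 1 is the independence case, so chi is 0
chi_logistic(0.25)  # stronger dependence, chi is about 0.81
```

Under this assumption, chi moves from 0 towards 1 as alpha decreases from 1 towards 0, which matches the interpretation given for the chi component.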

Arguments

data: A vector of observations.
threshold: The threshold value.
model: A character string which specifies the model used. Must be one of log (the default), alog, nlog, anlog, mix and amix for the logistic, asymmetric logistic, negative logistic, asymmetric negative logistic, mixed and asymmetric mixed models, respectively.
start: Optional. A list for starting values in the fitting procedure.
...: Additional parameters to be passed to the optim function or to the bivariate model. In particular, parameters of the model can be fixed by hand.
std.err.type: The type of the standard error. Currently, one must specify "observed" for the observed Fisher information matrix or "none" to skip the computation of standard errors.
corr: Logical. Should the correlation matrix be computed?
warn.inf: Logical. Should users be warned if the likelihood is not finite at the starting values?
method: The optimization method, see optim.

Warnings

Because of numerical problems, artificial numerical constraints are imposed on each model. These are:

For the logistic and asymmetric logistic models: $\alpha$ must lie in [0.05, 1] instead of [0, 1];
For the negative logistic model: $\alpha$ must lie in [0.01, 15] instead of $[0, \infty)$;
For the asymmetric negative logistic model: $\alpha$ must lie in [0.2, 15] instead of $[0, \infty)$;
For the mixed and asymmetric mixed models: no artificial numerical constraints are imposed.

Users must therefore check whether the estimates are near these artificial numerical constraints. Such cases may lead to substantial biases in the GP parameter estimates. A quick way to detect whether estimates are near the boundary constraints is to look at the standard errors of the dependence parameters. Small values (i.e. < 1e-5) often indicate that the numerical constraints have been reached.
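
The check on standard errors can be sketched as a small helper. This is only an illustration: the function name flag_small_se is hypothetical, and the parameter names used to pick out the dependence parameters ("alpha", "asCoef", "asCoef1", "asCoef2") are an assumption about how the fitted object labels them.

```r
# Hypothetical helper (not part of POT): return the dependence parameters
# whose standard error falls below the tolerance quoted in the text.
# Assumes std.err is a named numeric vector; the names in dep.names are an
# assumption about how the dependence parameters are labelled.
flag_small_se <- function(std.err,
                          dep.names = c("alpha", "asCoef", "asCoef1", "asCoef2"),
                          tol = 1e-5) {
  se <- std.err[names(std.err) %in% dep.names]
  se[!is.na(se) & se < tol]
}

# Illustrative values only, not output from a real fit
se <- c(scale = 0.08, shape = 0.05, alpha = 2e-7)
flag_small_se(se)
```

With a real fit, one would pass the std.err component of the object returned by fitmcgpd, for example flag_small_se(fit$std.err). A non-empty result is a hint to inspect the estimates, not proof that a constraint was hit.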

In addition, users should be aware that the mixed and asymmetric mixed models cannot deal with perfect dependence.

Users may therefore want to plot the Pickands dependence function with the pickdep function to see whether the variables are close to the independence or perfect dependence cases.

We also recommend fixing the marginal parameters. Even though this turns the fit into a two-step optimization procedure, it avoids numerical trouble, as the likelihood function of the Markov chain model seems to be problematic. Estimates are therefore often better with the two-stage approach.

Author

Mathieu Ribatet

Details

The Markov chain model is defined as follows: $$L\left(y;\theta_1,\theta_2\right) = f\left(y_1; \theta_1\right) \prod_{i=2}^n f\left(y_i | y_{i-1};\theta_1,\theta_2\right)$$

As exceedances above a (high enough) threshold are of interest, it is assumed that the margins are GP distributed, while the joint distribution of consecutive observations is represented by a bivariate extreme value distribution. Smith et al. (1997) present theoretical results about this Markov chain model.
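
As a sketch of how these two ingredients enter the likelihood, each conditional density can be written as the ratio of a joint density to a marginal density, which is the standard factorization for a first-order Markov chain. The GP form below uses the symbols $u$ (threshold), $\sigma$ (scale) and $\xi$ (shape); these are notation introduced here for illustration, with $\theta_1 = (\sigma, \xi)$ assumed to hold the marginal parameters and $\theta_2$ the dependence parameters.

$$f\left(y_i | y_{i-1};\theta_1,\theta_2\right) = \frac{f\left(y_{i-1}, y_i;\theta_1,\theta_2\right)}{f\left(y_{i-1};\theta_1\right)}, \qquad \Pr\left(Y \le y \mid Y > u\right) = 1 - \left(1 + \xi \frac{y - u}{\sigma}\right)^{-1/\xi}, \quad y > u$$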

The bivariate exceedances are fitted using a censored likelihood procedure. This methodology is fully described in Ledford and Tawn (1996).

Most of the models are described in Klüppelberg and May (2006).

References

Klüppelberg, C., and May, A. (2006) Bivariate extreme value distributions based on polynomial dependence functions. Mathematical Methods in the Applied Sciences, 29, 1467–1480.

Ledford, A., and Tawn, J. (1996) Statistics for near independence in multivariate extreme values. Biometrika, 83, 169–187.

Smith, R., Tawn, J., and Coles, S. (1997) Markov chain models for threshold exceedances. Biometrika, 84, 249–268.

Examples


library(POT)

## Simulate a Markov chain, then transform it to GP margins
mc <- simmc(1000, alpha = 0.25)
mc <- qgpd(mc, 0, 1, 0.25)

## A first application where the marginal parameters are estimated
fitmcgpd(mc, 0)

## Another one where the marginal parameters are fixed
fmle <- fitgpd(mc, 0)
fitmcgpd(mc, 0, scale = fmle$param["scale"], shape = fmle$param["shape"])
