
texmex (version 2.4.9)

extremalIndex: Extremal index estimation and automatic declustering

Description

Given a threshold that defines exceedances, estimate the extremal index of a dependent sequence using the intervals method of Ferro and Segers (2003). The extremal index estimate can then be used to decluster the sequence automatically, identifying independent clusters, and to fit the GPD to the cluster maxima. Graphical diagnostics of model fit are available.

Usage

extremalIndex(y, data = NULL, threshold)

extremalIndexRangeFit(y, data = NULL, umin = quantile(y,.5), umax = quantile(y, 0.95), nint = 10, nboot = 100, alpha = .05, estGPD=TRUE, verbose = TRUE, trace = 10, ...)

bootExtremalIndex(x)

declust(y, r=NULL, data = NULL, ...)

# S3 method for extremalIndex
declust(y, r = NULL, ...)

# S3 method for declustered
plot(x, ylab = "Data", ...)

# S3 method for declustered
evm(y, data = NULL, family = gpd, ...)

# S3 method for extremalIndexRangeFit
plot(x, addNexcesses = TRUE, estGPD = TRUE, ...)

# S3 method for extremalIndex
print(x, ...)

# S3 method for declustered
print(x, ...)

# S3 method for extremalIndexRangeFit
ggplot(data = NULL, mapping, xlab, ylab, main, ylim = "auto", ptcol = "dark blue", col = "dark blue", fill = "orange", textsize = 4, addNexcesses = TRUE, estGPD = TRUE, ..., environment)

Value

The function extremalIndex returns a list of class "extremalIndex":

EIintervals

Estimate of the extremal index by using the intervals estimator of Ferro and Segers.

threshold

threshold for declustering and estimation

TotalN

length of original data series

nExceed

number of exceedances of threshold in original series.

thExceedanceProb

probability of threshold exceedance in original series.

call

the original function call

interExceedTimes

times between threshold exceedances

thExceedances

observations from the original series which are above threshold

exceedanceTimes

times of occurrence of threshold exceedances

y

original dependent series

data

data frame or NULL

The function declust returns a list of class "declustered":

clusters

integer labels assigning threshold exceedances to clusters

sizes

number of exceedances in each cluster

clusterMaxima

vector made up of the largest observation from each distinct cluster. In the case of ties, the first value is taken.

isClusterMax

logical; length equal to number of threshold exceedances, value is TRUE for threshold exceedances which correspond to cluster maxima

y

see entry for object of class "extremalIndex" above

data

see entry for object of class "extremalIndex" above

threshold

see entry for object of class "extremalIndex" above

EIintervals

see entry for object of class "extremalIndex" above

call

see entry for object of class "extremalIndex" above

InterExceedTimes

times between threshold exceedances, length is one less than the number of threshold exceedances

InterCluster

logical: indicates interexceedance times larger than r, the run length used for declustering

thExceedances

see entry for object of class "extremalIndex" above

exceedanceTimes

see entry for object of class "extremalIndex" above

r

run length used for declustering

nClusters

Number of independent clusters identified

method

Method used for declustering (either "intervals" or "runs")

The function bootExtremalIndex returns a single vector corresponding to a bootstrap sample from the original series. Observations are censored at threshold, so that values below the threshold are indicated by the value -1.

The method evm for class "declustered" returns an object of class "evmOpt" or "evmSim", depending on the precise function call; see the documentation for evm.

Arguments

y

Argument to function extremalIndex: either a numeric vector or the name of a variable in data.

data

A data frame containing y and any covariates. In evm.declustered, it should be NULL and is included to match the arguments of generic evm.

threshold

The threshold for y. Exceedances of this threshold are used to estimate the extremal index and to carry out automatic declustering.

family

The type of extreme value model. The user should not change this from its default in evm.declustered.

x

Objects passed to methods.

r

Positive integer: run length to be used under "runs" declustering. If specified, "runs" declustering is carried out. Otherwise it defaults to NULL, in which case the automatic "intervals" declustering method of Ferro and Segers is used.

umin

The minimum threshold above which to estimate the parameters.

umax

The maximum threshold above which to estimate the parameters.

nint

The number of thresholds at which to perform the estimation.

nboot

Number of bootstrap samples to simulate at each threshold for estimation.

alpha

100(1 - alpha)% confidence intervals will be plotted with the point estimates. Defaults to alpha = 0.05.

xlab

Label for the x-axis (ggplot).

ylab

Label for the y-axis (ggplot).

addNexcesses

Whether to annotate the top axis of plots with the number of excesses above the corresponding threshold. Defaults to TRUE.

estGPD

Whether to estimate GPD parameters at each choice of threshold. Defaults to TRUE, in which case the GPD parameters are estimated.

verbose

Whether to report on progress in RangeFit calculations. Defaults to TRUE.

trace

How frequently to report bootstrap progress in RangeFit calculations. Defaults to 10.

mapping, main, ylim, ptcol, col, fill, textsize, environment

Further arguments to ggplot method.

...

Further arguments to methods.

Author

Janet E. Heffernan

Details

The function extremalIndex estimates the extremal index of a dependent series of observations above a given threshold (the argument threshold), returning an object of class "extremalIndex". Plot and print methods are available for this class. The plot method produces a graphical diagnostic akin to Figure 1 in Ferro and Segers (2003). This plot is used to check the model assumption underpinning the estimation: good fit is indicated when the interexceedance times that correspond to inter-cluster times lie close to the diagonal line shown.
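As a rough illustration of what the intervals estimator computes, the following base R sketch applies the Ferro and Segers (2003) formula to the interexceedance times of a series. The helper intervals_estimator and the simulated series are assumptions made for this illustration; this is not the texmex implementation, which may handle edge cases differently.

```r
# Sketch of the intervals estimator of Ferro and Segers (2003).
# `intervals_estimator` is a hypothetical helper, not part of texmex.
intervals_estimator <- function(y, threshold) {
  exceed_times <- which(y > threshold)   # positions of threshold exceedances
  n_exceed <- length(exceed_times)
  if (n_exceed < 2) return(NA_real_)     # no interexceedance times to use
  gaps <- diff(exceed_times)             # interexceedance times
  if (max(gaps) <= 2) {
    est <- 2 * sum(gaps)^2 / ((n_exceed - 1) * sum(gaps^2))
  } else {
    est <- 2 * sum(gaps - 1)^2 /
      ((n_exceed - 1) * sum((gaps - 1) * (gaps - 2)))
  }
  min(1, est)                            # the extremal index cannot exceed 1
}

# Simulated dependent series (an assumption for illustration): a moving
# maximum of independent values, so large values tend to occur in pairs.
set.seed(1)
z <- rnorm(2001)
y_dep <- pmax(z[-1], z[-2001])
theta_hat <- intervals_estimator(y_dep, quantile(y_dep, 0.95))
print(theta_hat)
```

For a moving maximum of two independent values the true extremal index is 0.5, so an estimate in that region would be expected, subject to sampling variability.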

In addition to good model fit, an appropriate choice of threshold is one above which the estimated extremal index is stable over further, higher thresholds (up to estimation uncertainty). This can be assessed by using the function extremalIndexRangeFit, which examines a range of threshold values. At each threshold, the extremal index is estimated; that estimate is used to decluster the series and the parameters of the GPD are optionally estimated for the resulting declustered series. Uncertainty in the estimation of the extremal index and GPD parameters is assessed by using a bootstrap scheme which accounts for uncertainty in the extremal index estimation, and the corresponding uncertainty in the declustering of the series. There are plot and ggplot methods for output of this function, which is of class extremalIndexRangeFit.

The function declust returns an object of class "declustered", identifying independent clusters in the original series. Print, plot and show methods are available for this class. The GPD model can be fitted to objects of this class, including the use of covariates in the linear predictors for the parameters of the GPD. See examples below.
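To make the idea of "runs" declustering concrete, here is a minimal base R sketch, assuming (as in the description of InterCluster above) that a new cluster starts whenever an interexceedance time is larger than the run length r. The helper runs_decluster and the toy series are hypothetical; the element names merely mirror those documented for class "declustered", and the code is not taken from texmex.

```r
# Sketch of "runs" declustering with run length r.
# `runs_decluster` is a hypothetical helper, not the texmex implementation.
runs_decluster <- function(y, threshold, r) {
  exceed_times <- which(y > threshold)      # positions of exceedances
  if (length(exceed_times) == 0) stop("no threshold exceedances")
  gaps <- diff(exceed_times)                # interexceedance times
  inter_cluster <- gaps > r                 # TRUE where a new cluster begins
  clusters <- cumsum(c(TRUE, inter_cluster))  # integer cluster labels
  exceedances <- y[exceed_times]
  list(
    clusters      = clusters,
    sizes         = as.vector(table(clusters)),
    clusterMaxima = as.vector(tapply(exceedances, clusters, max)),
    nClusters     = max(clusters)
  )
}

# Toy series (assumed for illustration): exceedances of 4 at times 2, 3, 7, 9, 14
y_toy <- c(1, 5, 6, 1, 1, 1, 7, 1, 8, 1, 1, 1, 1, 9)
dc <- runs_decluster(y_toy, threshold = 4, r = 2)
print(dc$clusters)
print(dc$clusterMaxima)
```

With r = 2, the exceedances at times 7 and 9 fall in the same cluster because they are only two steps apart, while the gaps of 4 and 5 steps separate clusters.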

References

Ferro, C.A.T. and Segers, J. (2003) "Inference for clusters of extreme values", Journal of the Royal Statistical Society: Series B, 65(2), pp. 545-556.

See Also

evm

Examples

library(texmex) # provides extremalIndex, declust, evm and the summer and winter data
par(mfrow=c(2,2))
extremalIndexRangeFit(summer$O3,nboot=10)
ei <- extremalIndex(summer$O3,threshold=45)
plot(ei)
d <- declust(ei)
plot(d)
evm(d)

## fitting with covariates:

so2 <- extremalIndex(SO2,data=winter,threshold=15)
plot(so2)
so2 <- extremalIndex(SO2,data=winter,threshold=20)
plot(so2) ## fits better

so2.d <- declust(so2)
par(mfrow=c(1,1)); plot(so2.d)
so2.d.gpd <- evm(so2.d) # AIC 661.1

evm(so2.d,phi=~NO)
evm(so2.d,phi=~NO2)
evm(so2.d,phi=~O3) # better AIC 651.9
evm(so2.d,phi=~PM10)

so2.d.gpd.o3 <- evm(so2.d,phi=~O3)

par(mfrow=c(2,2)); plot(so2.d.gpd.o3)
