The classic four diagnostic plots for evaluating extreme value mixture models: 1) return level plot, 2) Q-Q plot, 3) P-P plot and 4) density plot. Each plot is available individually or as the usual 2x2 collection.
evmix.diag(modelfit, upperfocus = TRUE, alpha = 0.05, N = 1000,
legend = FALSE, ...)
rlplot(modelfit, upperfocus = TRUE, alpha = 0.05, N = 1000,
legend = TRUE, rplim = NULL, rllim = NULL, ...)
qplot(modelfit, upperfocus = TRUE, alpha = 0.05, N = 1000,
legend = TRUE, ...)
pplot(modelfit, upperfocus = TRUE, alpha = 0.05, N = 1000,
legend = TRUE, ...)
densplot(modelfit, upperfocus = TRUE, legend = TRUE, ...)
modelfit: fitted extreme value mixture model object
upperfocus: logical, should plot focus on upper tail?
alpha: significance level over range (0, 1), or NULL for no CI
N: number of Monte Carlo simulations for CI (N>=10)
legend: logical, should legend be included?
...: further arguments to be passed to the plotting functions
rplim: return period range
rllim: return level range
rlplot gives the return level plot, qplot gives the Q-Q plot, pplot gives the P-P plot, densplot gives the density plot and evmix.diag gives the collection of all 4.
These functions are based on the GPD/POT diagnostic function plot.uvevd in the evd package, for which Stuart Coles' and Alec Stephenson's contributions are gratefully acknowledged.
They are designed to have similar syntax and functionality to simplify the transition for users of these packages.
Model diagnostics are available for all the fitted extreme value mixture models in the evmix package. The modelfit object is output by all the fitting functions, e.g. fgpd and fnormgpd.
Consistent with the plot function in the evd library, the ppoints function is used to estimate the empirical cumulative probabilities. The default behaviour of this function for samples of more than 10 observations is to use \((i-0.5)/n\) as the estimate for the \(i\)th order statistic of the given sample of size \(n\).
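The plotting positions described above can be checked directly in base R. This is a small sketch for illustration only; the variable names are arbitrary and it says nothing about how evmix calls ppoints internally. It also shows that ppoints switches to a different offset for samples of 10 or fewer observations.

```r
# Plotting positions for a sample of size n > 10: ppoints uses a = 1/2
n <- 20
p_default <- ppoints(n)
p_manual <- (seq_len(n) - 0.5) / n
all.equal(p_default, p_manual)

# For n <= 10 the default offset is a = 3/8, i.e. (i - a) / (n + 1 - 2a)
m <- 5
p_small <- ppoints(m)
p_small_manual <- (seq_len(m) - 3/8) / (m + 1 - 2 * 3/8)
all.equal(p_small, p_small_manual)
```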
The return level plot has the quantile \(q\), where \(P(X \ge q)=p\) for a particular survival probability \(p\), on the \(y\)-axis. The return period \(t=1/p\) is shown on the \(x\)-axis. The return level is given by:
$$q = u + \sigma_u [(\phi_u t)^\xi - 1]/\xi$$
for \(\xi\ne 0\). In the case of \(\xi = 0\) this simplifies to
$$q = u + \sigma_u \log(\phi_u t)$$
which is linear when plotted against the return period on a logarithmic scale, so exponential/Type I (\(\xi=0\)) upper tail behaviour appears as a straight line. This is the same transformation as in the GPD/POT diagnostic plot function plot.uvevd in the evd package, from which these functions were derived.
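The return level formula can be written out as a short base R function. This is a sketch under stated assumptions, not code from the evmix package: the function name return_level and the parameter values are hypothetical, and the formula applies only above the threshold, that is when \(\phi_u t \ge 1\).

```r
# Hypothetical helper: return level for return period t above the threshold u
return_level <- function(t, u, sigmau, xi, phiu) {
  if (abs(xi) < 1e-8) {
    # xi = 0 (exponential tail): linear in log(t)
    u + sigmau * log(phiu * t)
  } else {
    u + sigmau * ((phiu * t)^xi - 1) / xi
  }
}

# Illustrative (made-up) parameter values
u <- 1.5; sigmau <- 0.8; xi <- 0.1; phiu <- 0.1
t <- c(20, 100, 1000)
q <- return_level(t, u, sigmau, xi, phiu)

# Check: the survival probability at q should equal 1/t
surv <- phiu * (1 + xi * (q - u) / sigmau)^(-1 / xi)
all.equal(surv, 1 / t)
```

Plotting q against log(t) for xi = 0 gives a straight line with slope sigmau, which is the linearity property described above.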
The crosses are the empirical quantiles/return levels (i.e. the ordered sample data) against their corresponding transformed empirical return period (from ppoints). The solid line is the theoretical return level (quantile) function using the estimated parameters. The estimated threshold u and tail fraction phiu are shown. For the two-tailed models both thresholds ul and ur and corresponding tail fractions phiul and phiur are shown. The approximate pointwise confidence intervals for the quantiles are obtained by Monte Carlo simulation using the estimated parameters.
Notice that these intervals ignore the parameter estimation uncertainty.
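The idea behind simulation-based pointwise intervals can be sketched in base R. This is an assumption-laden illustration, not the evmix implementation: an exponential distribution stands in for the fitted mixture model, and the names n, N, alpha, sims and ci are chosen here for the example. Parameters are held fixed at their estimates, which is why the intervals ignore estimation uncertainty.

```r
set.seed(1)
n <- 200       # sample size
N <- 1000      # number of Monte Carlo simulations
alpha <- 0.05  # significance level

# Stand-in "fitted model": exponential with rate estimated from the data
x <- sort(rexp(n, rate = 2))
rate_hat <- 1 / mean(x)

# Each column is one sorted sample simulated from the fitted model
sims <- replicate(N, sort(rexp(n, rate = rate_hat)))

# Pointwise interval for each order statistic (rows of sims);
# result is a 2 x n matrix: row 1 lower limits, row 2 upper limits
ci <- apply(sims, 1, quantile, probs = c(alpha / 2, 1 - alpha / 2))
```

The columns of ci line up with the ordered sample x, so they can be drawn as dashed lines around the fitted quantile curve.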
The Q-Q and P-P plots have the empirical values on the \(y\)-axis and theoretical values from the fitted model on the \(x\)-axis.
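The coordinates of both plots can be built by hand in base R. The following sketch assumes a plain normal fit as a stand-in for a fitted mixture model; the data frames qq and pp are names invented for this example.

```r
set.seed(1)
x <- sort(rnorm(100, mean = 1, sd = 2))
n <- length(x)
p_emp <- ppoints(n)  # empirical cumulative probabilities

# Stand-in fitted model: normal with estimated mean and sd
mu_hat <- mean(x)
sd_hat <- sd(x)

# Q-Q: theoretical quantiles (x-axis) against ordered data (y-axis)
qq <- data.frame(theoretical = qnorm(p_emp, mu_hat, sd_hat), empirical = x)

# P-P: fitted probabilities (x-axis) against empirical probabilities (y-axis)
pp <- data.frame(theoretical = pnorm(x, mu_hat, sd_hat), empirical = p_emp)

plot(qq, main = "Q-Q plot"); abline(0, 1)
plot(pp, main = "P-P plot"); abline(0, 1)
```

Points lying close to the line of equality in both plots indicate a good fit.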
The density plot provides a histogram of the sample data overlaid with the fitted density and a standard kernel density estimate using the density function. The default settings for the density function are used.
Note that for distributions with bounded support (e.g. GPD) with high density near the
boundary standard kernel density estimators exhibit a negative bias due to leakage past
the boundary. So in this case they should not be taken too seriously.
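The boundary bias is easy to see with a simulated example. This sketch uses a unit exponential sample, whose true density at the boundary 0 is 1, and the default density settings; the variable names are chosen for illustration only.

```r
set.seed(1)
x <- rexp(1000)    # support [0, Inf), true density at 0 is 1
kde <- density(x)  # default kernel and bandwidth

# Kernel estimate at the boundary, interpolated from the evaluation grid
f0 <- approx(kde$x, kde$y, xout = 0)$y

# Approximate probability mass the estimate places below 0 (the leakage)
mass_below <- sum(kde$y[kde$x < 0]) * diff(kde$x[1:2])

c(estimate_at_0 = f0, true_at_0 = 1, mass_below_0 = mass_below)
```

The estimate at 0 falls well short of the true value because part of the kernel mass is placed on negative values, where the distribution has no support.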
For the kernel density estimates (i.e. kden and bckden) there is no threshold, so no upper tail focus is carried out.
See plot.uvevd for more detailed explanations of these types of plots.
http://en.wikipedia.org/wiki/Q-Q_plot
http://en.wikipedia.org/wiki/P-P_plot
Scarrott, C.J. and MacDonald, A. (2012). A review of extreme value threshold estimation and uncertainty quantification. REVSTAT - Statistical Journal 10(1), 33-59. Available from http://www.ine.pt/revstat/pdf/rs120102.pdf
Coles S.G. (2004). An Introduction to the Statistical Modelling of Extreme Values. Springer-Verlag: London.
ppoints, plot.uvevd and gpd.diag.
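library(evmix)

set.seed(1)
# simulated standard normal sample
x = sort(rnorm(1000))
# fit the normal bulk with GPD upper tail mixture model
fit = fnormgpd(x)
# 2x2 collection of all four diagnostic plots
evmix.diag(fit)

# repeat without focussing on upper tail
par(mfrow=c(2,2))
rlplot(fit, upperfocus = FALSE)
qplot(fit, upperfocus = FALSE)
pplot(fit, upperfocus = FALSE)
densplot(fit, upperfocus = FALSE)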