Computes discordancy, heterogeneity and goodness-of-fit measures for regional frequency analysis. These are the statistics \(D_i\), \(H\), and \(Z^{\rm DIST}\) defined respectively in sections 3.2.3, 4.3.3, and 5.2.3 of Hosking and Wallis (1997).
regtst(regdata, nsim=1000)
regtst.s(regdata, nsim=1000)
An object of class "regtst", which is a list with elements as follows.
The input data, i.e. data frame regdata after coercion to class "regdata" if necessary.
Number of simulations, i.e. the argument nsim.
Vector containing the discordancy measures for each site.
Vector of length 2 containing critical values of the discordancy measure corresponding to significance levels of 10 and 5 per cent, except that the values never exceed 3 and 4 respectively. See Hosking and Wallis (1997), section 3.2.4.
Vector of length 5 containing the regional weighted average \(L\)-moment ratios (weights proportional to record lengths).
Vector of length 4 containing the parameters of a kappa distribution fitted to the regional weighted average \(L\)-moment ratios.
Vector of length 3 containing the observed values of the three measures of between-site dispersion of \(L\)-moment ratios.
Vector of length 3 containing the mean of the simulated values of the three dispersion measures.
Vector of length 3 containing the standard deviation of the simulated values of the three dispersion measures.
Vector of length 3 containing the three measures of regional heterogeneity.
List of length 6 containing the parameters of the five candidate distributions and the Wakeby distribution (3-letter abbreviation "wak") fitted to the regional weighted average \(L\)-moment ratios.
Vector of length 5 containing the \(L\)-kurtosis of the five candidate distributions fitted to the regional weighted average \(L\)-moment ratios.
Vector of length 5 containing the goodness-of-fit measures for each of the five candidate distributions.
Object of class regdata containing the input data.
It should be a data frame, each of whose rows contains data for one site.
The first seven columns should contain respectively
the site name, record length and \(L\)-moments
and \(L\)-moment ratios, in the order
\(\ell_1\) (mean),
\(t\) (\(L\)-CV),
\(t_3\) (\(L\)-skewness),
\(t_4\) (\(L\)-kurtosis),
and \(t_5\).
Note that the fourth column should contain values of the \(L\)-CV \(t\), not the \(L\)-scale \(\ell_2\)!
Function regsamlmu, with default settings of its arguments, returns an object of class "regdata".
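As an illustration of the column layout described for regdata, the following base-R sketch builds such a data frame by hand, converting the \(L\)-scale \(\ell_2\) to the \(L\)-CV \(t = \ell_2/\ell_1\) before filling the fourth column. The site names, the numeric values, and the column names are assumptions made for this example; only the column order is taken from the description.

```r
# Hypothetical per-site sample L-moments (illustrative values only).
site_name <- c("A", "B", "C")
n         <- c(30, 45, 25)          # record lengths
l_1       <- c(20.0, 35.0, 12.5)    # mean
l_2       <- c(4.0, 6.3, 2.0)       # L-scale: NOT what the fourth column expects
t_3       <- c(0.15, 0.20, 0.10)    # L-skewness
t_4       <- c(0.12, 0.18, 0.14)    # L-kurtosis
t_5       <- c(0.05, 0.07, 0.03)

# The L-CV is the ratio of the L-scale to the mean.
lcv <- l_2 / l_1

# Assemble the seven columns in the required order.
# The column names are an assumption for this sketch.
regdata_like <- data.frame(name = site_name, n = n, l_1 = l_1, t = lcv,
                           t_3 = t_3, t_4 = t_4, t_5 = t_5)
print(regdata_like)
```

In practice regsamlmu produces this layout directly from the raw data, so hand assembly like this is mainly useful when only published \(L\)-moments are available.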
Number of simulations to use in the calculation of the heterogeneity and goodness-of-fit measures.
If less than 2, only the discordancy measure will be calculated.
J. R. M. Hosking jrmhosking@gmail.com
The discordancy measure \(D_i\) indicates, for site \(i\),
the discordancy between the site's \(L\)-moment ratios
and the (unweighted) regional average \(L\)-moment ratios.
Large values might be used as a flag to indicate potential errors
in the data at the site. “Large” might be 3 for regions with 15
or more sites, but less (exact values in list element Dcrit) for smaller regions.
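For readers who want to see the arithmetic, here is a minimal base-R sketch of the discordancy measure as defined in section 3.2.3 of Hosking and Wallis (1997). Each site's vector \(u_i = (t, t_3, t_4)\) is compared with the unweighted mean \(\bar u\) of those vectors, using the matrix \(A\) of sums of squares and cross-products of the deviations, to give \(D_i = \frac{1}{3} N (u_i - \bar u)^T A^{-1} (u_i - \bar u)\). The function name discordancy and the input values are hypothetical; this is not the package's own code.

```r
# Sketch of the discordancy measure D_i; 'discordancy' is a hypothetical helper.
# u: numeric matrix with one row per site and columns t, t_3, t_4.
discordancy <- function(u) {
  u <- as.matrix(u)
  nsites <- nrow(u)
  # Deviations of each site from the unweighted regional average.
  dev <- sweep(u, 2, colMeans(u))
  # Matrix A of sums of squares and cross-products of the deviations.
  A <- crossprod(dev)
  # Quadratic form dev_i' A^{-1} dev_i for every site, scaled by N/3.
  (nsites / 3) * rowSums((dev %*% solve(A)) * dev)
}

# Made-up L-moment ratios for six sites.
u <- cbind(t   = c(0.20, 0.18, 0.16, 0.22, 0.19, 0.30),
           t_3 = c(0.15, 0.20, 0.10, 0.18, 0.12, 0.35),
           t_4 = c(0.12, 0.18, 0.14, 0.16, 0.11, 0.30))
D <- discordancy(u)
print(round(D, 2))
```

By construction the \(D_i\) average to 1 over the region, which gives a quick sanity check on any reimplementation.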
Three heterogeneity measures are calculated, each based on
a different measure of between-site dispersion of \(L\)-moment ratios:
[1] weighted standard deviation of \(L\)-CVs;
[2] average of \(L\)-CV/\(L\)-skew distances;
[3] average of \(L\)-skew/\(L\)-kurtosis distances.
These dispersion measures are the quantities \(V\), \(V_2\),
and \(V_3\) defined respectively in equations (4.4), (4.6), and (4.7)
of Hosking and Wallis (1997).
The heterogeneity measures are calculated from them as in
equation (4.5) of Hosking and Wallis (1997).
In practice H[1] is probably sufficient. A value greater than
(say) 1.0 suggests that further subdivision of the region should
be considered as it might improve the accuracy of quantile estimates.
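The relation between a dispersion measure and its heterogeneity measure can be sketched in a few lines of base R. The code below computes the first dispersion measure, the record-length-weighted standard deviation of the \(L\)-CVs, and then standardizes it by the mean and standard deviation of the simulated values, \(H = (V_{\rm obs} - \mu_V)/\sigma_V\). The helper names and all numbers, including the simulated mean and standard deviation, are hypothetical placeholders rather than results of a real simulation.

```r
# Weighted standard deviation of at-site L-CVs (first dispersion measure).
# 'weighted_sd_lcv' and 'heterogeneity' are hypothetical helper names.
weighted_sd_lcv <- function(lcv, n) {
  regional_lcv <- sum(n * lcv) / sum(n)   # record-length-weighted average
  sqrt(sum(n * (lcv - regional_lcv)^2) / sum(n))
}

# Standardize an observed dispersion by the simulated mean and sd.
heterogeneity <- function(v_obs, v_sim_mean, v_sim_sd) {
  (v_obs - v_sim_mean) / v_sim_sd
}

lcv <- c(0.20, 0.18, 0.16, 0.22)   # made-up at-site L-CVs
n   <- c(30, 45, 25, 40)           # made-up record lengths

v_obs <- weighted_sd_lcv(lcv, n)
# Placeholder values standing in for the Monte Carlo results.
H1 <- heterogeneity(v_obs, v_sim_mean = 0.015, v_sim_sd = 0.005)
print(c(V = v_obs, H = H1))
```

With these placeholder inputs the resulting value exceeds 1.0, the situation in which subdividing the region would be worth considering.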
Goodness of fit is evaluated for five candidate distributions:
generalized logistic,
generalized extreme value,
generalized normal (lognormal),
Pearson type III (3-parameter gamma), and
generalized Pareto.
In the output the distributions are referred to by 3-letter abbreviations,
respectively glo, gev, gno, pe3, and gpa.
If the region is homogeneous and data at different sites are
statistically independent, then, if one of the distributions is
the true distribution for the region, its goodness-of-fit measure
should have approximately a standard normal distribution.
Provided that the region is acceptably close to homogeneous,
the fit may be judged acceptable at the 10 per cent significance level
if the \(Z\) value is less than 1.645 (i.e., qnorm(0.95)) in absolute value.
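This acceptance rule amounts to a one-line comparison. The sketch below applies it to a named vector of made-up \(Z\) values; the numbers are purely illustrative and are not output from regtst.

```r
# Hypothetical goodness-of-fit measures for the five candidate distributions.
z <- c(glo = 2.31, gev = 0.42, gno = -0.87, pe3 = -1.98, gpa = -3.55)

# Two-sided 10 per cent criterion: |Z| below the 95% standard normal quantile.
z_crit <- qnorm(0.95)
acceptable <- names(z)[abs(z) < z_crit]
print(acceptable)
```

Several distributions may pass at once; the rule only screens out poor fits and does not rank the remaining candidates.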
Calculation of heterogeneity and goodness-of-fit measures
involves the sampling variability of \(L\)-moment ratios
in a homogeneous region whose record lengths and
average \(L\)-moment ratios match those of the data.
The sampling variability is estimated by Monte Carlo simulation
using nsim replications of the region.
Results will vary between invocations of regtst
with different seeds for the random-number generator.
In the homogeneous region used in the simulations, the sites have a
kappa distribution, fitted to the regional average \(L\)-moment ratios
of the data in regdata. The kappa fit may fail if the regional average
\(L\)-kurtosis is high relative to the regional average \(L\)-skewness.
In this case a kappa distribution is fitted with shape parameter
\(h\) constrained to be \(-1\) (i.e., a generalized logistic distribution);
this gives the largest possible \(L\)-kurtosis value for a kappa distribution
with given \(L\)-skewness.
regtst and regtst.s are functionally identical.
regtst calls a Fortran routine internally and is faster,
typically by a factor of 3 or 4.
regtst.s is written almost entirely in the S language;
it is provided so that users can see how the calculations are done,
and can conveniently alter the code for their own purposes if necessary.
Hosking, J. R. M. (1996). Fortran routines for use with the method of \(L\)-moments, Version 3. Research Report RC20525, IBM Research Division, Yorktown Heights, N.Y.
Hosking, J. R. M., and Wallis, J. R. (1997). Regional frequency analysis: an approach based on \(L\)-moments. Cambridge University Press.
summary.regtst for summaries.
# An example from Hosking (1996). Compare the output with
# the file 'cascades.out' in the LMOMENTS Fortran package at
# https://lib.stat.cmu.edu/general/lmoments (results will not
# be identical, because random-number generators are different).
summary(regtst(Cascades, nsim=500))
# Output from 'regsamlmu' can be fed straight into 'regtst'
regtst(regsamlmu(Maxwind))