This function uses the method of maximum product of spacings (MPS) (maximum spacing estimation or maximum product of spacings estimation) to estimate the parameters of a distribution. MPS is based on maximization of the geometric mean of probability spacings in the data, where the spacings are defined as the differences between the values of the cumulative distribution function, \(F(x)\), at sequentially ordered data values.
MPS (Dey et al., 2016, pp. 13--14) is an optimization problem formed by maximizing the geometric mean of the spacing between consecutively ordered observations standardized to a U-statistic. Let \(\Theta\) represent a vector of parameters for a candidate fit of \(F(x|\Theta)\), and let \(U_i(\Theta) = F(X_{i:n}|\Theta)\) be the nonexceedance probabilities of the observed values of the order statistics \(x_{i:n}\) for a sample of size \(n\). Define the differences
$$D_i(\Theta) = U_i(\Theta) - U_{i-1}(\Theta)\mbox{\ for\ } i = 1, \ldots, n+1\mbox{,}$$
with the additions to the vector \(U\) of \(U_0(\Theta) = 0\) and \(U_{n+1}(\Theta) = 1\). The objective function is
$$M_n(\Theta) = - \sum_{i=1}^{n+1} \log\, D_i(\Theta)\mbox{,}$$
where the \(\Theta\) that minimizes \(M_n\) (equivalently, maximizes \({-}M_n\)) represents the parameters fit by MPS. Some authors, to keep with the idea of a geometric mean, include a factor of \(1/(n+1)\) in the definition of \(M_n\), whereas other authors (Shao and Hahn, 1999, eq. 2.0) show
$$S_n(\Theta) = (n+1)^{-1} \sum_{i=1}^{n+1} \log[(n+1)D_i(\Theta)]\mbox{.}$$
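These definitions can be checked numerically in base R without lmomco. The sketch below uses a Normal candidate fit with illustrative parameter values only; any continuous \(F(x|\Theta)\) would serve:

```r
# Minimal sketch (base R): compute the Moran sum M_n and the Shao--Hahn
# scaled form S_n for a candidate Normal fit to a sample.
set.seed(1)
x <- sort(rnorm(20, mean = 3, sd = 2))  # ordered sample x_{i:n}
n <- length(x)
U <- pnorm(x, mean = 3, sd = 2)         # nonexceedance probabilities U_i
D <- diff(c(0, U, 1))                   # spacings D_i for i = 1, ..., n+1
Mn <- -sum(log(D))                      # objective, to be minimized
Sn <- mean(log((n + 1) * D))            # Shao and Hahn (1999) scaling
# By the AM--GM inequality, M_n >= (n+1)*log(n+1), with equality only
# when all spacings are exactly equal:
Io <- (n + 1) * log(n + 1)
```

Note the algebraic link between the two forms: \(S_n = \log(n+1) - M_n/(n+1)\), which is why care is needed when "the summation of the logarithms" is interpreted directly.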
So some care is needed in the implementation when the value of "the summation of the logarithms" is to be directly interpreted. Wong and Li (2006) provide a salient review of MPS in an investigation of maximum likelihood (MLE), MPS, and probability-weighted moments (pwm) for the GEV (quagev) and GPA (quagpa) distributions. Finally, Soukissian and Tsalis (2015) also study MPS, MLE, L-moments, and several other methods for GEV fitting.
If the support of the distribution for the initial parameters does not encompass the range of the data, infinity is returned immediately by the optimizer, further action stops, and the returned parameters are NULL. For the implementation here, if check.support is true and acceptable initial parameters are not provided through init.para, the initial parameter estimate by default will be seeded through the method of L-moments (unbiased, lmoms), which should be close to the solution, and convergence will be fairly fast if a solution is possible. If these parameters cannot be used for spinup, the implementation will then attempt various probability-weighted moments by plotting position (pwm.pp) converted to L-moments (pwm2lmom) as part of an extended attempt to find starting parameters whose distribution support encompasses the data. Finally, if that approach fails, a last-ditch effort using starting parameters from maximum likelihood computed by a default call to mle2par is made. Sometimes data are pathological and user supervision is needed but not always successful: MPS can fail for certain samples and(or) choice of distribution.
It is important to remark that the support of the fitted distribution is not checked within the optimization loop once spun up. The reasons are twofold: (1) the speed penalty of repeated calls to supdist, but more importantly, (2) PDFs in lmomco are supposed to report zero density outside the support of a distribution (see NEWS), so that \(-\log(D_i(\Theta)\rightarrow 0) \rightarrow \infty\) and hence infinity is returned for that state of the optimization loop and an alternative solution will be tried.
As a note, if all \(U\) are equally spaced, then \(|M_n(\Theta)| = I_o = (n+1)\log(n+1)\). This begins the concept towards goodness-of-fit: the \(M_n(\Theta)\) is a form of the Moran-Darling statistic and is approximately Normally distributed with
$$\mu_M \approx (n+1)[\log(n+1)+\gamma{}] - \frac{1}{2} - \frac{1}{12(n+1)}\mbox{,}$$
$$\sigma^2_M \approx (n+1)\biggl(\frac{\pi^2}{6} - 1\biggr)-\frac{1}{2} - \frac{1}{6(n+1)}\mbox{,}$$
where \(\gamma \approx 0.577216\) (Euler--Mascheroni constant, -digamma(1)) or as the definite integral
$$\gamma^\mathrm{Euler}_{\mathrm{Mascheroni}} = -\int_0^\infty \mathrm{exp}(-t) \log(t)\; \mathrm{d}{t}\mbox{.}$$
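Both forms of the constant are available in base R and agree to quadrature precision:

```r
# The Euler--Mascheroni constant two ways: the -digamma(1) identity and
# numerical evaluation of the definite integral above.
g1 <- -digamma(1)
g2 <- -integrate(function(t) exp(-t) * log(t), 0, Inf)$value
```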
An extension into small samples using the Chi-Square distribution is
$$A = C_1 + C_2\times\chi^2_n\mbox{,}$$
where
$$C_1 = \mu_M - \sqrt{\frac{\sigma^2_M\,n}{2}}\mbox{\ and\ }C_2 = \sqrt{\frac{\sigma^2_M}{2n}}\mbox{,}$$
and where \(\chi^2_n\) is the Chi-Square distribution with \(n\) degrees of freedom. A test statistic is
$$T(\Theta) = \frac{M_n(\Theta) - C_1 + \frac{p}{2}}{C_2}\mbox{,}$$
where the term \(p/2\) is a bias correction based on the number of fitted distribution parameters \(p\). The null hypothesis that the fitted distribution is correct is to be rejected if \(T(\Theta)\) exceeds a critical value from the Chi-Square distribution. The MPS method has a relation to maximum likelihood (mle2par), and the two are asymptotically equivalent.
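The goodness-of-fit computation above can be sketched in base R. The function name moran_test is illustrative only, not part of lmomco:

```r
# Hedged sketch of the Cheng--Stephens style small-sample test: given a
# minimized Moran sum Mn from a fit of p parameters to n values, compute
# C1, C2, the statistic T, and an upper-tail Chi-Square p-value.
moran_test <- function(Mn, n, p) {
  muM  <- (n + 1) * (log(n + 1) - digamma(1)) - 0.5 - 1 / (12 * (n + 1))
  varM <- (n + 1) * (pi^2 / 6 - 1)            - 0.5 - 1 / ( 6 * (n + 1))
  C1 <- muM - sqrt(varM * n / 2)              # location of the Chi-Square map
  C2 <-       sqrt(varM     / (2 * n))        # scale of the Chi-Square map
  T  <- (Mn - C1 + p / 2) / C2                # bias-corrected test statistic
  list(C1 = C1, C2 = C2, T = T,
       p.value = pchisq(T, df = n, lower.tail = FALSE))
}
gof <- moran_test(Mn = 80, n = 20, p = 3)     # illustrative inputs only
```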
Important Remark Concerning Ties---Ties in the data cause instant degeneration with MPS and must be mitigated; attention to this documentation, and even to the source code itself, is therefore required.
mps2par(x, type, init.para=NULL, ties=c("bernstein", "rounding", "density"),
delta=0, log10offset=3, get.untied=FALSE, check.support=TRUE,
moran=TRUE, silent=TRUE, null.on.not.converge=TRUE,
ptransf=function(t) return(t),
pretransf=function(t) return(t),
mle2par=TRUE, ...)
An R list is returned. This list should contain at least the following items, but some distributions such as the revgum have extra.
The type of distribution in three character (minimum) format.
The parameters of the distribution.
Attribute specifying source of the parameters.
The initial parameters. Warning to users: when inspecting returned values, make sure that one is referencing the MPS parameters in para and not those shown in init.para!
An optional list of returned content from the optimizer if not silent.
An optional list of untied-pseudo data and the number of iterations required to achieve no ties (usually unity!) if and only if there were ties in the original data, get.untied is true, and ties != "density".
An optional list of returned values that will include both diagnostics and statistics. The diagnostics are the computed \(\mu_M(n)\), \(\sigma^2_M(n)\), \(C_1\), \(C_2\), and \(n\). The statistics are the theoretically attainable minimum value \(I_o\) of \(|M_n(\Theta)|\) for equally spaced differences, the minimized value \(M_n(\Theta)\), the \(T(\Theta)\), and the corresponding p.value from the upper tail of the \(\chi^2_n\) distribution.
A vector of data values.
Three character (minimum) distribution type (for example, type="gev"; see dist.list).
Initial parameters as a vector \(\Theta\) or as an lmomco parameter “object” from, say, vec2par. If a vector is given, then internally vec2par is called with distribution equal to type.
Ties cause degeneration in the computation of \(M_n(\Theta)\):
Option bernstein triggers a smoothing of only the ties using the dat2bernqua function. Bernstein-type smoothing for ties is likely near harmless when ties are near the center of the distribution, but of course caution is advised if ties exist near the extremal values; the settings for log10offset and delta are ignored if bernstein is selected. Also, for a tie-run having an odd number of elements, the middle tied value is left as original data.
Option rounding triggers two types of adjustment: if delta > 0, then a round-off error approach inspired by Cheng and Stephens (1989, eq. 4.1) is used (see Note) and log10offset is ignored, but if delta=0, then log10offset is picked up as an order-of-magnitude offset (see Note). Use of options log10offset and delta is likely to not keep a middle value unmodified in an odd-length tie-run, in contrast to use of bernstein.
Option density triggers the substitution of the probability density \(g(x_{i:n}|\Theta)\) at the \(i\)th tie from the current fit of the distribution. Warning---It appears that inference is lost almost immediately because the magnitude of \(M_n\) loses meaning: probability densities are not on the same scale as the changes in probabilities exemplified by the \(D_i\). This author has not yet found literature discussing this, but density substitution is a recognized strategy.
The optional \(\delta\) value if \(\delta > 0\) and if ties="rounding".
The optional base-10 logarithmic offset approach to roundoff errors if delta=0 and if ties="rounding".
A logical to populate a ties element in the returned list with the untied-pseudo data as it was made available to the optimizer and the number of iterations required to exhaust all ties. An emergency break is implemented if the number of iterations appears to be blowing up.
A logical to trigger a call to supdist to compute the support of the distribution at the initial parameters. As mentioned, MPS degenerates if min(x) \(<\) the lower support or if max(x) \(>\) the upper support. Regardless of the setting of check.support, NULL will be returned in such a circumstance because this is what the optimizer would do anyway.
A logical to trigger the goodness-of-fit test described previously.
A logical to silence the try() function wrapping the optim() function and to provide a returned list of the optimization output.
A logical to trigger simple return of NULL if the optim() function returns a nonzero convergence status.
An optional parameter transformation function (see Examples) that is useful to guide the optimization run. For example, suppose the first parameter of a three-parameter distribution resides in the positive domain; then ptransf(t) = function(t) c(log(t[1]), t[2], t[3]).
An optional parameter retransformation function (see Examples) that is useful to guide the optimization run. For example, suppose the first parameter of a three-parameter distribution resides in the positive domain; then pretransf(t) = function(t) c(exp(t[1]), t[2], t[3]).
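A matched pair for the example just described can be written and checked in base R; the two functions must be exact inverses so that the optimizer works on an unconstrained scale while the objective sees valid parameters:

```r
# ptransf maps a positive first parameter to the whole real line for the
# optimizer; pretransf maps it back before the objective is evaluated.
ptransf   <- function(t) c(log(t[1]), t[2], t[3])
pretransf <- function(t) c(exp(t[1]), t[2], t[3])
theta <- c(2.5, -0.1, 0.7)  # illustrative parameter vector
```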
A logical to turn off the potential last attempt at maximum likelihood estimates of a valid seed as part of check.support=TRUE.
Additional arguments for the optim() function and other uses.
W.H. Asquith
Cheng, R.C.H., Stephens, M.A., 1989, A goodness-of-fit test using Moran's statistic with estimated parameters: Biometrika, v. 76, no. 2, pp. 385--392.
Dey, D.K., Roy, Dooti, Yan, Jun, 2016, Univariate extreme value analysis, chapter 1, in Dey, D.K., and Yan, Jun, eds., Extreme value modeling and risk analysis---Methods and applications: Boca Raton, FL, CRC Press, pp. 1--22.
Shao, Y., and Hahn, M.G., 1999, Strong consistency of the maximum product of spacings estimates with applications in nonparametrics and in estimation of unimodal densities: Annals of the Institute of Statistical Mathematics, v. 51, no. 1, pp. 31--49.
Soukissian, T.H., and Tsalis, C., 2015, The effect of the generalized extreme value distribution parameter estimation methods in extreme wind speed prediction: Natural Hazards, v. 78, pp. 1777--1809.
Wong, T.S.T., and Li, W.K., 2006, A note on the estimation of extreme value distributions using maximum product of spacings: IMS Lecture Notes, v. 52, pp. 272--283.
lmom2par, mle2par, tlmr2par