pwmRC: Sample Probability-Weighted Moments for Right-Tail Censoring

Description

Compute the sample Probability-Weighted Moments (PWMs) for right-tail censored data set---that is a data set censored from above. The censoring threshold is denoted as $T$. The data possess $m$ values that are observed (noncensored, $< T$) out of a total of $n$ samples. The ratio of $m$ to $n$ is defined as $\zeta = m/n$, which will play an important role in parameter estimation. The $\zeta$ is interpreted as the probability $\mathrm{Pr}\lbrace \rbrace$ that $x$ is less than the quantile at $\zeta$ nonexceedance probability: ($\mathrm{Pr}\lbrace x < X(\zeta) \rbrace$). Two types of PWMs are computed for right-tail censored situations. The A-type PWMs and B-type PWMs. The A-type PWMs are defined by

$$\beta^A_r = m^{-1}\sum^m_{j=1} {j-1 \choose r} x_{[j:n]}\mbox{,}$$

which are the PWMs of the uncensored sample of $m$ observed values. The B-type PWMs are computed from the complete sample, in which the $n-m$ censored values are replaced by the censoring threshold $T$. The B-type PWMs are defined by

$$\beta^B_r = n^{-1} \biggl( \sum^m_{j=1} {j-1 \choose r} x_{[j:n]} + \sum^n_{j=m+1} {j-1 \choose r} T \biggr) \mbox{.}$$

The two previous expressions are used in the function. These PWMs are readily converted to L-moments by the usual methods (pwm2lmom). When there are more than a few censored values, the PWMs are readily computed by computing $\beta^A_r$ and using the expression

$$\beta^B_r = Z\beta^A_r + \frac{1-Z}{r+1}T\mbox{,}$$

where

$$Z = \frac{m}{n}\frac{{m-1 \choose r}}{{n-1 \choose r}}\mbox{.}$$

The two expressions above are consulted when the checkbetas=TRUE argument is present. Both sequences of B-type are cated to the terminal. This provides a check on the implementation of the algorithm. The functions Apwm2BpwmRC and Bpwm2ApwmRC can be used to switch back and forth between the two PWM types given fitted parameters for a distribution in the lmomco package that supports right-tail censoring. Finally, the RC in the function name is to denote Right-tail Censoring. pwmRC(x,threshold=NULL,nmom=5,sort=TRUE,checkbetas=FALSE) x{A vector of data values.} threshold{The right-tail censoring (upper) threshold.} nmom{Number of PWMs to return.} sort{Does the data need sorting? Note that convention is the have a $\beta_0$, but this is placed in the first index i=1 of the betas vector.} checkbetas{A cross relation between $\beta^A_r$ and $\beta^B_r$ exists---display the results of the secondary computation of the $\beta^B_r$. The two displayed vectors should be numerically equal.} An R list is returned.

Abetas{The A-type PWMs. These should be same as pwm() returns if there is no censoring. Note that convention is the have a $\beta_0$, but this is placed in the first index i=1 of the betas vector.} Bbetas{The B-type PWMs. These should be NA if there is no censoring. Note that convention is the have a $\beta_0$, but this is placed in the first index i=1 of the betas vector.} source{Source of the PWMs: pwmRC} threshold{The upper censoring threshold.} zeta{The right censoring fraction: numabovethreshold/samplesize} numabovethreshold{Number of data points equal to or above the threshold.} observedsize{Number of real data points in the sample (below the threshold).} samplesize{Number of actual sample values.}

There is some ambiguity if the threshold also numerically equals valid data in the data set. In the data for the examples below, which are taken from elsewhere, there are real observations at the censoring level. One can see how a hack is made to marginally decrease or increase the data or the threshold for the computations. This is needed because the code uses sapply(x,function(v) { if(v >= T) return(T); return(v)}) to reset the data vector x. By operating on the data in this fashion one can toy with various levels of the threshold for experimental purposes; this seemed a more natural way for general implementation. The code sets $n$=length(x) and $m$=n - length(x[x == T]), which also seems natural. The $\beta^A_r$ are computed by dispatching to pwm.

Greenwood, J.A., Landwehr, J.M., Matalas, N.C., and Wallis, J.R., 1979, Probability weighted moments---Definition and relation to parameters of several distributions expressable in inverse form: Water Resources Research, vol. 15, p. 1,049--1,054.

Hosking, J.R.M., 1990, L-moments---Analysis and estimation of distributions using linear combinations of order statistics: Journal of the Royal Statistical Society, Series B, vol. 52, p. 105--124.

Hosking, J.R.M., 1995, The use of L-moments in the analysis of censored data, in Recent Advances in Life-Testing and Reliability, edited by N. Balakrishnan, chapter 29, CRC Press, Boca Raton, Fla., pp. 546--560. [object Object] lmoms, pwm2lmom, pwm # Data listed in Hosking (1995, table 29.2, p. 551) H <- c(3,4,5,6,6,7,8,8,9,9,9,10,10,11,11,11,13,13,13,13,13, 17,19,19,25,29,33,42,42,51.9999,52,52,52) # 51.9999 was really 52, a real (noncensored) data point. z <- pwmRC(H,threshold=52,checkbetas=TRUE) str(z) # Hosking(1995) reports that A-type L-moments for this sample are # lamA1=15.7 and lamAL-CV=.389, and lamAL-skew=.393 pwm2lmom(z$Abetas) # My version of R reports 15.666, 0.3959, and 0.4030

# See p. 553 of Hosking (1995) # Data listed in Hosking (1995, table 29.3, p. 553) D <- c(-2.982, -2.849, -2.546, -2.350, -1.983, -1.492, -1.443, -1.394, -1.386, -1.269, -1.195, -1.174, -0.854, -0.620, -0.576, -0.548, -0.247, -0.195, -0.056, -0.013, 0.006, 0.033, 0.037, 0.046, 0.084, 0.221, 0.245, 0.296) D <- c(D,rep(.2960001,40-28)) # 28 values, but Hosking mentions 40 values in total z <- pwmRC(D,.2960001) # Hosking reports B-type L-moments for this sample are # lamB1 = -.516 and lamB2 = 0.523 pwm2lmom(z$Bbetas) # My version of R reports -.5162 and 0.5218 univar distribution

Description

Arguments