gettau: Estimate of the scale parameter tau

Description

An estimate of the scale parameter tau may be used for the standard errors of the coefficients in rank-based regression.

Usage

gettau(ehat, p, scores = Rfit::wscores, delta = 0.8, hparm = 2, ...)
gettauF0(ehat, p, scores = Rfit::wscores, delta = 0.8, hparm = 2, ...)

Value

Length one numeric object.

Arguments

ehat: vector of length n: full model residuals
p: scalar: number of regression coefficients (excluding the intercept); see Details
scores: object of class scores, defaults to Wilcoxon scores
delta: confidence level used to determine the bandwidth of the estimator; see Details
hparm: used in Huber's degrees of freedom correction; see Details
...: additional arguments; currently unused

Author

Joseph McKean, John Kloke

Details

For rank-based analyses of linear models, the estimator \(\hat{\tau}\) of the scale parameter \(\tau\) plays a standardizing role in the standard errors (SE) of the rank-based estimators of the regression coefficients and in the denominator of Wald-type and the drop-in-dispersion test statistics of linear hypotheses. rfit currently implements the KSM (Koul, Sievers, and McKean 1987) estimator of tau.

The functions gettau and gettauF0 both compute the KSM estimate and may be called from rfit and used for inference. The default is the faster FORTRAN version, gettauF0, selected via the option TAU='F0'. The R version, gettau, may be much slower, especially when sample sizes are large; it may be called from rfit using the option TAU='R'.
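The two options described above can be compared with a short sketch like the following. It assumes the Rfit package is installed; the simulated data, the seed, and the object names (fit_f0, fit_r) are illustrative only, and the direct call to gettau simply follows the signature shown under Usage.

```r
library(Rfit)

# Simulated data for illustration only
set.seed(1)
n <- 60; p <- 3
x <- matrix(rnorm(n * p), ncol = p)
y <- rnorm(n)

fit_f0 <- rfit(y ~ x)             # default: FORTRAN version (TAU = 'F0')
fit_r  <- rfit(y ~ x, TAU = "R")  # R version

# Estimates of tau from the two implementations
c(F0 = fit_f0$tauhat, R = fit_r$tauhat)

# Direct call on the residuals of a fit; p excludes the intercept
gettau(as.numeric(fit_f0$residuals), p)
```

For large n the TAU = "R" fit can take noticeably longer, so the default is usually the practical choice.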

The KSM estimator tauhat is a density type estimator that has the bandwidth given by \(t_\delta/\sqrt{n}\), where \(t_\delta\) is the \(\delta\)-th quantile of the cdf \(H(y)\) given in expression (3.7.2) of Hettmansperger and McKean (2011), with the corresponding estimator \(\hat{H}\) given in expression (3.7.7) of Hettmansperger and McKean (2011).

Based on simulation studies, in most situations where n/p >= 6 the default delta = 0.80 provides a valid rank-based analysis (McKean and Sheather, 1991). For situations with n/p < 6, caution is needed, as the KSM estimate is sensitive to the choice of bandwidth. McKean and Sheather (1991) recommend using a value of 0.95 for delta in such situations.
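The rule of thumb above could be encoded as a small helper. Note that choose_delta is a hypothetical function written for illustration; it is not part of Rfit, and it only restates the n/p >= 6 guideline in base R.

```r
# Hypothetical helper (not part of Rfit): pick delta from the ratio n/p,
# following the guideline attributed to McKean and Sheather (1991).
choose_delta <- function(n, p) {
  stopifnot(n > 0, p > 0)
  if (n / p >= 6) 0.80 else 0.95
}

choose_delta(n = 120, p = 6)  # ratio 20, so the default 0.80
choose_delta(n = 12,  p = 6)  # ratio 2, so the larger value 0.95
```

The returned value could then be passed on as the delta argument, for example rfit(y ~ x, delta = choose_delta(n, p)).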

To correct for heavy-tailed random errors, Huber (1973) proposed a degrees of freedom correction for the M-estimate scale parameter. The correction is given by \(K = 1 + \frac{p(1-h_c)}{n h_c}\), where \(h_c\) is the proportion of standardized residuals whose absolute value is less than the parameter hparm. This correction \(K\) is applied to tauhat as a multiplicative factor. The default value of hparm is 2.
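A minimal base R sketch of this correction is given below. It is not the Rfit source: huber_correction is a hypothetical name, and the standardization of the residuals by their median and MAD is an assumption made for the illustration.

```r
# Illustrative sketch only; not taken from the Rfit source.
# Assumption: residuals are standardized by their median and MAD.
huber_correction <- function(ehat, p, hparm = 2) {
  n <- length(ehat)
  z <- (ehat - median(ehat)) / mad(ehat)  # standardized residuals
  h_c <- mean(abs(z) < hparm)             # proportion with |z| < hparm
  1 + (p * (1 - h_c)) / (n * h_c)         # multiplicative factor K
}

set.seed(1)
ehat <- rnorm(120)
K <- huber_correction(ehat, p = 6)
K
```

The factor equals 1 when every standardized residual falls inside the cutoff and grows as more residuals fall outside it. The sketch does not guard against h_c = 0, which would need separate handling.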

The usual degrees of freedom correction, \(\sqrt{n/(n-p)}\), is also used as a multiplicative factor to tauhat.
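As a worked illustration of this factor, take the two sample sizes used in the Examples section (n = 12 and n = 120, each with p = 6); the arithmetic below is only meant to show how quickly the correction shrinks toward 1 as n/p grows.

```latex
\sqrt{\frac{n}{n-p}} = \sqrt{\frac{12}{12-6}} = \sqrt{2} \approx 1.414
\qquad\text{and}\qquad
\sqrt{\frac{n}{n-p}} = \sqrt{\frac{120}{120-6}} = \sqrt{\frac{120}{114}} \approx 1.026 .
```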

References

Hettmansperger, T.P. and McKean, J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Huber, P.J. (1973), Robust regression: Asymptotics, conjectures and Monte Carlo, Annals of Statistics, 1, 799--821.

Koul, H.L., Sievers, G.L., and McKean, J.W. (1987), An estimator of the scale parameter for the rank analysis of linear models under general score functions, Scandinavian Journal of Statistics, 14, 131--141.

McKean, J.W. and Sheather, S.J. (1991), Small Sample Properties of Robust Analyses of Linear Models Based on R-Estimates: A Survey, in Directions in Robust Statistics and Diagnostics, Part II, Editors: W. Stahel and S. Weisberg, Springer-Verlag: New York, 1--19.

Examples

# For a standard normal distribution and Wilcoxon scores, the parameter tau has the value 1.023327 (sqrt(pi/3)).
library(Rfit)
set.seed(283643659)
n <- 12; p <- 6; y <- rnorm(n); x <- matrix(rnorm(n*p),ncol=p)
tau1 <- rfit(y~x)$tauhat; tau2 <- rfit(y~x,delta=0.95)$tauhat
c(tau1,tau2) # 0.5516708 1.0138415
n <- 120; p <- 6; y <- rnorm(n); x <- matrix(rnorm(n*p),ncol=p)
tau3 <- rfit(y~x)$tauhat; tau4 <- rfit(y~x,delta=0.95)$tauhat
c(tau3,tau4) # 1.053974 1.041783
