lmrob.S: S-regression estimators

Description

Computes an S-estimator for linear regression, using the “fast S” algorithm.

Usage

lmrob.S(x, y, control,
        trace.lev = control$trace.lev,
        only.scale = FALSE, mf = NULL)

Value

By default (when only.scale is false), a list with components

coefficients: numeric vector (length \(p\)) of S-regression coefficient estimates.
scale: the S-scale residual estimate

fitted.values: numeric vector (length \(n\)) of the fitted values.
residuals: numeric vector (length \(n\)) of the residuals.
rweights: numeric vector (length \(n\)) of the robustness weights.
k.iter: (maximal) number of refinement iterations used.
converged: logical indicating whether all refinement iterations converged.
control: the same list as the control argument.

If only.scale is true, the computed scale (a number) is returned.

Arguments

x: design matrix (\(n \times p\))
y: numeric vector of responses (or residuals for only.scale=TRUE).
control: list as returned by lmrob.control
trace.lev: integer indicating how much of the algorithm's progress should be traced; larger values produce more output, and the default trace.lev = 0 does no tracing.
only.scale: logical indicating if only the scale of y should be computed. In this case, y will typically contain residuals.

mf: unused and deprecated.

Author

Matias Salibian-Barrera and Manuel Koller; Martin Maechler for minor new options and more documentation.

Details

This function is used by lmrob.fit and is typically not meant to be used on its own (because an S-estimator has too low efficiency ‘on its own’).
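As a rough illustration of that role, the sketch below fits a default MM-estimate with lmrob() and then looks at the initial S-estimate kept inside the fit. The simulated data and object names are made up for this example, and it assumes the fitted object carries the S-step result in its init.S component, as described in the documentation of lmrob.

```r
library(robustbase)

## Hypothetical simulated data: a straight line plus three gross outliers
set.seed(33)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50) / 5
y[1:3] <- 30

## Default MM-estimate; the S-estimate serves as its starting point
fit <- lmrob(y ~ x)

## Initial S-estimate (assumed to be stored as 'init.S')
fit$init.S$coefficients
fit$init.S$scale

## Final MM coefficients, for comparison
coef(fit)
```

The S-step supplies robust starting coefficients and a robust scale; the subsequent M-step is what raises the efficiency of the final estimate.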

By default, the subsampling algorithm uses a customized LU decomposition which ensures a nonsingular subsample (if this is at all possible). This makes the Fast-S algorithm feasible for categorical and mixed continuous-categorical data as well.

One can revert to the old subsampling scheme by setting the parameter subsampling in control to "simple".
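A minimal sketch of the two subsampling settings follows. The data, the factor g and all object names are hypothetical and chosen only for illustration; the default scheme is shown on a mixed design, while the "simple" scheme is shown on a purely continuous design, where singular subsamples are not a concern.

```r
library(robustbase)

set.seed(101)
n  <- 60
g  <- factor(rep(c("a", "b", "c"), each = n / 3))  # hypothetical 3-level factor
x1 <- rnorm(n)

## Mixed continuous-categorical design: intercept, 2 dummies, 1 continuous column
X.mixed <- model.matrix(~ g + x1)
y.mixed <- drop(X.mixed %*% c(1, 2, -1, 3)) + rnorm(n) / 10

## Default control, i.e. the nonsingular subsampling scheme
S.ns <- lmrob.S(X.mixed, y.mixed, control = lmrob.control())
S.ns$coefficients

## Purely continuous design, fitted with the old "simple" scheme
X.cont <- cbind(1, x1, x2 = rnorm(n))
y.cont <- drop(X.cont %*% c(1, 2, 3)) + rnorm(n) / 10
S.simple <- lmrob.S(X.cont, y.cont,
                    control = lmrob.control(subsampling = "simple"))
S.simple$coefficients
```

Because the subsamples are drawn at random, results from the two schemes will generally differ slightly even on the same data.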

Examples


set.seed(33)
x1 <- sort(rnorm(30)); x2 <- sort(rnorm(30)); x3 <- sort(rnorm(30))
X. <- cbind(x1, x2, x3)
y <-  10 + X. %*% (10*(2:4)) + rnorm(30)/10
y[1] <- 500   # a moderate outlier
X.[2,1] <- 20 # an X outlier
X1  <- cbind(1, X.)

(m.lm <- lm(y ~ X.))
set.seed(12)
m.lmS <- lmrob.S(x=X1, y=y,
                 control = lmrob.control(nRes = 20), trace.lev=1)
m.lmS[c("coefficients","scale")]
## the true coefficients are 10 * (1:4), the true error scale is 1/10
all.equal(unname(m.lmS$coefficients), 10 * (1:4), tolerance = 0.005)
stopifnot(all.equal(unname(m.lmS$coefficients), 10 * (1:4), tolerance = 0.005),
          all.equal(m.lmS$scale, 1/10, tolerance = 0.09))

## only.scale = TRUE:  Compute the S scale, given residuals;
s.lmS <- lmrob.S(X1, y=residuals(m.lmS), only.scale = TRUE,
                 control = lmrob.control(trace.lev = 3))
all.equal(s.lmS, m.lmS$scale) # close: 1.89e-6 [64b Lnx]
