Fit a regression to the good points in the dataset, thereby
achieving a regression estimator with a high breakdown point.
lmsreg and ltsreg are compatibility wrappers.
lqs(x, …)
# S3 method for formula
lqs(formula, data, …,
method = c("lts", "lqs", "lms", "S", "model.frame"),
subset, na.action, model = TRUE,
x.ret = FALSE, y.ret = FALSE, contrasts = NULL)
# S3 method for default
lqs(x, y, intercept = TRUE, method = c("lts", "lqs", "lms", "S"),
quantile, control = lqs.control(…), k0 = 1.548, seed, …)
lmsreg(…)
ltsreg(…)
formula: a formula of the form y ~ x1 + x2 + …
data: data frame from which variables specified in formula are preferentially to be taken.
subset: an index vector specifying the cases to be used in fitting. (NOTE: If given, this argument must be named exactly.)
na.action: function to specify the action to be taken if NAs are found. The default action is for the procedure to fail. Alternatives include na.omit and na.exclude, which lead to omission of cases with missing values on any required variable. (NOTE: If given, this argument must be named exactly.)
model, x.ret, y.ret: logical. If TRUE the model frame, the model matrix and the response are returned, respectively.
contrasts: an optional list. See the contrasts.arg of model.matrix.default.
x: a matrix or data frame containing the explanatory variables.
y: the response: a vector of length the number of rows of x.
intercept: should the model include an intercept?
method: the method to be used. "model.frame" returns the model frame; for the others see the Details section. Using lmsreg or ltsreg forces "lms" and "lts" respectively.
quantile: the quantile to be used: see Details. This is overridden if method = "lms".
control: additional control items: see Details.
k0: the cutoff / tuning constant used for the \(\chi()\) and \(\psi()\) functions when method = "S", currently corresponding to Tukey's ‘biweight’.
seed: the seed to be used for random sampling: see .Random.seed. The current value of .Random.seed will be preserved if it is set.
…: arguments to be passed to lqs.default or lqs.control; see control above and Details.
An object of class "lqs". This is a list with components
crit: the value of the criterion for the best solution found, in the case of method == "S" before IWLS refinement.
sing: character. A message about the number of samples which resulted in singular fits.
coefficients: of the fitted linear model.
bestone: the indices of those points fitted by the best sample found (prior to adjustment of the intercept, if requested).
fitted.values: the fitted values.
residuals: the residuals.
scale: estimate(s) of the scale of the error. The first is based on the fit criterion. The second (not present for method == "S") is based on the variance of those residuals whose absolute value is less than 2.5 times the initial estimate.
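As a rough illustration of the returned object, the following sketch fits the stackloss model and looks at several of its components. It assumes that lqs() is loaded from the MASS package and that the list elements are named crit, coefficients, bestone, residuals and scale; the seed value is arbitrary.

```r
library(MASS)  # assumption: lqs() is provided by the MASS package

set.seed(123)  # arbitrary seed, only to make the resampling repeatable
fit <- lqs(stack.loss ~ ., data = stackloss)

class(fit)             # the fitted object's class
fit$coefficients       # intercept plus one coefficient per regressor
fit$crit               # criterion value for the best solution found
fit$scale              # scale estimate(s) of the error
fit$bestone            # indices of the points in the best sample
range(fit$residuals)   # spread of the residuals
```

Printing str(fit) is a quick way to check which components are actually present for a given method.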
Suppose there are n data points and p regressors, including any intercept.
The first three methods minimize some function of the sorted squared residuals. For methods "lqs" and "lms" it is the quantile-th smallest squared residual, and for "lts" it is the sum of the quantile smallest squared residuals. "lqs" and "lms" differ in the defaults for quantile, which are floor((n+p+1)/2) and floor((n+1)/2) respectively. For "lts" the default is floor(n/2) + floor((p+1)/2).
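The default quantile formulas above can be evaluated directly. The sketch below does so for the stackloss data used in the examples; default_quantile is a hypothetical helper written for this illustration, not a function from the package.

```r
# Hypothetical helper: evaluates the documented default 'quantile'
# formulas for n data points and p regressors (including any intercept).
default_quantile <- function(n, p) {
  c(lqs = floor((n + p + 1) / 2),
    lms = floor((n + 1) / 2),
    lts = floor(n / 2) + floor((p + 1) / 2))
}

n <- nrow(stackloss)  # 21 observations
p <- 3 + 1            # three predictors plus the intercept
q <- default_quantile(n, p)
q
# lqs lms lts
#  13  11  12
```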
The "S" estimation method solves for the scale s such that the average of a function chi of the residuals divided by s is equal to a given constant.
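To make the scale equation concrete, here is a minimal sketch that solves it numerically for a fixed set of residuals. It is not the package's implementation: the normalisation of the biweight chi function (so that it reaches 1 at k0), the target constant b = 0.5, the plain mean over all residuals, and the helper names chi_biweight and s_scale are all assumptions made for illustration.

```r
# Tukey biweight chi, normalised so that chi(u) = 1 for |u| >= k0.
# Hypothetical helper with an assumed normalisation.
chi_biweight <- function(u, k0 = 1.548) {
  v <- pmin(abs(u) / k0, 1)
  v^2 * (3 - 3 * v^2 + v^4)
}

# Find s such that mean(chi(r / s)) equals b (b = 0.5 is assumed).
# The mean decreases from 1 towards 0 as s grows, so a root is bracketed.
s_scale <- function(r, k0 = 1.548, b = 0.5) {
  f <- function(s) mean(chi_biweight(r / s, k0)) - b
  hi <- 100 * max(abs(r))
  uniroot(f, lower = 1e-8 * hi, upper = hi, tol = 1e-10)$root
}

set.seed(1)
r <- rnorm(200)             # stand-in residuals for the illustration
s <- s_scale(r)
mean(chi_biweight(r / s))   # close to the target constant by construction
```

In the actual estimator the residuals themselves depend on the candidate coefficients, so this one-dimensional solve would only be one step inside a search over fits.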
The control argument is a list with components
psamp: the size of each sample. Defaults to p.
nsamp: the number of samples or "best" (the default) or "exact" or "sample". If "sample" the number chosen is min(5*p, 3000), taken from Rousseeuw and Hubert (1997). If "best" exhaustive enumeration is done up to 5000 samples; if "exact" exhaustive enumeration will be attempted however many samples are needed.
adjust: should the intercept be optimized for each sample? Defaults to TRUE.
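To get a feel for the 5000-sample limit, one can count how many distinct samples an exhaustive search would need. The sketch below assumes that exhaustive enumeration means visiting every subset of psamp points out of n, which is an interpretation rather than something stated above.

```r
n <- nrow(stackloss)   # 21 observations
psamp <- 3 + 1         # default psamp = p: three predictors plus intercept

# Number of distinct samples of size psamp (assumed meaning of "exhaustive")
n_exhaustive <- choose(n, psamp)
n_exhaustive           # 5985
n_exhaustive > 5000    # TRUE
```

Under that assumption the stackloss fit sits just above the limit, which would explain why one of the examples requests nsamp = "exact" explicitly.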
P. J. Rousseeuw and A. M. Leroy (1987) Robust Regression and Outlier Detection. Wiley.
A. Marazzi (1993) Algorithms, Routines and S Functions for Robust Statistics. Wadsworth and Brooks/Cole.
P. Rousseeuw and M. Hubert (1997) Recent developments in PROGRESS. In L1-Statistical Procedures and Related Topics, ed. Y. Dodge, IMS Lecture Notes volume 31, pp. 201–214.
library(MASS) # lqs() is provided by the MASS package
set.seed(123) # make reproducible
lqs(stack.loss ~ ., data = stackloss)
lqs(stack.loss ~ ., data = stackloss, method = "S", nsamp = "exact")