Performs a robust linear regression with high breakdown point and high efficiency.
lmRob(formula, data, weights, subset, na.action,
model = TRUE, x = FALSE, y = FALSE, contrasts = NULL,
nrep = NULL, control = lmRob.control(...), ...)
a list describing the regression. Note that the solution returned here is an approximation to the true solution based on a randomized algorithm (except when "Exhaustive" resampling is chosen). Hence you will get (slightly) different answers if you make the same call with a different seed. See lmRob.control for how to set the seed, and see lmRob.object for a complete description of the object returned.
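The effect of the seed can be checked with a short sketch. It assumes the robust package is installed and that lmRob.control accepts a seed argument (the value 1313 is only an illustrative choice); two fits made with the same control list should then agree.

```r
library(robust)
data(stack.dat)

# Assumption: lmRob.control() takes a `seed` argument that drives the
# random resampling; 1313 is an arbitrary illustrative value.
ctrl <- lmRob.control(seed = 1313)

fit1 <- lmRob(Loss ~ ., data = stack.dat, control = ctrl)
fit2 <- lmRob(Loss ~ ., data = stack.dat, control = ctrl)

# With an identical seed the two coefficient vectors should match.
same <- isTRUE(all.equal(fit1$coefficients, fit2$coefficients))
print(same)
```

Changing the seed in one of the two control lists is the way to see how much the approximation varies on a given data set.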
a formula object, with the response on the left of a ~ operator, and the terms, separated by + operators, on the right.
a data.frame in which to interpret the variables named in the formula, or in the subset and the weights arguments. If this is missing, then the variables in the formula should be on the search list. This may also be a single number to handle some special cases - see below for details.
vector of observation weights; if supplied, the algorithm fits to minimize the sum of a function of the square root of the weights multiplied into the residuals. The length of weights must be the same as the number of observations. The weights must be nonnegative, and it is strongly recommended that they be strictly positive: a zero weight is ambiguous, whereas the subset argument excludes an observation explicitly.
expression saying which subset of the rows of the data should be used in the fit. This can be a logical vector (which is replicated to have length equal to the number of observations), or a numeric vector indicating which observation numbers are to be included, or a character vector of the row names to be included. All observations are included by default.
a function to filter missing data. This is applied to the model.frame after any subset argument has been used. The default (with na.fail) is to signal an error if any missing values are found. A possible alternative is na.exclude, which deletes observations that contain one or more missing values.
a logical flag: if TRUE, the model frame is returned in component model.
a logical flag: if TRUE, the model matrix is returned in component x.
a logical flag: if TRUE, the response is returned in component y.
a list giving contrasts for some or all of the factors appearing in the model formula. The elements of the list should have the same name as the variable and should be either a contrast matrix (specifically, any full-rank matrix with as many rows as there are levels in the factor), or else a function to compute such a matrix given the number of levels.
the number of random subsamples to be drawn. If "Exhaustive" resampling is being used, the value of nrep is ignored.
a list of control parameters to be used in the numerical algorithms. See lmRob.control for the possible control parameters and their default settings.
additional arguments are passed to the control functions.
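As a rough illustration of how several of these arguments combine, the sketch below fits the model to the first 20 rows only and asks for the model matrix and response to be returned. It assumes the robust package is installed; the choice of rows is arbitrary, and na.exclude is passed only to show the argument, since stack.dat is not expected to contain missing values.

```r
library(robust)
data(stack.dat)

# Keep the first 20 observations (an arbitrary illustrative choice),
# drop rows with missing values if there were any, and return the
# model matrix and the response alongside the fit.
fit <- lmRob(Loss ~ ., data = stack.dat,
             subset = 1:20,
             na.action = na.exclude,
             x = TRUE, y = TRUE)

# The returned components reflect the subset, not the full data frame.
n.used <- nrow(fit$x)
print(n.used)
print(length(fit$y))
```

Excluding an observation through subset, rather than giving it a zero weight, keeps the number of observations used in the fit unambiguous.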
By default, the lmRob function automatically chooses an appropriate algorithm to compute a final robust estimate with high breakdown point and high efficiency. The final robust estimate is computed based on an initial estimate with high breakdown point. For the initial estimation, the alternate M-S estimate is used if there are any factor variables in the predictor matrix, and an S-estimate is used otherwise. To compute the S-estimate, a random resampling or a fast procedure is used unless the data set is small, in which case exhaustive resampling is employed. See lmRob.control for how to choose between the different algorithms.
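The automatic choice can be overridden through the control list. The sketch below is a minimal example under two assumptions that should be checked against the lmRob.control help page: that the control function has an initial.alg argument, and that "Random" is one of its accepted values. The nrep value of 500 is arbitrary.

```r
library(robust)
data(stack.dat)

# Assumption: `initial.alg = "Random"` requests random resampling for
# the initial S-estimate instead of the automatic choice.
ctrl <- lmRob.control(initial.alg = "Random", seed = 1313)

# nrep sets the number of random subsamples; it is ignored when
# "Exhaustive" resampling is used.
fit.random <- lmRob(Loss ~ ., data = stack.dat,
                    nrep = 500, control = ctrl)

print(fit.random$coefficients)
```

Since stack.dat has no factor variables, the initial estimate here is an S-estimate rather than the alternate M-S estimate.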
Gervini, D., and Yohai, V. J. (1999). A class of robust and fully efficient regression estimates; mimeo, Universidad de Buenos Aires.
Marazzi, A. (1993). Algorithms, routines, and S functions for robust statistics. Wadsworth & Brooks/Cole, Pacific Grove, CA.
Maronna, R. A., and Yohai, V. J. (2000). Robust regression with both continuous and categorical predictors. Journal of Statistical Planning and Inference 89, 197--214.
Pena, D., and Yohai, V. (1999). A Fast Procedure for Outlier Diagnostics in Large Regression Problems. Journal of the American Statistical Association 94, 434--445.
Yohai, V. (1988). High breakdown-point and high efficiency estimates for regression. Annals of Statistics 15, 642--665.
Yohai, V., Stahel, W. A., and Zamar, R. H. (1991). A procedure for robust estimation and inference in linear regression; in Stahel, W. A. and Weisberg, S. W., Eds., Directions in robust statistics and diagnostics, Part II. Springer-Verlag.
lmRob.control, lmRob.object.
data(stack.dat)
stack.rob <- lmRob(Loss ~ ., data = stack.dat)
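A possible follow-up, sketched here under the assumption that the robust package is installed, is to inspect the fit and set its coefficients beside those of an ordinary least-squares fit from lm. The object names are illustrative.

```r
library(robust)
data(stack.dat)

# Robust fit and a least-squares fit of the same model.
stack.rob <- lmRob(Loss ~ ., data = stack.dat)
stack.ls  <- lm(Loss ~ ., data = stack.dat)

# Side-by-side coefficients; large differences suggest that some
# observations are influencing the least-squares fit.
comparison <- cbind(robust = stack.rob$coefficients,
                    ls     = coef(stack.ls))
print(comparison)

# Full summary of the robust fit.
print(summary(stack.rob))
```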