lmWinsor(formula, data, lower=NULL, upper=NULL, trim=0,
quantileType=7, subset, weights=NULL, na.action,
model = TRUE, x = FALSE, y = FALSE, qr = TRUE,
singular.ok = TRUE, contrasts = NULL, offset=NULL,
method=c('QP', 'clip'), eps=sqrt(.Machine$double.eps),
trace=ceiling(sqrt(nrow(data))), ...)
If a numeric vector, it must have names matching columns of 'data' giving limits on the ranges of predictors and predictions: If present, values below 'lower' will be increased to 'lower', and v
quantile
.
An object of class c('lmWinsor', 'lm') with 'lower', 'upper', 'out', 'message', and 'elapsed.time' components in addition to the standard 'lm' components. The 'out' component is a logical matrix identifying which predictions from the initial 'lm' fit were below lower[yName] and which were above upper[yName]. If method = 'QP' and the initial fit produces predictions outside the limits, the returned object also includes a component 'coefIter' containing, for each iteration, the model coefficients, the index number of the observation in 'data' transferred from the objective function to the constraints on that iteration, the sum of squared residuals before and after clipping the predictions, and the number of predictions in 5 categories: below the lower limit, at the lower limit, between the limits, at the upper limit, and above the upper limit. The 'elapsed.time' component gives the run time in seconds.
The options for 'message' are as follows:
2. Do 'lower' and 'upper' contain limits for all numeric columns of 'data'? If not, create limits to fill in any that are missing.
3. clipData = data with all xNames clipped to (lower, upper).
4. fit0 <- lm(formula, clipData, subset = subset, weights = weights, na.action = na.action, method = method, x=x, y=y, qr=qr, singular.ok=singular.ok, contrasts=contrasts, offset=offset, ...)
5. out = a logical matrix with two columns, indicating any of predict(fit0) outside (lower, upper)[yName].
6. Add components lower and upper to fit0 and convert it to class c('lmWinsor', 'lm').
7. If((method == 'clip') || !any(out)), return(fit0).
8. Else, use quadratic programming (solve.QP) to minimize the 'Winsorized sum of squares of residuals' as follows:
8.1. First find the prediction farthest outside (lower, upper)[yName]. Set temporary limits at the next closest point inside that point (or at the limit if that's closer).
8.2. Use QP to minimize the sum of squares of residuals among all points not outside the temporary limits while keeping the prediction for the exceptional point away from the interior of (lower, upper)[yName].
8.3. Are the predictions for all points unconstrained in QP inside (lower, upper)[yName]? If yes, quit.
8.4. Otherwise, among the points still unconstrained, find the prediction farthest outside (lower, upper)[yName]. Adjust the temporary limits to the next closest point inside that point (or at the limit if that's closer).
8.5. Use QP as in 8.2 but with multiple exceptional points, then return to step 8.3.
9. Modify the components of fit0 as appropriate and return the result.
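Steps 3 to 7 can be sketched in base R as follows. This is only an illustration of the clipping logic, not the fda implementation: the value trim = 0.25, the use of quantile() with type 7 to build the limits, and the variable names dat, pred, and needsQP are assumptions made for the example.

```r
# Illustrative sketch of steps 3-7; NOT the lmWinsor source code.
dat  <- datasets::anscombe[, c("x1", "y1")]
trim <- 0.25   # assumed trimming proportion for this example

# Assumed limits: the trim and (1 - trim) quantiles of each numeric column.
lower <- sapply(dat, quantile, probs = trim,     type = 7, names = FALSE)
upper <- sapply(dat, quantile, probs = 1 - trim, type = 7, names = FALSE)

# Step 3: clip the predictor (not the response) to (lower, upper).
clipData    <- dat
clipData$x1 <- pmin(pmax(dat$x1, lower[["x1"]]), upper[["x1"]])

# Step 4: ordinary least squares fit on the clipped data.
fit0 <- lm(y1 ~ x1, data = clipData)

# Step 5: two-column logical matrix flagging predictions outside
# (lower, upper) for the response.
pred <- predict(fit0)
out  <- cbind(below = pred < lower[["y1"]],
              above = pred > upper[["y1"]])

# Step 7: if no prediction falls outside the limits (or method = 'clip'),
# fit0 would be returned as is; otherwise the QP iteration of step 8 applies.
needsQP <- any(out)
```

Inspecting colSums(out) shows how many predictions fall below or above the response limits, which is the information that decides whether the quadratic programming iteration is needed.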
predict.lmWinsor
lmeWinsor
lm
quantile
solve.QP
# example from 'anscombe'
lm.1 <- lmWinsor(y1~x1, data=anscombe)
# no leverage to estimate the slope
lm.1.5 <- lmWinsor(y1~x1, data=anscombe, trim=0.5)
# test nonlinear optimization
lm.1.25 <- lmWinsor(y1~x1, data=anscombe, trim=0.25)
# list example
lm.1. <- lmWinsor(y1~x1, data=anscombe, trim=c(0, 0.25, .4, .5))