Plot the residuals of a regression model.
Please see the plotres vignette (also available here).
plotres(object = stop("no 'object' argument"),
which = 1:4, info = FALSE, versus = 1,
standardize = FALSE, delever = FALSE, level = 0,
id.n = 3, labels.id = NULL, smooth.col = 2,
grid.col = 0, jitter = 0,
do.par = NULL, caption = NULL, trace = 0,
npoints = 3000, center = TRUE,
type = NULL, nresponse = NA,
object.name = quote.deparse(substitute(object)), ...)
If the which=1
plot was plotted, the return value of that
plot (model dependent).
Else if the which=3
plot was plotted, return list(x,y)
where x
and y
are the coordinates of the points in that plot
(but without jittering even if the jitter
argument was used).
Else return NULL
.
The model object.
Which plots do draw. Default is 1:4
.
1
Model plot. What gets plotted here depends on the model class.
For example, for earth
models this is a model selection plot.
Nothing will be displayed for some models.
For details, please see the
plotres vignette.
2
Cumulative distribution of abs residuals
3
Residuals vs fitted
4
QQ plot
5
Abs residuals vs fitted
6
Sqrt abs residuals vs fitted
7
Abs residuals vs log fitted
8
Cube root of the squared residuals vs log fitted
9
Log abs residuals vs log fitted
Default is FALSE
.
Use TRUE
to print extra information as follows:
i) Display the distribution of the residuals along the bottom of the plot.
ii) Display the training R-Squared.
iii) Display the Spearman Rank Correlation of the absolute residuals
with the fitted values.
Actually, correlation is measured against the absolute values
of whatever is on the horizontal
axis --- by default this is the fitted response, but may be something
else if the versus
argument is used.
iv) In the Cumulative Distribution plot (which=2
),
display additional information on the quantiles.
v) Only for which=5
or 9
.
Regress the absolute residuals against the fitted values
and display the regression slope.
Robust linear regression is used via rlm
in the MASS package.
vi) Add various annotations to the other plots.
What do we plot the residuals against? One of:
1
Default. Plot the residuals versus the fitted values
(or the log values when which=7
to 9
).
2
Residuals versus observation number,
after observations have been sorted on the fitted value.
Same as versus=1
, except that the residuals are spaced
uniformly along the horizontal axis.
3
Residuals versus the response.
4
Residuals versus the hat leverages.
"b:"
Residuals versus the basis functions.
Currently only supported for earth
, mda::mars
, and gam::gam
models.
A optional regex
can follow the "b:"
to specify a subset of the
terms, e.g. versus="b:wind"
will plot terms with "wind"
in
their name.
Else a character vector specifying which predictors to plot against.
Example 1: versus=""
plots against all predictors (since the
regex versus=""
matches anything).
Example 2: versus=c("wind", "vis")
plots predictors
with wind
or vis
in their name.
Example 3: versus=c("wind|vis")
equivalent to the above.
Note: These are regex
s.
Thus versus="wind"
will match all variables that have "wind"
in their names. Use "^wind$"
to match only the variable named
"wind"
.
Default is FALSE
.
Use TRUE
to standardize the residuals.
Only supported for some models, an error message will be issued otherwise.
Each residual is divided by by se_i * sqrt(1 - h_ii)
,
where se_i
is the standard error of prediction
and h_ii
is the leverage (the diagonal entry of the hat matrix).
When the variance model holds, the standardized residuals are
homoscedastic with unity variance.
The leverages are obtained using hatvalues
.
(For earth
models the leverages are
for the linear regression of the response on the basis matrix bx
.)
A standardized residual with a leverage of 1 is plotted as a star on the axis.
This argument applies to all plots where the residuals are used
(including the cumulative distribution and QQ plots, and to
annotations displayed by the info
argument).
Default is FALSE
.
Use TRUE
to “de-lever” the residuals.
Only supported for some models, an error message will be issued otherwise.
Each residual is divided by sqrt(1 - h_ii)
.
See the standardize
argument for details.
Draw estimated confidence or prediction interval bands at the given
level
, if the model supports them.
Default is 0
, bands not plotted.
Else a fraction, for example level=0.90
.
Example:
mod <- lm(log(Volume)~log(Girth), data=trees)
plotres(mod, level=.90)
You can modify the color of the bands with level.shade
and level.shade2
.
See also “Prediction intervals” in the
plotmo vignette
(but note that plotmo
needs prediction intervals on new
data, whereas plotres
requires only that the model supports
prediction intervals on the training data).
The largest id.n
residuals will be labeled in the plot.
Default is 3
.
Special values TRUE
and -1
or mean all.
If id.n
is negative (but not -1
)
the id.n
most positive and most negative
residuals will be labeled in the plot.
A current implementation restriction is that id.n
is ignored
when there are more than ten thousand cases.
Residual labels.
Only used if id.n > 0
.
Default is the case names, or the case numbers if the cases are unnamed.
Color of the smooth line through the residual points.
Default is 2
, red. Use smooth.col=0
for no smooth line.
You can adjust the amount of smoothing with smooth.f
.
This gets passed as f
to lowess
.
The default is 2/3
.
Lower values make the line more wiggly.
Default is 0
, no grid.
Else add a background grid
of the specified color to the degree1 plots.
The special value grid.col=TRUE
is treated as "lightgray"
.
Default is 0
, no jitter.
Passed as factor
to jitter
to jitter the plotted points horizontally and vertically.
Useful for discrete variables and responses, where the residual points
tend to be overlaid.
One of NULL
, FALSE
, TRUE
, or 2
, as follows:
do.par=NULL
(default). Same as do.par=FALSE
if the
number of plots is one; else the same as TRUE
.
do.par=FALSE
. Use the current par
settings.
You can pass additional graphics parameters in the ``...
'' argument.
do.par=TRUE
. Start a new page and call par
as
appropriate to display multiple plots on the same page.
This automatically sets parameters like mfrow
and mar
.
You can pass additional graphics parameters in the ``...
'' argument.
do.par=2
. Like do.par=TRUE
but don't restore the
par
settings to their original state when
plotres
exits, so you can add something to the plot.
Overall caption. By default create the caption automatically.
Use caption=""
for no caption.
(Use main
to set the title of an individual plot.)
Default is 0
.
trace=1
(or TRUE
) for a summary trace (shows how
predict
and friends
are invoked for the model).
trace=2
for detailed tracing.
Number of points to be plotted.
A sample of npoints
is taken; the sample includes the biggest
twenty or so residuals.
The default is 3000 (not all, to avoid overplotting on large models).
Use npoints=TRUE
or -1
for all points.
Default is TRUE, meaning center the horizontal axis in the residuals plot, so asymmetry in the residual distribution is more obvious.
Type parameter passed first to residuals
and
if that fails to predict
.
For allowed values see the residuals
and predict
methods for
your object
(such as
residuals.rpart
or
predict.earth
).
By default, plotres
tries to automatically select a suitable
value for the model in question (usually "response"
),
but this will not always be correct.
Use trace=1
to see the type
argument passed to
residuals
and predict
.
Which column to use when residuals
or predict
returns
multiple columns.
This can be a column index or column name
(which may be abbreviated, partial matching is used).
The name of the object
for error and trace messages.
Used internally by plot.earth
.
Dot arguments are passed to the plot functions. Dot argument names, whether prefixed or not, should be specified in full and not abbreviated.
“Prefixed” arguments are passed directly to the associated function.
For example the prefixed argument pt.col="pink"
passes
col="pink"
to points()
, overriding the global
col
setting.
The prefixes recognized by plotres
are:
residuals. | passed to residuals |
predict. | passed to predict
(predict is called if the call to residuals fails) |
w1. | sent to the model-dependent plot for which=1 e.g. w1.col=2 |
pt. | modify the displayed points
e.g. pt.col=as.numeric(survived)+2 or pt.cex=.8 . |
smooth. | modify the smooth line e.g. smooth.col=0 or
smooth.f=.5 . |
level. | modify the interval bands, e.g. level.shade="gray" or level.shade2="lightblue" |
legend. | modify the displayed legend e.g. legend.cex=.9 |
cum. | modify the Cumulative Distribution plot
(arguments for plot.stepfun ) |
qq. | modify the QQ plot, e.g. qq.pch=1 |
qqline | modify the qqline in the QQ plot, e.g. qqline.col=0 |
label. | modify the point labels, e.g. label.cex=.9 or label.font=2 |
cook. | modify the Cook's Distance annotations.
This affects only the leverage plot
(versus=3 ) for lm models with standardize=TRUE .
e.g. cook.levels=c(.5, .8, 1) or cook.col=2 . |
caption. | modify the overall caption (see the caption argument)
e.g. caption.col=2 . |
par. | arguments for par
(only necessary if a par argument name clashes
with a plotres argument) |
The cex
argument is relative, so
specifying cex=1
is the same as not specifying cex
.
For backwards compatibility, some dot arguments are supported but not explicitly documented.
# we use lm in this example, but plotres is more useful for models
# that don't have a function like plot.lm for plotting residuals
lm.model <- lm(Volume~., data=trees)
plotres(lm.model)
Run the code above in your browser using DataLab