Case statistics for regression analysis.
case.lm calculates the statistics.
plot.case plots the cases, one statistic per panel, and illustrates and flags all observations for which the standard thresholds are exceeded. plot.case returns an object with class c("trellis.case", "trellis") containing the plot and the row.names of the flagged observations. The object is printed by a method that displays the set of graphs and prints the list of flagged cases.
panel.case is a panel function for plot.case.
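The flagging idea can be illustrated with base R alone. The following is a minimal sketch, not the code used by plot.case: the cutoffs are common textbook rules of thumb chosen here as assumptions (the thresholds plot.case applies may differ), and the built-in mtcars data and the names fit, n, p, and flags are stand-ins introduced only for this illustration.

```r
# Sketch only: textbook rule-of-thumb cutoffs, assumed for illustration;
# they are not taken from plot.case. mtcars is a stand-in dataset.
fit <- lm(mpg ~ wt + hp, data = mtcars)
n <- nrow(model.frame(fit))   # number of cases used in the fit
p <- length(coef(fit))        # number of coefficients, including intercept

flags <- list(
  stu.res = which(abs(rstudent(fit)) > 2),
  h       = which(hatvalues(fit) > 2 * p / n),
  cook    = which(cooks.distance(fit) > 4 / n),
  dffits  = which(abs(dffits(fit)) > 2 * sqrt(p / n))
)
# Row names of the cases flagged by each statistic
lapply(flags, names)
```

Each element of flags holds the positions (and row names) of the cases that exceed the assumed cutoff for that statistic, which is the same kind of information plot.case reports as flagged observations.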
case(fit, ...)
# S3 method for lm
case(fit, lms = summary.lm(fit), lmi = lm.influence(fit), ...)
# S3 method for case
plot(x, fit,
which=c("stu.res","si","h","cook","dffits",
dimnames(x)[[2]][-(1:8)]), ##DFBETAS
between.in=list(y=4, x=9),
cex.threshold=1.2,
main.in=list(
paste(deparse(fit$call), collapse=""),
cex=main.cex),
sigma.in=summary.lm(fit)$sigma,
p.in=summary.lm(fit)$df[1]-1,
main.cex=NULL,
...)
panel.case(x, y, subscripts, rownames, group.names,
thresh, case.large,
nn, pp, ss, cex.threshold,
...)
fit: "lm" object computed with x=TRUE.
lms: summary.lm(fit)
lmi: lm.influence(fit)
x: In plot.case, the matrix output from case.lm containing case diagnostics on each observation in the original dataset. In panel.case, the x variable to be plotted.
which: In plot.case, the names of the columns of x that are to be graphed.
between.in: between trellis/lattice argument.
cex.threshold: Multiplier for cex for the threshold values.
main.in: main title for xyplot. The default main title displays the linear model formula from fit.
sigma.in: Standard error for the fit.
p.in: The number of degrees of freedom associated with the fitted model.
main.cex: cex for main title.
...: Other arguments to xyplot.
y: The y variable to be plotted.
thresh: Named list of lists. Each list contains the components threshold (y-locations where a reference line will be drawn), thresh.label (the right-axis labels for the reference lines), and thresh.id (the bounds defining "Noteworthy Observations").
case.large: Named list of "Noteworthy Observations".
nn: Number of rows in original dataset.
pp: The number of degrees of freedom associated with the fitted model.
ss: Standard error for the fit.
subscripts: trellis/lattice argument, position in the reshaped dataset constructed by plot.case before calling xyplot.
rownames: Row name in the original data.frame.
group.names: Names of the individual statistics.
case.lm returns a matrix, with one row for each observation in the original dataset. The columns contain the diagnostic statistics:
e (residuals),
h* (hat diagonals),
si* (deleted standard deviation),
sta.res (standardized residuals),
stu.res* (Studentized deleted residuals),
dffit (difference in fits, change in predicted y when observation i is deleted),
dffits* (standardized difference in fits, standardized change in predicted y when observation i is deleted),
cook* (Cook's distance),
and DFBETAs* (standardized difference in regression coefficients when observation i is deleted, one for each column of the x-matrix, including the intercept).
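For orientation, the statistics listed above have direct counterparts in base R's stats package. The following is a minimal sketch that assembles a comparable matrix from those functions; it is not the implementation of case.lm, and the mtcars data and the names fit, e, h, and stats are assumptions introduced for illustration.

```r
# Sketch only: base-R counterparts of the case statistics, using the
# built-in mtcars data as a stand-in example (not from the original page).
fit <- lm(mpg ~ wt + hp, data = mtcars)

e <- residuals(fit)
h <- hatvalues(fit)
stats <- cbind(
  e       = e,                          # residuals
  h       = h,                          # hat diagonals
  si      = lm.influence(fit)$sigma,    # deleted standard deviation
  sta.res = rstandard(fit),             # standardized residuals
  stu.res = rstudent(fit),              # Studentized deleted residuals
  dffit   = h * e / (1 - h),            # change in fitted y_i when case i is deleted
  dffits  = dffits(fit),                # standardized version of dffit
  cook    = cooks.distance(fit)         # Cook's distance
)
stats <- cbind(stats, dfbetas(fit))     # one DFBETAS column per coefficient
head(round(stats, 3))
```

The first eight columns follow the order given above, and the DFBETAS columns come after them, which is consistent with the default which argument selecting dimnames(x)[[2]][-(1:8)] for the DFBETAS panels.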
plot.case returns a c("trellis.case", "trellis") object containing the plot (including the starred columns by default) and also retains the row.names of the flagged observations in the $panel.args.common$case.large component. The print method for the c("trellis.case", "trellis") object prints the graph and the list of flagged observations.
panel.case is a panel function for plot.case.
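The flagged observations can also be read from the returned object without printing it. The following is a minimal sketch, assuming the HH package (which supplies case, plot.case, and the kidney data) is installed; the variable names flagged and kidney2.plot are introduced here for illustration.

```r
# Sketch only: assumes the HH package is installed; the block does
# nothing beyond setting flagged to NULL otherwise.
flagged <- NULL
if (requireNamespace("HH", quietly = TRUE)) {
  library(HH)
  data(kidney)
  kidney2.lm <- lm(clearance ~ concent + age + weight + concent*age,
                   data = kidney, na.action = na.exclude)
  # Assigning the result keeps the trellis object without printing the graph
  kidney2.plot <- plot(case(kidney2.lm), kidney2.lm)
  # Flagged row names, in the component named in the text above
  flagged <- kidney2.plot$panel.args.common$case.large
  print(flagged)
}
```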
lm.influence is part of S-Plus and R.
case.lm and plot.case are based on Section 4.3.3, "Influence of Individual Observations", in Chambers and Hastie, Statistical Models in S.
Heiberger, Richard M. and Holland, Burt (2015). Statistical Analysis and Data Display: An Intermediate Course with Examples in R. Second Edition. Springer-Verlag, New York. https://link.springer.com/us/book/9781493921218
library(HH)  ## provides case, plot.case, and the kidney data
data(kidney)
kidney2.lm <- lm(clearance ~ concent + age + weight + concent*age,
                 data=kidney,
                 na.action=na.exclude)  ## recommended
kidney2.case <- case(kidney2.lm)
## this picture looks much better in portrait; specification is device dependent
plot(kidney2.case, kidney2.lm, par.strip.text=list(cex=.9),
     layout=c(2,3))