which.influence: Which Observations are Influential

Description

Creates a list with a component for each factor in the model. The names of the components are the factor names. Each component contains the observation identifiers of all observations that are "overly influential" with respect to that factor, meaning that \(|dfbetas| > u\) for at least one \(\beta_i\) associated with that factor, for a given cutoff. The default cutoff is .2. The fit must come from a function that has resid(fit, type="dfbetas") defined.

show.influence, written by Jens Oehlschlaegel-Akiyoshi, applies the result of which.influence to a data frame, usually the one used to fit the model, to report the results.

Usage

which.influence(fit, cutoff=.2)
show.influence(object, dframe, report=NULL, sig=NULL, id=NULL)

Value

show.influence returns a marked dataframe with the first column being a count of influence values

Arguments

fit: fit object
object: the result of which.influence
dframe: data frame containing observations pertinent to the model fit
cutoff: cutoff value
report: other columns of the data frame to report besides those corresponding to predictors that are influential for some observations
sig: runs results through signif with sig digits if sig is given
id: a character vector that labels rows of dframe if row.names were not used

Author

Frank Harrell
Department of Biostatistics, Vanderbilt University
fh@fharrell.com

Jens Oehlschlaegel-Akiyoshi
Center for Psychotherapy Research
Christian-Belser-Strasse 79a
D-70597 Stuttgart Germany
oehl@psyres-stuttgart.de

Examples

Run this code

#print observations in data frame that are influential,
#separately for each factor in the model
x1 <- 1:20
x2 <- abs(x1-10)
x3 <- factor(rep(0:2,length.out=20))
y  <- c(rep(0:1,8),1,1,1,1)
f  <- lrm(y ~ rcs(x1,3) + x2 + x3, x=TRUE,y=TRUE)
w <- which.influence(f, .55)
nam <- names(w)
d   <- data.frame(x1,x2,x3,y)
for(i in 1:length(nam)) {
 print(paste("Influential observations for effect of ",nam[i]),quote=FALSE)
 print(d[w[[i]],])
}

show.influence(w, d)  # better way to show results