summary.rms: Summary of Effects in Model

Description

summary.rms forms a summary of the effects of each factor. When summary is used to estimate odds or hazard ratios for continuous variables, it allows the levels of interacting factors to be easily set, as well as allowing the user to choose the interval for the effect. This method of estimating effects allows for nonlinearity in the predictor. Factors requiring multiple parameters are handled, as summary obtains predicted values at the needed points and takes differences. By default, inter-quartile range effects (odds ratios, hazard ratios, etc.) are printed for continuous factors, and all comparisons with the reference level are made for categorical factors.

print.summary.rms prints the results, latex.summary.rms and html.summary.rms typeset the results, and plot.summary.rms plots shaded confidence bars to display the results graphically. The longest confidence bar on each page is labeled with confidence levels (unless this bar has been ignored due to clip). By default, the following confidence levels are all shown: .9, .95, and .99, using blue of different transparencies. The plot method currently ignores bootstrap and Bayesian highest posterior density intervals and instead approximates intervals based on standard errors. The html method is for use with R Markdown documents rendered to html.

The print method will call the latex or html method if options(prType=) is set to "latex" or "html". For "latex" printing through print(), the LaTeX table environment is turned off.

If usebootcoef=TRUE and the fit was run through bootcov, the confidence intervals are bootstrap nonparametric percentile confidence intervals, basic bootstrap, or BCa intervals, obtained on contrasts evaluated on all bootstrap samples.
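The bootstrap path can be sketched as follows. This is a minimal illustration on simulated data, not part of the original examples: the variable names, the sample size, and B = 200 are assumptions chosen for brevity. bootcov needs the design matrix and response stored in the fit, hence x=TRUE, y=TRUE.

```r
library(rms)   # also attaches Hmisc

set.seed(1)
n   <- 300
age <- rnorm(n, 50, 10)
sex <- factor(sample(c('female', 'male'), n, TRUE))
y   <- rbinom(n, 1, plogis(0.04 * (age - 50) + 0.5 * (sex == 'male')))

dd <- datadist(age, sex)
options(datadist = 'dd')

fit  <- lrm(y ~ age + sex, x = TRUE, y = TRUE)
bfit <- bootcov(fit, B = 200, coef.reps = TRUE)   # keep bootstrap coefficients

s.pct   <- summary(bfit)                          # percentile limits (default)
s.basic <- summary(bfit, boot.type = 'basic')     # basic bootstrap limits
s.wald  <- summary(bfit, usebootcoef = FALSE)     # bootstrap covariance matrix only

options(datadist = NULL)
```

Comparing the confidence limit columns of s.pct and s.wald shows how far the nonparametric limits depart from those based on the bootstrap covariance matrix.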

If options(grType='plotly') is in effect and the plotly package is installed, plotly is used instead of base graphics to draw the point estimates and confidence limits when the plot method for summary is called. Colors and other graphical arguments to plot.summary are ignored in this case. Various special effects are implemented, such as drawing only 0.95 confidence limits by default but including a legend that allows the other confidence limits to be activated. Hovering over point estimates shows adjustment values if there are any. nbar is not implemented for plotly.

Usage

# S3 method for rms
summary(object, ..., ycut=NULL, est.all=TRUE, antilog,
conf.int=.95, abbrev=FALSE, vnames=c("names","labels"),
conf.type=c('individual','simultaneous'),
usebootcoef=TRUE, boot.type=c("percentile","bca","basic"),
posterior.summary=c('mean', 'median', 'mode'), verbose=FALSE)
# S3 method for summary.rms
print(x, ..., table.env=FALSE)
# S3 method for summary.rms
latex(object, title, table.env=TRUE, ...)
# S3 method for summary.rms
html(object, digits=4, dec=NULL, ...)
# S3 method for summary.rms
plot(x, at, log=FALSE,
    q=c(0.9, 0.95, 0.99), xlim, nbar, cex=1, nint=10,
    cex.main=1, clip=c(-1e30,1e30), main,
    col=rgb(red=.1,green=.1,blue=.8,alpha=c(.1,.4,.7)),
    col.points=rgb(red=.1,green=.1,blue=.8,alpha=1), pch=17,
    lwd=if(length(q) == 1) 3 else 2 : (length(q) + 1), digits=4,
    declim=4, ...)

Value

For summary.rms, a matrix of class summary.rms with rows corresponding to factors in the model and columns containing the low and high values for the effects, the range for the effects, the effect point estimates (difference in predicted values for high and low factor values), the standard error of this effect estimate, and the lower and upper confidence limits. If fit$scale.pred has a second level, two rows appear for each factor, the second corresponding to anti-logged effects. Non-categorical factors are stored first, and effects for any categorical factors are stored at the end of the returned matrix. The matrix also carries the attributes scale.pred and adjust. adjust is a character string containing levels of adjustment variables, if there are any interactions. Otherwise it is "".

latex.summary.rms returns an object of class c("latex","file"). It requires the latex function in Hmisc.
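The returned object can be inspected like any matrix. The sketch below uses a small simulated model (names and sizes are illustrative assumptions) with an interaction so that adjustment values exist. It assumes that scale.pred and adjust are stored as attributes of the matrix, which is how the description above reads.

```r
library(rms)   # also attaches Hmisc

set.seed(1)
n   <- 300
age <- rnorm(n, 50, 10)
sex <- factor(sample(c('female', 'male'), n, TRUE))
y   <- rbinom(n, 1, plogis(0.04 * (age - 50) + 0.5 * (sex == 'male')))

dd <- datadist(age, sex)
options(datadist = 'dd')

fit <- lrm(y ~ sex * age)
s   <- summary(fit)

inherits(s, 'summary.rms')   # class check
dim(s)                       # rows: effects (plus anti-logged rows); columns: statistics
colnames(s)                  # names of the statistic columns
rownames(s)                  # one label per effect row

# Assumption: these are attributes of the returned matrix
attr(s, 'scale.pred')
attr(s, 'adjust')

options(datadist = NULL)
```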

Arguments

object

an rms fit object. Either options(datadist) should have been set before the fit, or datadist() and options(datadist) run before summary. For latex and html, object is the result of summary.

...

For summary, omit the list of variables to estimate effects for all predictors. Use a list of variables of the form age, sex to estimate using default ranges. Specify age=50, for example, to adjust age to 50 when testing other factors (this will only matter for factors that interact with age). Specify e.g. age=c(40,60) to estimate the effect of increasing age from 40 to 60. Specify age=c(40,50,60) to let age range from 40 to 60 and be adjusted to 50 when testing other interacting factors. For categorical factors, a single value specifies the reference cell and the adjustment value. For example, if treat has levels "a", "b" and "c" and treat="b" is given to summary, treatment a will be compared to b and c will be compared to b. Treatment b will be used when estimating the effect of other factors. Categorical variables can have category labels listed (in quotes), or an unquoted number that is a legal level, if all levels are numeric. You need only use the first few letters of each variable name, enough for unique identification. For variables not defined with datadist, you must specify 3 values, none of which are NA.

For latex, ... also represents other arguments to pass to latex. It is ignored for print and plot.
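The reference-cell behaviour described for a three-level factor can be sketched as below. The data are simulated and the coefficients are arbitrary assumptions; only the treat levels "a", "b", "c" come from the text above.

```r
library(rms)   # also attaches Hmisc

set.seed(2)
n     <- 300
age   <- rnorm(n, 50, 10)
treat <- factor(sample(c('a', 'b', 'c'), n, TRUE))
y     <- rbinom(n, 1, plogis(0.03 * (age - 50) +
                             0.4 * (treat == 'b') + 0.8 * (treat == 'c')))

dd <- datadist(age, treat)
options(datadist = 'dd')

fit <- lrm(y ~ treat * age)

# Compare a and c with b; treat is held at b for the age effect
s1 <- summary(fit, treat = 'b')

# Same reference cell; age effect from 40 to 60, age adjusted to 50
# when estimating the treat comparisons
s2 <- summary(fit, treat = 'b', age = c(40, 50, 60))

rownames(s1)   # labels show which comparisons were made

options(datadist = NULL)
```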

ycut

Must be specified if the fit is a partial proportional odds model. Specifies the single value of the response variable used to estimate ycut-specific regression effects, e.g., odds ratios.

est.all

Set to FALSE to only estimate effects of variables listed. Default is TRUE.

antilog

Set to FALSE to suppress printing of anti-logged effects. Default is TRUE if the model was fitted by lrm or cph. Antilogged effects will be odds ratios for logistic models and hazard ratios for proportional hazards models.
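A brief sketch of est.all and antilog on a simulated logistic model (data and names are illustrative assumptions). Restricting the variables or dropping the anti-logged rows should shrink the returned matrix.

```r
library(rms)   # also attaches Hmisc

set.seed(1)
n   <- 300
age <- rnorm(n, 50, 10)
sex <- factor(sample(c('female', 'male'), n, TRUE))
y   <- rbinom(n, 1, plogis(0.04 * (age - 50) + 0.5 * (sex == 'male')))

dd <- datadist(age, sex)
options(datadist = 'dd')

fit <- lrm(y ~ age + sex)

s.all <- summary(fit)                         # all predictors, log odds and odds ratios
s.age <- summary(fit, age, est.all = FALSE)   # only the age effect
s.log <- summary(fit, antilog = FALSE)        # log odds scale only, no odds ratio rows

c(all = nrow(s.all), age.only = nrow(s.age), no.antilog = nrow(s.log))

options(datadist = NULL)
```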

conf.int

Defaults to .95 for 95% confidence intervals of effects.

abbrev

Set to TRUE to use the abbreviate function to shorten factor levels for categorical variables in the model.

vnames

Set to "labels" to use variable labels to label effects. Default is "names" to use variable names.

conf.type

The default type of confidence interval computed for a given individual (1 d.f.) contrast is a pointwise confidence interval. Set conf.type="simultaneous" to use the multcomp package's glht and confint functions to compute confidence intervals with simultaneous (family-wise) coverage, thus adjusting for multiple comparisons. Contrasts are simultaneous only over groups of intervals computed together.

usebootcoef

If fit was the result of bootcov but you want to use the bootstrap covariance matrix instead of the nonparametric percentile, basic, or BCa methods for confidence intervals (which uses all the bootstrap coefficients), specify usebootcoef=FALSE.

boot.type

set to 'bca' to compute BCa confidence limits or to 'basic' to use the basic bootstrap. The default is to compute percentile intervals.

posterior.summary

set to 'mode' or 'median' to use the posterior mode or median instead of the mean for point estimates of contrasts

verbose

set to TRUE when conf.type='simultaneous' to get output describing scope of simultaneous adjustments
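Simultaneous limits might be requested as in the sketch below. It assumes the multcomp package is installed (the call is skipped otherwise) and uses simulated data with a three-level factor so that more than one contrast is computed together. All names are illustrative.

```r
library(rms)   # also attaches Hmisc

set.seed(3)
n     <- 300
age   <- rnorm(n, 50, 10)
treat <- factor(sample(c('a', 'b', 'c'), n, TRUE))
y     <- rbinom(n, 1, plogis(0.03 * (age - 50) + 0.5 * (treat == 'c')))

dd <- datadist(age, treat)
options(datadist = 'dd')

fit <- lrm(y ~ age + treat)

s.ind <- summary(fit)   # individual (pointwise) confidence limits

if (requireNamespace('multcomp', quietly = TRUE)) {
  # verbose=TRUE reports the scope of each simultaneous adjustment
  s.sim <- summary(fit, conf.type = 'simultaneous', verbose = TRUE)
  print(s.sim)
}

options(datadist = NULL)
```

Because the adjustment applies only within groups of intervals computed together, the limits for a single 1 d.f. effect such as age may differ little between the two summaries.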

x

result of summary

title

title to pass to latex. Default is name of fit object passed to summary prefixed with "summary".

table.env

see latex

digits,dec

for html.summary.rms; digits is the number of significant digits for printing effects, standard errors, and confidence limits. It is ignored if dec is given. The statistics are rounded to dec digits to the right of the decimal point if dec is given. digits is also the number of significant digits used to format numeric hover text and labels for plotly.

declim

number of digits to the right of the decimal point to which to round confidence limits for labeling axes

at

vector of coordinates at which to put tick mark labels on the main axis. If log=TRUE, at should be in anti-log units.

log

Set to TRUE to plot on $X\beta$ scale but labeled with anti-logs.

q

scalar or vector of confidence coefficients to depict

xlim

X-axis limits for plot in units of the linear predictors (log scale if log=TRUE). If at is specified and xlim is omitted, xlim is derived from the range of at.

nbar

Sets up plot to leave room for nbar horizontal bars. Default is the number of non-interaction factors in the model. Set nbar to a larger value to keep too much surrounding space from appearing around horizontal bars. If nbar is smaller than the number of bars, the plot is divided into multiple pages with up to nbar bars on each page.

cex

cex parameter for factor labels.

nint

Number of tick mark numbers for pretty.

cex.main

cex parameter for main title. Set to 0 to suppress the title.

clip

confidence limits outside the interval c(clip[1], clip[2]) will be ignored, and clip is also respected when computing xlim when xlim is not specified. clip should be in the units of fun(x). If log=TRUE, clip should be in $X\beta$ units.

main

main title. Default is inferred from the model and value of log, e.g., "log Odds Ratio".

col

vector of colors, one per value of q

col.points

color for point estimates

pch

symbol for point estimates. Default is solid triangle.

lwd

line width for confidence intervals, corresponding to q
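A minimal sketch of the base graphics plot arguments, assuming a simulated logistic model (names and values are illustrative). Two confidence levels are drawn, so col supplies two colors and the default lwd yields two line widths.

```r
library(rms)   # also attaches Hmisc

set.seed(1)
n   <- 300
age <- rnorm(n, 50, 10)
sex <- factor(sample(c('female', 'male'), n, TRUE))
y   <- rbinom(n, 1, plogis(0.04 * (age - 50) + 0.5 * (sex == 'male')))

dd <- datadist(age, sex)
options(datadist = 'dd')

fit <- lrm(y ~ age + sex)
s   <- summary(fit)

# Plot on the log odds scale, labeled with odds ratios;
# one color per confidence level in q
plot(s, log = TRUE, q = c(0.9, 0.95),
     col = rgb(red = .1, green = .1, blue = .8, alpha = c(.2, .6)),
     pch = 16, cex = 0.9)

options(datadist = NULL)
```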

Author

Frank Harrell
Hui Nian
Department of Biostatistics, Vanderbilt University
fh@fharrell.com

Examples


library(rms) # also attaches Hmisc (label, units)
n <- 1000    # define sample size
set.seed(17) # so can reproduce the results
age            <- rnorm(n, 50, 10)
blood.pressure <- rnorm(n, 120, 15)
cholesterol    <- rnorm(n, 200, 25)
sex            <- factor(sample(c('female','male'), n,TRUE))
label(age)            <- 'Age'      # label is in Hmisc
label(cholesterol)    <- 'Total Cholesterol'
label(blood.pressure) <- 'Systolic Blood Pressure'
label(sex)            <- 'Sex'
units(cholesterol)    <- 'mg/dl'   # uses units.default in Hmisc
units(blood.pressure) <- 'mmHg'


# Specify population model for log odds that Y=1
L <- .4*(sex=='male') + .045*(age-50) +
  (log(cholesterol - 10)-5.2)*(-2*(sex=='female') + 2*(sex=='male'))
# Simulate binary y to have Prob(y=1) = 1/[1+exp(-L)]
y <- ifelse(runif(n) < plogis(L), 1, 0)


ddist <- datadist(age, blood.pressure, cholesterol, sex)
options(datadist='ddist')


fit <- lrm(y ~ blood.pressure + sex * (age + rcs(cholesterol,4)))


s <- summary(fit)                # Estimate effects using default ranges
                                 # Gets odds ratio for age=3rd quartile
                                 # compared to 1st quartile
if (FALSE) {
latex(s)                         # Use LaTeX to print nice version
latex(s, file="")                # Just write LaTeX code to console
html(s)                          # html/LaTeX to console for knitr
# Or:
options(prType='latex')
summary(fit)                     # prints with LaTeX, table.env=FALSE
options(prType='html')
summary(fit)                     # prints with html
}
summary(fit, sex='male', age=60) # Specify ref. cell and adjustment val
summary(fit, age=c(50,70))       # Estimate effect of increasing age from
                                 # 50 to 70
s <- summary(fit, age=c(50,60,70)) 
                                 # Increase age from 50 to 70, adjust to
                                 # 60 when estimating effects of other factors
#Could have omitted datadist if specified 3 values for all non-categorical
#variables (1 value for categorical ones - adjustment level)
plot(s, log=TRUE, at=c(.1,.5,1,1.5,2,4,8))


options(datadist=NULL)