cph), parametric survival models (psm), binary and ordinal logistic
models (lrm) and ordinary least squares (ols). For survival models,
"predicted" means predicted survival probability at a single time point,
and "observed" refers to the corresponding Kaplan-Meier survival
estimate, stratifying on intervals of predicted survival, or, if the
polspline package is installed, the predicted survival probability as a
function of transformed predicted survival probability using the
flexible hazard regression approach (see the val.surv function for
details). For logistic and linear models, a nonparametric calibration
curve is estimated over a sequence of predicted values. The fit must
have specified x=TRUE, y=TRUE. The print and plot methods for lrm and
ols models (which use calibrate.default) print the mean absolute error
in predictions, the mean squared error, and the 0.9 quantile of the
absolute error. Here, error refers to the difference between the
predicted values and the corresponding bias-corrected calibrated values.

Below, the second, third, and fourth invocations of calibrate are,
respectively, for ols and lrm, cph, and psm. The first and second plot
invocations are, respectively, for survival model (cph, psm) fits and
for lrm and ols fits.

calibrate(fit, ...)
## S3 method for class 'default':
calibrate(fit, predy,
method=c("boot","crossvalidation",".632","randomization"),
B=40, bw=FALSE, rule=c("aic","p"),
type=c("residual","individual"),
sls=.05, aics=0, force=NULL, estimates=TRUE, pr=FALSE, kint,
smoother="lowess", digits=NULL, ...)
## S3 method for class 'cph':
calibrate(fit, cmethod=c('hare', 'KM'),
method="boot", u, m=150, pred, cuts, B=40,
bw=FALSE, rule="aic", type="residual", sls=0.05, aics=0, force=NULL,
estimates=TRUE,
pr=FALSE, what="observed-predicted", tol=1e-12, maxdim=5, ...)
## S3 method for class 'psm':
calibrate(fit, cmethod=c('hare', 'KM'),
method="boot", u, m=150, pred, cuts, B=40,
bw=FALSE,rule="aic",
type="residual", sls=.05, aics=0, force=NULL, estimates=TRUE,
pr=FALSE, what="observed-predicted", tol=1e-12, maxiter=15,
rel.tolerance=1e-5, maxdim=5, ...)
## S3 method for class 'calibrate':
print(x, B=Inf, ...)
## S3 method for class 'calibrate.default':
print(x, B=Inf, ...)
## S3 method for class 'calibrate':
plot(x, xlab, ylab, subtitles=TRUE, conf.int=TRUE,
cex.subtitles=.75, riskdist=TRUE, add=FALSE,
scat1d.opts=list(nhistSpike=200), ...)
## S3 method for class 'calibrate.default':
plot(x, xlab, ylab, xlim, ylim,
legend=TRUE, subtitles=TRUE, scat1d.opts=NULL, ...)
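
The error summaries described for lrm and ols fits can be illustrated with a minimal base-R sketch. It is not the rms implementation: it uses glm instead of lrm, computes only the apparent (not bias-corrected) calibration curve, and all object names (fit, p, cal, calibrated, stats) are hypothetical. It assumes the same smoother as the documented default, lowess(x, y, iter=0).

```r
set.seed(2)
n <- 500
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + x))

# Stand-in for an lrm fit: ordinary binary logistic regression
fit <- glm(y ~ x, family = binomial)
p   <- fitted(fit)                      # predicted probabilities

# Nonparametric calibration curve: smooth the outcome against predictions
cal <- lowess(p, y, iter = 0)

# Calibrated value for each observation, read off the smooth
calibrated <- approx(cal$x, cal$y, xout = p, ties = mean)$y

# Error = predicted minus calibrated (apparent only; no resampling here)
err   <- abs(p - calibrated)
stats <- c(mae = mean(err),
           mse = mean(err^2),
           q90 = unname(quantile(err, 0.9)))
print(round(stats, 4))
```

With calibrate, the calibrated values are additionally corrected for overfitting by resampling, so the numbers from this sketch would generally be somewhat optimistic.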
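fit: a fit from ols, lrm, cph or psm
x: an object created by calibrate
method, B, bw, rule, type, sls, aics, force, estimates: see validate. For print.calibrate, B is an upper limit on the number of resamples for which information is printed about which variables were selected in each model re-fit.
cmethod: method for validating survival predictions. The default, cmethod='hare', uses the hare function in the polspline package. Specify cmethod='KM' to use less precise stratified Kaplan-Meier estimates.
u: the time point at which to validate survival predictions. For cph fits, you must have specified surv=TRUE, time.inc=u, where u is the constant specifying the time to predict.
m: group predicted u-time units survival into intervals containing m subjects on the average (for survival models only)
pred: vector of predicted survival probabilities at which to evaluate the calibration curve. By default, the low and high prediction values from datadist are used, which for large sample size is the 10th smallest to the 10th largest predicted probability.
cuts: actual cut points for predicted survival probabilities. You may specify only one of m and cuts (for survival models only)
pr: set to TRUE to print intermediate results for each re-sample
what: the default is "observed-predicted", meaning to estimate optimism in this difference. This is preferred as it accounts for skewed distributions of predicted probabilities in outer intervals. You can also specify "observed". This argument applies to survival models only.
tol: criterion for matrix singularity (default is 1e-12)
maxdim: see hare
maxiter: for psm, this is passed to survreg.control (default is 15 iterations)
rel.tolerance: parameter passed to survreg.control for psm (default is 1e-5).
predy: a scalar or vector of predicted values to calibrate (for lrm, ols). Default is 50 equally spaced points between the 5th smallest and the 5th largest predicted values. For lrm the predicted values are probabilities (see kint).
kint: for an ordinal logistic model, specify kint to specify the intercept to use, e.g., kint=2 means to calibrate $Prob(Y\geq b)$, where $b$ is the second level of $Y$
smoother: a function of two variables that produces x- and y-coordinates by smoothing the input y. The default is to use lowess(x, y, iter=0).
digits: if specified, predicted values are rounded to digits digits before passing to the smoother. Occasionally, large predicted values on the logit scale will lead to predicted probabilities very near 1 that should be treated as 1, and rounding takes care of that.
...: other arguments to pass to predab.resample, such as group, cluster, and subset. Also, other arguments for plot.
subtitles: set to FALSE to suppress subtitles in plot describing method and for lrm and ols the mean absolute error and original sample size
conf.int: set to FALSE to suppress plotting 0.95 confidence intervals for Kaplan-Meier estimates
riskdist: set to FALSE to suppress the distribution of predicted risks (survival probabilities) from being plotted
add: set to TRUE to add the calibration plot to an existing plot
scat1d.opts: a list specifying options to send to scat1d if riskdist=TRUE. See scat1d.
legend: set to FALSE to suppress legends (for lrm, ols only) on the calibration plot, or specify a list with elements x and y containing the coordinates of the upper left corner of the legend. By default (legend=TRUE) a legend is drawn.

The returned object has class "calibrate" or "calibrate.default". plot.calibrate.default invisibly returns the vector of estimated prediction errors corresponding to the dataset used to fit the model.

Side effects: stores an object pred.obs or .orig.cal

If the fit was created using penalized maximum likelihood estimation, the same penalty and penalty.scale parameters are used during validation.

See also: validate, predab.resample, groupkm, errbar, scat1d, cph, psm, lowess
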
library(rms)   # provides cph, psm, lrm, ols and calibrate
set.seed(1)
d.time <- rexp(200)
x1 <- runif(200)
x2 <- factor(sample(c('a','b','c'),200,TRUE))
f <- cph(Surv(d.time) ~ pol(x1,2)*x2, x=TRUE, y=TRUE, surv=TRUE, time.inc=2)
# or f <- psm(S ~ ...)
pa <- 'polspline' %in% row.names(installed.packages())
if(pa) {
cal <- calibrate(f, u=2, B=20) # cmethod='hare'
plot(cal)
}
cal <- calibrate(f, u=2, cmethod='KM', m=50, B=20) # usually B=200 or 300
plot(cal, add=pa)
y <- sample(0:2, 200, TRUE)
x1 <- runif(200)
x2 <- runif(200)
x3 <- runif(200)
x4 <- runif(200)
f <- lrm(y ~ x1+x2+x3*x4, x=TRUE, y=TRUE)
cal <- calibrate(f, kint=2, predy=seq(.2,.8,length=60),
group=y)
# group= does k-sample validation: make resamples have same
# numbers of subjects in each level of y as original sample
plot(cal)
#See the example for the validate function for a method of validating
#continuation ratio ordinal logistic models. You can do the same
#thing for calibrate
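
The idea behind method="boot" (estimating how optimistic the apparent calibration curve is, then subtracting that optimism) can be sketched in base R. This is an illustration of the general optimism bootstrap under stated assumptions, not the code that calibrate or predab.resample actually runs: glm stands in for lrm, and the helper cal_at, the grid vector (a rough analogue of predy) and all other names are hypothetical.

```r
set.seed(3)
n <- 300
d <- data.frame(x = rnorm(n))
d$y <- rbinom(n, 1, plogis(d$x))
grid <- seq(0.2, 0.8, length.out = 25)   # points at which to calibrate

# Smoothed observed proportion evaluated at each grid point
cal_at <- function(p, y, grid) {
  s <- lowess(p, y, iter = 0)
  approx(s$x, s$y, xout = grid, rule = 2, ties = mean)$y
}

# Apparent calibration curve from the original fit
fit0     <- glm(y ~ x, family = binomial, data = d)
apparent <- cal_at(fitted(fit0), d$y, grid)

B <- 40
optimism <- matrix(NA_real_, B, length(grid))
for (b in seq_len(B)) {
  db <- d[sample(n, replace = TRUE), ]                 # bootstrap resample
  fb <- glm(y ~ x, family = binomial, data = db)       # refit on resample
  train <- cal_at(fitted(fb), db$y, grid)              # curve on resample
  test  <- cal_at(predict(fb, newdata = d, type = "response"),
                  d$y, grid)                           # curve on original data
  optimism[b, ] <- train - test
}

# Overfitting-corrected calibration curve on the grid
corrected <- apparent - colMeans(optimism)
```

Plotting grid against apparent and corrected, together with the identity line, gives a picture comparable in spirit to plot(cal) for an lrm fit. As with calibrate, B=40 is small, and a few hundred resamples would be more usual.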