Learn R Programming

riskRegression (version 2023.12.21)

predictCox: Fast computation of survival probabilities, hazards and cumulative hazards from Cox regression models

Description

Fast routine to get baseline hazards and subject specific hazards as well as survival probabilities from a survival::coxph or rms::cph object

Usage

predictCox(
  object,
  times,
  newdata = NULL,
  centered = TRUE,
  type = c("cumhazard", "survival"),
  keep.strata = TRUE,
  keep.times = TRUE,
  keep.newdata = FALSE,
  keep.infoVar = FALSE,
  se = FALSE,
  band = FALSE,
  iid = FALSE,
  confint = (se + band) > 0,
  diag = FALSE,
  average.iid = FALSE,
  store.iid = "full"
)

Arguments

object

The fitted Cox regression model object either obtained with coxph (survival package) or cph (rms package).

times

[numeric vector] Time points at which to return the estimated hazard/cumulative hazard/survival.

newdata

[data.frame or data.table] Contain the values of the predictor variables defining subject specific predictions. Should have the same structure as the data set used to fit the object.

centered

[logical] If TRUE return prediction at the mean values of the covariates fit$mean, if FALSE return a prediction for all covariates equal to zero. in the linear predictor. Will be ignored if argument newdata is used. For internal use.

type

[character vector] the type of predicted value. Choices are

  • "hazard" the baseline hazard function when argument newdata is not used and the hazard function when argument newdata is used.

  • "cumhazard" the cumulative baseline hazard function when argument newdata is not used and the cumulative hazard function when argument newdata is used.

  • "survival" the survival baseline hazard function when argument newdata is not used and the cumulative hazard function when argument newdata is used.

Several choices can be combined in a vector of strings that match (no matter the case) strings "hazard","cumhazard", "survival".

keep.strata

[logical] If TRUE add the (newdata) strata to the output. Only if there any.

keep.times

[logical] If TRUE add the evaluation times to the output.

keep.newdata

[logical] If TRUE add the value of the covariates used to make the prediction in the output list.

keep.infoVar

[logical] For internal use.

se

[logical] If TRUE compute and add the standard errors to the output.

band

[logical] If TRUE compute and add the quantiles for the confidence bands to the output.

iid

[logical] If TRUE compute and add the influence function to the output.

confint

[logical] If TRUE compute and add the confidence intervals/bands to the output. They are computed applying the confint function to the output.

diag

[logical] when FALSE the hazard/cumlative hazard/survival for all observations at all times is computed, otherwise it is only computed for the i-th observation at the i-th time.

average.iid

[logical] If TRUE add the average of the influence function over newdata to the output.

store.iid

[character] Implementation used to estimate the influence function and the standard error. Can be "full" or "minimal".

Author

Brice Ozenne broz@sund.ku.dk, Thomas A. Gerds tag@biostat.ku.dk

Details

When the argument newdata is not specified, the function computes the baseline hazard estimate. See (Ozenne et al., 2017) section "Handling of tied event times".

Otherwise the function computes survival probabilities with confidence intervals/bands. See (Ozenne et al., 2017) section "Confidence intervals and confidence bands for survival probabilities". The survival is computed using the exponential approximation (equation 3).

A detailed explanation about the meaning of the argument store.iid can be found in (Ozenne et al., 2017) Appendix B "Saving the influence functions".

The function is not compatible with time varying predictor variables.

The centered argument enables us to reproduce the results obtained with the basehaz function from the survival package but should not be modified by the user.

The iid decomposition is output using an array containing the value of the influence of each subject used to fit the object (dim 1), for each subject in newdata (dim 3), and each time (dim 2).

References

Brice Ozenne, Anne Lyngholm Sorensen, Thomas Scheike, Christian Torp-Pedersen and Thomas Alexander Gerds. riskRegression: Predicting the Risk of an Event using Cox Regression Models. The R Journal (2017) 9:2, pages 440-460.

See Also

confint.predictCox to compute confidence intervals/bands. autoplot.predictCox to display the predictions.

Examples

Run this code

library(survival)
library(data.table)

#### generate data ####
set.seed(10)
d <- sampleData(40,outcome="survival") ## training dataset
nd <- sampleData(4,outcome="survival") ## validation dataset
d$time <- round(d$time,1) ## create tied events
# table(duplicated(d$time))

#### stratified Cox model ####
fit <- coxph(Surv(time,event)~X1 + strata(X2) + X6,
             data=d, ties="breslow", x = TRUE, y = TRUE)

## compute the baseline cumulative hazard
fit.haz <- predictCox(fit)
cbind(survival::basehaz(fit), fit.haz$cumhazard)

## compute individual specific cumulative hazard and survival probabilities 
fit.pred <- predictCox(fit, newdata=nd, times=c(3,8), se = TRUE, band = TRUE)
fit.pred

####  other examples ####
# one strata variable
fitS <- coxph(Surv(time,event)~strata(X1)+X2,
              data=d, ties="breslow", x = TRUE, y = TRUE)

predictCox(fitS)
predictCox(fitS, newdata=nd, times = 1)

# two strata variables
set.seed(1)
d$U=sample(letters[1:5],replace=TRUE,size=NROW(d))
d$V=sample(letters[4:10],replace=TRUE,size=NROW(d))
nd$U=sample(letters[1:5],replace=TRUE,size=NROW(nd))
nd$V=sample(letters[4:10],replace=TRUE,size=NROW(nd))
fit2S <- coxph(Surv(time,event)~X1+strata(U)+strata(V)+X2,
              data=d, ties="breslow", x = TRUE, y = TRUE)

cbind(survival::basehaz(fit2S),predictCox(fit2S,type="cumhazard")$cumhazard)
predictCox(fit2S)
predictCox(fitS, newdata=nd, times = 3)

# left truncation
test2 <- list(start=c(1,2,5,2,1,7,3,4,8,8), 
              stop=c(2,3,6,7,8,9,9,9,14,17), 
              event=c(1,1,1,1,1,1,1,0,0,0), 
              x=c(1,0,0,1,0,1,1,1,0,0)) 
m.cph <- coxph(Surv(start, stop, event) ~ 1, test2, x = TRUE)
as.data.table(predictCox(m.cph))

basehaz(m.cph)

Run the code above in your browser using DataLab