ci.Crisk: Compute cumulative risks and expected sojourn times from models for cause-specific rates.

Description

Consider a list of parametric models for rates of competing events, such as different causes of death, A, B, C, say. From estimates of the cause-specific rates we can compute 1) the cumulative risk of being in each state ('Surv' (=no event) and A, B and C) at different times, 2) the stacked cumulative rates such as A, A+C, A+C+Surv and 3) the expected (truncated) sojourn times in each state up to each time point.

This can be done by simple numerical integration using estimates from models for the cause specific rates. But the standard errors of the results are analytically intractable.

The function ci.Crisk computes estimates with confidence intervals using simulated samples from the parameter vectors of supplied model objects. Some call this a parametric bootstrap.

The times and other covariates determining the cause-specific rates must be supplied in a data frame which will be used for predicting rates for all transitions.

Usage

ci.Crisk(mods,
           nd,
         tnam = names(nd)[1],
           nB = 1000,
         perm = length(mods):0 + 1,
        alpha = 0.05, 
      sim.res = 'none')

Value

If sim.res='none' a named list with 4 components, the first 3 are 3-way arrays classified by time, state and estimate/confidence interval:

Crisk Cumulative risks for the length(mods) events and the survival
Srisk Stacked versions of the cumulative risks
Stime Sojourn times in each states
time Endpoints of intervals. It is just the numerical version of the names of the first dimension of the three arrays

All three arrays have (almost) the same dimensions:

time, named as tnam; endpoints of intervals. Length nrow(nd).
cause. The arrays Crisk and Stime have values "Surv" plus the names of the list mods (first argument). Srisk has length length(mod), with each level representing a cumulative sum of cumulative risks, in order indicated by the perm argument.
Unnamed, ci.50%, ci.2.5%, ci.97.5% representing quantiles of the quantities derived from the bootstrap samples. If alpha is different from 0.05, names are of course different.

If sim.res='rates' the function returns bootstrap samples of rates for each cause as an array classified by time, cause and bootstrap sample.

If sim.res='crisk' the function returns bootstrap samples of cumulative risks for each cause (including survival) as an array classified by time, state (= causes + surv) and bootstrap sample.

Arguments

mods: A named list of glm/gam model objects representing the cause-specific rates. If the list is not named the function will crash. The names will be used as names for the states (competing risks), while the state without any event will be called "Surv".
nd: A data frame of prediction points and covariates to be used on all models supplied in mods.
tnam: Name of the column in nd which is the time scale.It must represent endpoints of equidistant intervals.
nB: Scalar. The number of simulations from the (posterior) distribution of the model parameters to be used in computing confidence limits.
perm: Numerical vector of length length(mods)+1 indicating the order in which states are to be stacked. The 'Surv' state is taken to be the first, the remaining in the reverse order supplied in the mods argument. The default is therefore to stack with the survival as the first, which may not be what you normally want.
alpha: numeric. 1 minus the confidence level used in calculating the c.i.s
sim.res: Character. What simulation samples should be returned. If 'none' (the default) the function returns a list of 3 arrays (see under 'value'). If 'rates' it returns an array of dimension nrow(nd) x length(mod) x nB of bootstrap samples of the rates. If 'crisk' it returns an array of dimension nrow(nd) x length(mod)+1 x nB of bootstrap samples of the cumulative rates. Only the first letter matters, regardless of whether it is in upper lower case.

Author

Bendix Carstensen, http://bendixcarstensen.com

Examples

Run this code

library(Epi)
data(DMlate)

# A Lexis object for survival
Ldm <- Lexis(entry = list( per = dodm,
                           age = dodm-dobth, 
                           tfd = 0 ),
              exit = list( per = dox ),
       exit.status = factor( !is.na(dodth), labels = c("DM","Dead") ),
              data = DMlate[sample(1:nrow(DMlate),1000),] )
summary(Ldm, timeScales = TRUE)

# Cut at OAD and Ins times
Mdm <- mcutLexis(Ldm,
                  wh = c('dooad','doins'),
          new.states = c('OAD','Ins'),
          seq.states = FALSE,
                ties = TRUE)
summary(Mdm$lex.dur)

# restrict to DM state and split
Sdm <- splitLexis(factorize(subset(Mdm, lex.Cst == "DM")),
                  time.scale = "tfd",
                  breaks = seq(0,20,1/12))
summary(Sdm)
summary(Relevel(Sdm, c(1, 4, 2, 3)))

boxes(Relevel(Sdm, c(1, 4, 2, 3)), 
      boxpos = list(x = c(15, 85, 80, 15),
                    y = c(85, 85, 20, 15)),
      scale.R = 100)

# glm models for the cause-specific rates
system.time(
mD <- glm.Lexis(Sdm, ~ Ns(tfd, knots=0:6*2), to = 'Dead') )
system.time(
mO <- glm.Lexis(Sdm, ~ Ns(tfd, knots=0:6*2), to = 'OAD' ) )
system.time(
mI <- glm.Lexis(Sdm, ~ Ns(tfd, knots=0:6*2), to = 'Ins' ) )

# intervals for calculation of predicted rates
int <- 1 / 100
nd <- data.frame(tfd = seq(0, 10, int)) # not the same as the split, 
                                        # and totally unrelated to it

# cumulaive risks with confidence intervals
# (too few timepoints, too few simluations)
system.time(
res <- ci.Crisk(list(OAD = mO, 
                     Ins = mI, 
                    Dead = mD),
                            nd = data.frame(tfd = 0:100 / 10),
                            nB = 100,
                          perm = 4:1))
str(res)