msmtools (version 2.0.0)

survplot: Plot and get survival data from a multi-state model

Description

Plot the fitted survival probability computed from an msm model and, optionally, compare it with the Kaplan-Meier estimate. The underlying data structures can also be built quickly and returned.

Usage

survplot(
  x,
  from = 1,
  to = NULL,
  range = NULL,
  covariates = "mean",
  exacttimes = TRUE,
  times,
  grid = 100L,
  km = FALSE,
  out = c("none", "fitted", "km", "all"),
  ci = c("none", "normal", "bootstrap"),
  interp = c("start", "midpoint"),
  B = 100L,
  ci_km = c("none", "plain", "log", "log-log", "logit", "arcsin")
)

Arguments

x

An msm object.

from

State from which to compute the estimated survival. Defaults to state 1.

to

The absorbing state to which the estimated survival is computed. Defaults to the highest state found by absorbing.msm.

range

A numeric vector of two elements which gives the time range of the plot.

covariates

Covariate values for which to evaluate the expected probabilities. These can be: the string "mean", denoting the means of the covariates in the data (default); the number 0, indicating that all covariates should be set to zero; or a list of values, with optional names. For example, list(75, 1), where the order of the list follows the order of the covariates originally given in the model formula, or a named list such as list(age = 75, gender = "M").

exacttimes

If TRUE (default) then transition times are known and exact. This is inherited from msm and should be set the same way.

times

An optional numeric vector giving the times at which to compute the fitted survival.

grid

An integer specifying the number of grid points at which to compute the fitted survival (see 'Details'). If times is passed, grid is ignored. Defaults to 100 points.

km

If TRUE, then the Kaplan-Meier curve is plotted. Default is FALSE.

out

A character vector specifying what the function has to return. Accepted values are "none" (default) to return just the plot, "fitted" to return the fitted survival curve only, "km" to return the Kaplan-Meier only, "all" to return all of the above.

ci

A character vector with the type of confidence intervals to compute for the fitted survival curve. Specify either "none" (default), for no confidence intervals, or "normal" or "bootstrap", for confidence intervals computed with the respective method in pmatrix.msm. Computing these intervals is very computationally intensive, since they must be computed at a series of times.

interp

If "start" (default), then the entry time into the absorbing state is assumed to be the time it is first observed in the data. If "midpoint", then the entry time into the absorbing state is assumed to be halfway between the time it is first observed and the previous observation time. This is generally more reasonable for "progressive" models with observations at arbitrary times.

B

Number of bootstrap or normal replicates for the confidence interval. The default is 100 rather than the usual 1000, since these plots are for rough diagnostic purposes.

ci_km

A character vector with the type of confidence intervals to compute for the Kaplan-Meier curve. Specify either "none", "plain", "log", "log-log", "logit", or "arcsin", as coded in survfit.
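
The three forms accepted by covariates can be illustrated with a minimal sketch. It assumes that msmtools is installed and that an object named msm_model has already been fitted with covariates = ~ gender + age, as in the Examples section; the object names p_mean, p_zero and p_named are hypothetical, and the value "M" assumes that gender is coded with that level in the data.

```r
# Sketch only: assumes `msm_model` exists (fitted with covariates = ~ gender + age,
# as in Examples) and that msmtools is installed. Nothing runs otherwise.
if (requireNamespace("msmtools", quietly = TRUE) && exists("msm_model")) {
  # Covariates fixed at their means in the data (the default).
  p_mean <- msmtools::survplot(msm_model, covariates = "mean")

  # All covariates set to zero.
  p_zero <- msmtools::survplot(msm_model, covariates = 0)

  # Named list; assumes gender has a level "M" in the data.
  p_named <- msmtools::survplot(msm_model,
                                covariates = list(age = 75, gender = "M"))
}
```

A named list is usually the safer choice, since it does not depend on remembering the order of the covariates in the model formula.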

Value

When out = "none", a gg/ggplot object is returned. If out is anything else, then a named list is returned. The Kaplan-Meier data can be accessed with $km while the estimated survival data with $fitted. If out = "all", the plot, the Kaplan-Meier and the estimated curve are returned.
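
A minimal sketch of working with the returned list, assuming msmtools is installed and that msm_model is the model fitted in the Examples section. Only $km and $fitted are documented here, so the sketch inspects the remaining element names with names() rather than assuming them; res is a hypothetical object name.

```r
# Sketch only: assumes `msm_model` exists (see Examples) and msmtools is installed.
if (requireNamespace("msmtools", quietly = TRUE) && exists("msm_model")) {
  res <- msmtools::survplot(msm_model, km = TRUE, out = "all")

  print(names(res))        # inspect which elements were returned
  print(head(res$fitted))  # fitted survival data
  print(head(res$km))      # Kaplan-Meier data
}
```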

Details

The function is a wrapper of plot.survfit.msm that extends it. survplot correctly handles the plot of a fitted survival in an exact-times framework (when exacttimes = TRUE) by resetting the time scale and looking at the follow-up time. By specifying out = "all", it can quickly build and return the data structures used to compute the Kaplan-Meier and the fitted survival probability.

The user can define custom times (through times) or let survplot choose them on its own (through grid). In the latter case, survplot looks for the follow-up time and divides it by grid. The higher grid is, the finer the resulting set of time points: computing the fitted survival takes longer but is more precise.
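
The idea of dividing the follow-up time by grid can be sketched in base R. This is an illustration under stated assumptions, not necessarily how survplot builds its time points internally: follow_up, step and times_from_grid are hypothetical names, and the follow-up window of 0 to 50 time units is made up.

```r
# Hypothetical follow-up window, in the model's time units (illustrative values).
follow_up <- c(0, 50)
grid <- 100L

# Split the follow-up time into `grid` equal steps.
step <- diff(follow_up) / grid
times_from_grid <- seq(follow_up[1], follow_up[2], by = step)

# Both endpoints are included, so there are grid + 1 time points.
length(times_from_grid)
```

Doubling grid halves step, which doubles the number of times at which the fitted survival has to be evaluated. When only a few specific times matter, passing them directly through times avoids that cost.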

References

Titman, A. and Sharples, L.D. (2010). Model diagnostics for multi-state models, Statistical Methods in Medical Research, 19, 621-651.

Titman, A. and Sharples, L.D. (2008). A general goodness-of-fit test for Markov and hidden Markov models, Statistics in Medicine, 27, 2177-2195.

Jackson, C.H. (2011). Multi-State Models for Panel Data: The msm Package for R. Journal of Statistical Software, 38(8), 1-29. URL https://www.jstatsoft.org/v38/i08/.

See Also

plot.survfit.msm, msm, pmatrix.msm, setDF

Examples

# NOT RUN {
data( hosp )

# augmenting the data
hosp_augmented = augment( data = hosp, data_key = subj, n_events = adm_number,
                          pattern = label_3, t_start = dateIN, t_end = dateOUT,
                          t_cens = dateCENS )

# let's define the initial transition matrix for our model
Qmat = matrix( data = 0, nrow = 3, ncol = 3, byrow = TRUE )
Qmat[ 1, 1:3 ] = 1
Qmat[ 2, 1:3 ] = 1
colnames( Qmat ) = c( 'IN', 'OUT', 'DEAD' )
rownames( Qmat ) = c( 'IN', 'OUT', 'DEAD' )

# attaching the msm package and running the model using
# gender and age as covariates
library( msm )
msm_model = msm( status_num ~ augmented_int, subject = subj,
                 data = hosp_augmented, covariates = ~ gender + age,
                 exacttimes = TRUE, gen.inits = TRUE, qmatrix = Qmat,
                 method = 'BFGS', control = list( fnscale = 6e+05, trace = 0,
                 REPORT = 1, maxit = 10000 ) )

# plotting the fitted and empirical survival from state = 1
theplot = survplot( x = msm_model, km = TRUE )

# plotting the fitted and empirical survival from state = 2 and
# returning both the fitted and the empirical curve
out_all = survplot( msm_model, from = 2, km = TRUE, out = "all" )

# }