simmulti.msm: Simulate multiple trajectories from a multi-state Markov model with arbitrary observation times

Description

Simulate a number of individual realisations from a continuous-time Markov process. Observations of the process are made at specified arbitrary times for each individual, giving panel-observed data.

Usage

simmulti.msm(
  data,
  qmatrix,
  covariates = NULL,
  death = FALSE,
  start,
  ematrix = NULL,
  misccovariates = NULL,
  hmodel = NULL,
  hcovariates = NULL,
  censor.states = NULL,
  drop.absorb = TRUE
)

Value

A data frame with the following columns:

subject: Subject identification indicators
time: Observation times
state: Simulated (true) state at the corresponding time
obs: Observed outcome at the corresponding time, if ematrix or hmodel was supplied
keep: Row numbers of the original data. Useful when drop.absorb = TRUE, to show which rows were retained.

plus any supplied covariates.

Arguments

data

A data frame with a mandatory column named time, representing observation times. An optional column named subject gives subject identification numbers. If it is not supplied, all observations are assumed to be on the same individual. Observation times should be sorted within individuals. An optional column named cens indicates the times at which simulated states should be censored. If cens==0 then the state is not censored, and if cens==k, say, then all simulated states at that time which are in the set censor.states are replaced by k. Other named columns of the data frame represent any covariates, which may be time-constant or time-dependent. Time-dependent covariates are assumed to be constant between the observation times.
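As a minimal sketch of the expected layout, the following builds a small data frame by hand and illustrates the censoring rule described above. The subjects, visit times, age values, the censoring code 99 and the vector true_state are all made up for illustration, and the ifelse() line only mimics the documented rule in base R. It is not the package's internal code.

```r
# Two hypothetical subjects with irregular visit times and one covariate (age)
dat <- data.frame(
  subject = c(1, 1, 1, 2, 2, 2, 2),
  time    = c(0, 1.5, 4, 0, 2, 3, 7),
  age     = c(50, 51.5, 54, 62, 64, 65, 69),  # time-dependent covariate
  cens    = c(0, 0, 99, 0, 0, 0, 99)          # censor the last visit of each subject
)

# Check that observation times are sorted within each subject
sorted_ok <- all(tapply(dat$time, dat$subject, function(t) !is.unsorted(t)))
sorted_ok

# Sketch of the censoring rule: at times with cens == k, states that belong
# to censor.states are replaced by k; other states are left unchanged
true_state    <- c(1, 2, 2, 1, 1, 2, 3)   # made-up states, not simulated
censor.states <- c(1, 2)
shown <- ifelse(dat$cens > 0 & true_state %in% censor.states,
                dat$cens, true_state)
shown   # 1 2 99 1 1 2 3
```

The last visit of subject 2 keeps state 3 because state 3 is not in censor.states in this sketch.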

qmatrix

The transition intensity matrix of the Markov process, with any covariates set to zero. The diagonal of qmatrix is ignored and recomputed so that each row sums to zero. For example, a possible qmatrix for a three-state illness-death model with recovery is:

rbind( c( 0, 0.1, 0.02 ), c( 0.1, 0, 0.01 ), c( 0, 0, 0 ) )
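The treatment of the diagonal can be reproduced in base R. This is a sketch of the rule stated above (each diagonal entry is minus the sum of the off-diagonal entries in its row). The name q_fixed is chosen here for illustration.

```r
# Off-diagonal intensities for the three-state illness-death model with recovery
q <- rbind(c(0,   0.1, 0.02),
           c(0.1, 0,   0.01),
           c(0,   0,   0))

# Discard whatever is on the diagonal, then set each diagonal entry to
# minus the sum of the other entries in its row
q_fixed <- q
diag(q_fixed) <- 0
diag(q_fixed) <- -rowSums(q_fixed)

q_fixed
rowSums(q_fixed)   # every row sums to zero
```

Row 3 is all zeros, so state 3 is absorbing in this model.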

covariates

List of linear covariate effects on log transition intensities. Each element is a vector of the effects of one covariate on all the transition intensities. The intensities are ordered by reading across rows of the intensity matrix, starting with the first, counting the positive off-diagonal elements of the matrix.

For example, for a multi-state model with three transition intensities, and two covariates x and y on each intensity,

covariates=list(x = c(-0.3,-0.3,-0.3), y=c(0.1, 0.1, 0.1))
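To make the ordering of the intensities concrete, the sketch below lists the positive off-diagonal entries of a hypothetical three-state matrix by reading across rows, and applies the log-linear effects for one made-up set of covariate values (x = 1, y = 2). The matrix q, the values in z and the helper names are assumptions for illustration only.

```r
# Hypothetical three-state model with three permitted transitions
q <- rbind(c(0, 0.1, 0.02),
           c(0, 0,   0.01),
           c(0, 0,   0))
covariates <- list(x = c(-0.3, -0.3, -0.3), y = c(0.1, 0.1, 0.1))

# Positive off-diagonal entries, read across rows (row 1 first).
# Indexing the transpose in column-major order is the same as
# indexing q itself in row-major order.
baseline <- t(q)[t(q) > 0]
pos      <- which(t(q) > 0, arr.ind = TRUE)
labels   <- paste(pos[, "col"], "->", pos[, "row"])  # from -> to
names(baseline) <- labels
baseline

# Linear effects on the log intensities: q * exp(beta_x * x + beta_y * y)
beta <- do.call(rbind, covariates)   # 2 x 3 matrix, one row per covariate
z    <- c(x = 1, y = 2)              # made-up covariate values
adjusted <- baseline * exp(drop(z %*% beta))
adjusted
```

The k-th element of each covariate vector therefore applies to the k-th transition in this row-wise ordering.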

death

Vector of indices of the death states. A death state is an absorbing state whose time of entry is known exactly, but the individual is assumed to be in an unknown transient state ("alive") at the previous instant. This is the usual situation for times of death in chronic disease monitoring data. For example, if you specify death = c(4, 5) then states 4 and 5 are assumed to be death states.

death = TRUE indicates that the final state is a death state, and death = FALSE (the default) indicates that there is no death state.

start

Either a vector with one element per distinct subject in the data, giving the state in which each corresponding individual begins, or a single number if all subjects begin in the same state. Defaults to state 1 for each subject.

ematrix

An optional misclassification matrix for generating observed states conditionally on the simulated true states, specified as in msm.
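Conceptually, each observed state is drawn from the row of the misclassification matrix that corresponds to the true state. The base R sketch below illustrates this idea with a made-up matrix whose rows are written out in full, and a made-up vector of true states. It is an illustration of the concept, not the code used inside simmulti.msm.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Hypothetical misclassification probabilities: row = true state,
# column = observed state; the absorbing state 3 is never misclassified
ematrix <- rbind(c(0.9, 0.1, 0),
                 c(0.1, 0.8, 0.1),
                 c(0,   0,   1))

true_state <- c(1, 1, 2, 2, 3)   # made-up true states

# Draw one observed state per true state from the matching row
obs <- vapply(true_state,
              function(s) sample(1:3, size = 1, prob = ematrix[s, ]),
              integer(1))

data.frame(state = true_state, obs = obs)
```

In the value returned by simmulti.msm the two corresponding columns are named state and obs.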

misccovariates

Covariate effects on misclassification probabilities via multinomial logistic regression. Linear effects operate on the log of each probability relative to the probability of classification in the correct state. Specified in the same format as covariates.

hmodel

An optional hidden Markov model for generating observed outcomes conditionally on the simulated true states, specified as in msm. Multivariate outcomes (hmmMV) are not supported.

hcovariates

List of the same length as hmodel, defining any covariates governing the hidden Markov outcome models. Unlike in the msm function, this should also define the values of the covariate effects. Each element of the list is a named vector of the initial values for each set of covariates for that state. For example, for a three-state hidden Markov model with two, one and no covariates on the state 1, 2 and 3 outcome models respectively,

hcovariates = list (c(acute=-8, age=0), c(acute=-8), NULL)

censor.states

Set of simulated states which should be replaced by a censoring indicator at censoring times. By default this is all transient states (representing alive, with unknown state).

drop.absorb

Drop repeated observations in the absorbing state, retaining only one.
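The following base R sketch shows the effect for a single hypothetical subject whose simulated states end in the absorbing state 3. The vector state and the variable names are made up for illustration. With several subjects the same rule would be applied within each subject.

```r
state     <- c(1, 2, 3, 3, 3)   # one hypothetical subject; state 3 is absorbing
absorbing <- 3

# Position of the first observation in an absorbing state (NA if none)
first_abs <- match(TRUE, state %in% absorbing)

# Keep everything up to and including that observation
keep <- if (is.na(first_abs)) seq_along(state) else seq_len(first_abs)

keep          # row numbers retained
state[keep]   # states retained
```

The retained row numbers play the role of the keep column in the returned data frame.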

Author

C. H. Jackson chris.jackson@mrc-bsu.cam.ac.uk

Details

sim.msm is called repeatedly to produce a simulated trajectory for each individual. The state at each specified observation time is then taken to produce a new column state. The effect of time-dependent covariates on the transition intensity matrix for an individual is determined by assuming that the covariate is a step function which remains constant in between the individual's observation times. If the subject enters an absorbing state, then only the first observation in that state is kept in the data frame. Rows corresponding to future observations are deleted. The entry times into states given in death are assumed to be known exactly.
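The step-function assumption for time-dependent covariates can be sketched in base R. The sketch assumes that the value recorded at an observation time holds until the next observation time. The observation times, covariate values, baseline intensity q0 and effect beta are made-up numbers.

```r
# Hypothetical subject: covariate x recorded at three observation times
obs_time <- c(0, 2, 4)
x_obs    <- c(10, 20, 30)

# Value of the step function at arbitrary times within the follow-up period:
# findInterval() returns the index of the latest observation time <= t
step_value <- function(t) x_obs[findInterval(t, obs_time)]

t_query <- c(0, 1.5, 2, 3.9, 4)
step_value(t_query)   # 10 10 20 20 30

# Resulting piecewise-constant intensity for one transition,
# with a linear effect of x on the log intensity
q0   <- 0.1    # baseline intensity (covariate set to zero)
beta <- 0.05   # covariate effect
q_t  <- q0 * exp(beta * step_value(t_query))
q_t
```

The intensity only changes at the subject's own observation times, which is why each trajectory can be simulated piece by piece between them.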

Examples

### Simulate 100 individuals with common observation times
### (13 visits each, every 2 time units from 0 to 24)
sim.df <- data.frame(subject = rep(1:100, each = 13), time = rep(seq(0, 24, 2), 100))
### The diagonal entries are ignored and recomputed so that rows sum to zero
qmatrix <- rbind(c(-0.11,   0.1,  0.01 ),
                 c(0.05,   -0.15,  0.1 ),
                 c(0.02,   0.07, -0.09))
simmulti.msm(sim.df, qmatrix)
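A possible way to inspect the result is sketched below. It assumes the msm package is installed, stores the simulated data in an object named sim (a name chosen here), and uses an arbitrary seed. The call to statetable.msm, also from msm, tabulates the observed transitions between successive observation times.

```r
library(msm)

set.seed(42)  # arbitrary seed, for reproducibility only

# Same design as the example above: 100 subjects observed every 2 time units
sim.df <- data.frame(subject = rep(1:100, each = 13),
                     time = rep(seq(0, 24, 2), 100))
qmatrix <- rbind(c(-0.11,  0.1,   0.01),
                 c( 0.05, -0.15,  0.1),
                 c( 0.02,  0.07, -0.09))

sim <- simmulti.msm(sim.df, qmatrix)

head(sim)                                    # first few simulated rows
statetable.msm(state, subject, data = sim)   # counts of observed transitions
```

No state is absorbing in this qmatrix, so no rows should be dropped and sim should have as many rows as sim.df.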
