
smcfcs (version 2.0.0)

smcfcs.dtsam: Substantive model compatible fully conditional specification imputation of covariates for discrete time survival analysis

Description

Multiply imputes missing covariate values using substantive model compatible fully conditional specification for discrete time survival analysis.

Usage

smcfcs.dtsam(originaldata, smformula, method, timeEffects = "factor", ...)

Arguments

originaldata

The data in wide form (i.e. one row per subject)

smformula

A formula of the form "Surv(t,d)~x1+x2+x3", where t is the discrete time variable, d is the binary event indicator, and the covariates should not include time. The time variable should be an integer coded numeric variable taking values from 1 up to the final time period.
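The coding requirement for the time variable can be checked before calling the function. The following is a minimal sketch in base R; `check_discrete_time` and the toy data are hypothetical and not part of smcfcs.

```r
# Hypothetical helper (not part of smcfcs): checks that a discrete time
# variable is numeric, has no missing values, and is integer coded from 1.
check_discrete_time <- function(t) {
  stopifnot(is.numeric(t), !anyNA(t))
  is_whole <- all(t == round(t))  # integer coded
  in_range <- all(t >= 1)         # periods start at 1
  is_whole && in_range
}

# Toy data: failtime is the discrete time, d the event indicator
toy <- data.frame(failtime = c(1, 3, 2, 5, 5), d = c(1, 0, 1, 1, 0))
check_discrete_time(toy$failtime)  # TRUE

# Period labels such as calendar years can be shifted to start at 1
years <- c(2001, 2003, 2002, 2005, 2005)
recoded <- years - min(years) + 1
```

Here `recoded` would be used in place of the original labels as the time variable in smformula.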

method

A required vector of strings specifying, for each variable, either that it does not need to be imputed ("") or the type of regression model to be used to impute it. Possible values are "norm" (normal linear regression), "logreg" (logistic regression), "brlogreg" (bias reduced logistic regression), "poisson" (Poisson regression), "podds" (proportional odds regression for ordered categorical variables), "mlogit" (multinomial logistic regression for unordered categorical variables), or a custom expression which defines a passively imputed variable, e.g. "x^2" or "x1*x2". "latnorm" indicates the variable is a latent normal variable which is measured with error. If this is specified for a variable, the "errorProneMatrix" argument should also be used.
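One way to build the method vector without miscounting positions is to fill it in by column name. This sketch assumes the vector has one entry per column of originaldata, in column order, as in the example at the end of this page; the toy data frame and its column names are illustrative only.

```r
# Toy data frame standing in for originaldata (illustrative columns)
toy <- data.frame(
  x1 = c(1, NA, 0, 1),         # binary covariate with missing values
  x2 = c(0.3, -1.2, NA, 0.8),  # continuous covariate with missing values
  failtime = c(2, 1, 3, 3),
  d = c(1, 0, 1, 0)
)

# Start with "" (do not impute) for every column, then fill in by name
method <- setNames(rep("", ncol(toy)), names(toy))
method["x1"] <- "logreg"
method["x2"] <- "norm"

# Sanity check: every column with missing values has an imputation method
has_missing <- vapply(toy, anyNA, logical(1))
stopifnot(all(method[has_missing] != ""))

# Plain character vector, one entry per column
unname(method)
```

The result of `unname(method)` could then be supplied as the method argument.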

timeEffects

Specifies how the effect of time is modelled. timeEffects="factor" (the default) models time as a factor variable. timeEffects="linear" and timeEffects="quad" specify that time be modelled as a continuous linear or quadratic effect on the log odds scale respectively.
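As a sketch of what the three options mean, the discrete time hazard can be written on the log odds scale as below. The notation (h, alpha, beta) is assumed here for illustration and is not taken from the package documentation.

```latex
% h(t | x) = P(T = t | T >= t, x) is the discrete time hazard
% timeEffects = "factor": a separate intercept for each time period
\operatorname{logit} h(t \mid x) = \alpha_t + \beta^\top x
% timeEffects = "linear"
\operatorname{logit} h(t \mid x) = \alpha_0 + \alpha_1 t + \beta^\top x
% timeEffects = "quad"
\operatorname{logit} h(t \mid x) = \alpha_0 + \alpha_1 t + \alpha_2 t^2 + \beta^\top x
```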

...

Additional arguments to pass on to smcfcs

Author

Jonathan Bartlett jonathan.bartlett1@lshtm.ac.uk

Details

For this substantive model type, as for the other substantive model types, smcfcs expects originaldata to have one row per subject. The variables giving the discrete time of failure/censoring and the event indicator should be passed in smformula, as described above.

The default is to model the effect of time as a factor. This will not work in datasets where there is not at least one observed event in each time period. In such cases you must specify a simpler parametric model for the effect of time. At the moment you can specify either a linear or quadratic effect of time (on the log odds scale).
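Whether the default will work can be checked by counting observed events in each time period. The following base R sketch uses toy data; the variable names and the choice to fall back to "linear" rather than "quad" are illustrative assumptions.

```r
# Toy wide-form data (illustrative): period 3 has no observed events
toy <- data.frame(
  failtime = c(1, 1, 2, 3, 3, 4, 4),
  d        = c(1, 0, 1, 0, 0, 1, 1)
)

# Count observed events in every period from 1 to the final period
periods <- seq_len(max(toy$failtime))
events_per_period <- table(factor(toy$failtime[toy$d == 1], levels = periods))

# Fall back to a parametric time effect if any period has no events
time_effects <- if (all(events_per_period > 0)) "factor" else "linear"
time_effects
```

The resulting string could be passed as `timeEffects = time_effects` in the call to smcfcs.dtsam.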

Examples

#the following example is not run when the package is compiled on CRAN
#(to keep computation time down), but it can be run by package users
if (FALSE) {
  #discrete time survival analysis example
  library(smcfcs)
  library(survival) #provides Surv() and survSplit(), used below
  M <- 5
  imps <- smcfcs.dtsam(ex_dtsam, "Surv(failtime,d)~x1+x2",
                 method=c("logreg","", "", ""),m=M)
  #fit dtsam model to each dataset manually, since we need
  #to expand to person-period data form first
  ests <- vector(mode = "list", length = M)
  vars <- vector(mode = "list", length = M)
  for (i in 1:M) {
    longData <- survSplit(Surv(failtime,d)~x1+x2, data=imps$impDatasets[[i]],
                          cut=unique(ex_dtsam$failtime[ex_dtsam$d==1]))
    mod <- glm(d~-1+factor(tstart)+x1+x2, family="binomial", data=longData)
    ests[[i]] <- coef(mod)
    vars[[i]] <- diag(vcov(mod))
  }
  library(mitools)
  summary(MIcombine(ests,vars))
}
