smcfcs.finegray: Substantive model compatible fully conditional specification imputation of covariates for a Fine-Gray model

Description

Multiply imputes missing covariate values using substantive model compatible fully conditional specification for competing risks outcomes, when the substantive model is a Fine-Gray model for the subdistribution hazard of one event.

Usage

smcfcs.finegray(
  originaldata,
  smformula,
  method,
  cause = 1,
  m = 5,
  numit = 10,
  rjlimit = 5000,
  kmi_args = list(formula = ~1, bootstrap = FALSE, nboot = 10),
  ...
)

Value

An object of type "smcfcs", as would usually be returned from smcfcs.
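
As a rough sketch of what the returned object contains, the snippet below runs the call from the Examples section and inspects the result. It assumes the object carries the `impDatasets` and `smCoefIter` components that `smcfcs` documents for its own return value. The `smCoefIter` layout noted in the comment is taken from that documentation and has not been checked against this function specifically. The guard makes the block a no-op when the packages are not installed.

```r
# Sketch only: does nothing unless the required packages are installed.
if (requireNamespace("smcfcs", quietly = TRUE) &&
    requireNamespace("kmi", quietly = TRUE)) {
  library(smcfcs)
  library(survival)
  library(kmi)

  set.seed(2024)
  imps <- smcfcs.finegray(
    originaldata = ex_finegray,
    smformula = "Surv(times, d) ~ x1 + x2",
    method = c("", "", "logreg", "norm"),
    cause = 1,
    m = 5,
    numit = 10
  )

  # One completed data frame per imputation
  print(length(imps$impDatasets))
  print(head(imps$impDatasets[[1]]))

  # Assumed layout (per the smcfcs documentation):
  # imputation x parameter x iteration
  print(dim(imps$smCoefIter))
}
```

If the parameter traces in `smCoefIter` still drift over the later iterations, that would suggest increasing `numit`.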

Arguments

originaldata: The original data frame with missing values.
smformula: The formula of the substantive model, given as a string. Needs to be of the form "Surv(t, d) ~ x1 + x2", where t is a vector of competing event times, and d is a (numeric) competing event indicator, where 0 must designate a censored observation.
method: A required vector of strings specifying, for each variable, either that it does not need to be imputed ("") or the type of regression model to be used to impute it. Possible values are "norm" (normal linear regression), "logreg" (logistic regression), "brlogreg" (bias reduced logistic regression), "poisson" (Poisson regression), "podds" (proportional odds regression for ordered categorical variables), "mlogit" (multinomial logistic regression for unordered categorical variables), or a custom expression which defines a passively imputed variable, e.g. "x^2" or "x1*x2". "latnorm" indicates the variable is a latent normal variable which is measured with error. If this is specified for a variable, the "errorProneMatrix" argument should also be used.
cause: Numeric, designating the competing event of interest (default is `cause = 1`).
m: The number of imputed datasets to generate. The default is 5.
numit: The number of iterations to run when generating each imputation. In a (limited) range of simulations good performance was obtained with the default of 10 iterations. However, particularly when the proportion of missingness is large, more iterations may be required for convergence to stationarity.
rjlimit: Specifies the maximum number of attempts which should be made when using rejection sampling to draw from imputation models. If the limit is reached during a run, a warning will be issued. In this case it is probably advisable to increase rjlimit until the warning no longer appears.
kmi_args: List containing arguments to be passed on to kmi. The "formula" element is a formula whose right-hand side specifies the covariates used for multiply imputing the potential censoring times for individuals failing from competing events. The default is `formula = ~ 1`, which uses the marginal Kaplan-Meier estimator of the censoring distribution.
...: Additional arguments to pass on to smcfcs.
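
The `method` argument needs one entry per column of `originaldata`, in column order. A minimal sketch of building it by column name is shown here. The `toy` data frame is hypothetical and only mirrors the column layout used in the Examples section.

```r
# Hypothetical toy data with the same column layout as the Examples data
toy <- data.frame(
  times = c(1.2, 0.7, 3.1, 2.4, 0.9),
  d     = c(1, 0, 2, 1, 0),            # 0 = censored, 1 and 2 = competing events
  x1    = c(0, NA, 1, 1, NA),          # binary, partially missing
  x2    = c(0.3, -1.1, NA, 0.8, 0.1)   # continuous, partially missing
)

# Start with "" (do not impute) for every column, then fill in by name
method <- rep("", ncol(toy))
method[names(toy) == "x1"] <- "logreg"
method[names(toy) == "x2"] <- "norm"

# Sanity check: every column with missing values has an imputation model
has_na <- vapply(toy, anyNA, logical(1))
stopifnot(all(method[has_na] != ""))

print(method)
```

Filling the vector by name rather than by position keeps the specification correct if the columns of the data frame are later reordered.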

Author

Edouard F. Bonneville e.f.bonneville@lumc.nl

Details

In the presence of random right censoring, the function first multiply imputes the potential censoring times for those failing from competing events using kmi, and thereafter uses smcfcs to impute the missing covariates. See Bonneville et al. 2024 for further details on the methodology.
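
To illustrate what the first step produces, the base R sketch below builds a subdistribution time and event indicator for a small hypothetical dataset. The exponential censoring distribution is an assumption made purely for illustration: kmi instead draws the potential censoring times from an estimate of the censoring distribution. The column names `newtimes` and `newevent` follow those used in the Examples section.

```r
set.seed(1)

# Hypothetical observed competing risks data
# (0 = censored, 1 = event of interest, 2 = competing event)
dat <- data.frame(
  times = c(2.0, 1.5, 0.8, 3.2, 1.1),
  d     = c(1, 2, 0, 2, 1)
)
cause <- 1

# Individuals failing from a competing event remain in the risk set for the
# subdistribution hazard until their (unobserved) potential censoring time.
competing <- dat$d != 0 & dat$d != cause

# ASSUMPTION for illustration only: censoring is exponential with rate 0.2,
# so a censoring time known to exceed `times` can be drawn as
# times + Exp(0.2) by memorylessness. This is not what kmi does internally.
dat$newtimes <- dat$times
dat$newtimes[competing] <- dat$times[competing] +
  rexp(sum(competing), rate = 0.2)

# Indicator for the cause of interest; all other rows are treated as censored
dat$newevent <- as.numeric(dat$d == cause)

print(dat)
```

Once such a completed dataset is available, the Fine-Gray model can be fitted as a Cox model on `Surv(newtimes, newevent)`, which is the form the second step uses when imputing the covariates.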

The function does not (yet) support parallel computation.

References

Bonneville EF, Beyersmann J, Keogh RH, Bartlett JW, Morris TP, Polverelli N, de Wreede LC, Putter H. Multiple imputation of missing covariates when using the Fine-Gray model. 2024. Submitted.

Examples

if (FALSE) {
library(smcfcs)
library(survival)
library(kmi)

imps <- smcfcs.finegray(
  originaldata = ex_finegray,
  smformula = "Surv(times, d) ~ x1 + x2",
  method = c("", "", "logreg", "norm"),
  cause = 1,
  kmi_args = list("formula" = ~ 1)
)

if (requireNamespace("mitools", quietly = TRUE)) {
  library(mitools)
  impobj <- imputationList(imps$impDatasets)
  # Important: use Surv(newtimes, newevent) ~ ... when pooling
  # (respectively: subdistribution time and indicator for cause of interest)
  models <- with(impobj, coxph(Surv(newtimes, newevent) ~ x1 + x2))
  summary(MIcombine(models))
}
}
