rocJM: Predictive Accuracy Measures for Longitudinal Markers under a Joint Modelling Framework

Description

It computes sensitivity, specificity, ROC and AUC measures for joint models.

Usage

rocJM(object, dt, data, idVar = "id", cc = NULL, min.cc = NULL,
  max.cc = NULL, diffType = c("absolute", "relative"), 
  abs.diff = 0, rel.diff = 1, M = 300, burn.in = 100, scale = 1.6)

Arguments

object

an object inheriting from class jointModel.

dt

a numeric vector indicating the lengths of the time intervals of primary interest, within which we want to distinguish subjects who died within the interval from subjects who survived longer than that.

data

a data frame that contains the baseline covariates for the longitudinal and survival submodels, including a case identifier (i.e., the variable denoted by the argument idVar), the time points on which longitudinal measuremen

idVar

the name of the variable in data that identifies the different generic subjects to be considered.

cc

a numeric vector of threshold values for the longitudinal marker; if NULL, this is computed using a regular sequence based on percentiles of the observed marker values.

min.cc

the start of the regular sequence for the threshold values for the longitudinal marker; see argument cc above.

max.cc

the end of the regular sequence for the threshold values for the longitudinal marker; see argument cc above.

diffType

character string defining the type of prediction rule. See Details.

abs.diff

a numeric vector of absolute differences in the definition of composite prediction rules.

rel.diff

a numeric vector of relative differences in the definition of composite prediction rules.

M

a numeric scalar denoting the number of Monte Carlo samples.

burn.in

a numeric scalar denoting the number of initial iterations to discard as burn-in.

scale

a numeric scalar that controls the acceptance rate of the Metropolis-Hastings algorithm. See Details.

Value

An object of class rocJM, which is a list with the following components:

MCresults

a list whose length equals the number of distinct cases in data. Each component of this list is itself a list with four components: the estimated sensitivity Sens and its standard error seSens, and the estimated specificity Spec and its standard error seSpec. All four components are matrices with rows corresponding to the different dt values and columns corresponding to the different cc values.

AUCs

a numeric vector of estimated areas under the ROC curves for the different values of dt.

optThr

a numeric vector or matrix with the optimal threshold values for the marker for the different dt values. These are defined as the values that maximize the product of sensitivity and specificity (a simple but not always optimal rule).

times

a list whose length equals the number of distinct cases in data; its components are numeric vectors of the time points at which longitudinal measurements are assumed to be taken.

dt

a copy of the dt argument.

M

a copy of the M argument.

diffType

a copy of the diffType argument.

abs.diff

a copy of the abs.diff argument.

rel.diff

a copy of the rel.diff argument.

cc

a copy of the cc argument.

min.cc

a copy of the min.cc argument.

max.cc

a copy of the max.cc argument.

success.rate

a numeric matrix with the success rates of the Metropolis-Hastings algorithm used in the Monte Carlo scheme (see Details).
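
The rule behind optThr can be illustrated with a minimal base R sketch. All numbers below are made up for illustration and are not output of rocJM(). The trapezoidal AUC at the end is an assumption about one common way to estimate the area under a ROC curve; this page does not state how AUCs is actually computed.

```r
# Hypothetical sensitivity/specificity values for a single dt over five
# thresholds (made-up numbers, not produced by rocJM()).
cc   <- c(10, 12, 14, 16, 18)
sens <- c(0.20, 0.45, 0.70, 0.85, 0.95)
spec <- c(0.98, 0.90, 0.80, 0.55, 0.30)

# Rule described for optThr: take the threshold that maximizes Sens * Spec.
prod_ss <- sens * spec
opt_thr <- cc[which.max(prod_ss)]

# Assumed AUC estimate: trapezoidal rule over the ROC points
# (false positive rate = 1 - Spec), anchored at (0, 0) and (1, 1).
fpr <- c(0, 1 - spec, 1)
tpr <- c(0, sens, 1)
ord <- order(fpr, tpr)
fpr <- fpr[ord]
tpr <- tpr[ord]
auc <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)

opt_thr
auc
```

With a fitted object, the same inspection would start from the Sens and Spec matrices stored in MCresults, taking the row for the dt value of interest.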

Details

(Note: the following contains some math formulas, which are better viewed in the pdf version of the manual accessible at http://cran.r-project.org/package=JM.)

Assume that we have collected longitudinal measurements $Y_i(t) = \{y_i(s); 0 \leq s \leq t\}$ up to time point $t$ for subject $i$. We are interested in events occurring in the medically relevant time frame $(t, t + \Delta t]$ within which the physician can take an action to improve the survival chance of the patient. Using an appropriate function of the marker history $Y_i(t)$, we can define a prediction rule to discriminate between patients of high and low risk for an event. For instance, for HIV-infected patients, we could consider values of CD4 cell count smaller than a specific threshold as predictive for death.

Since we are in a longitudinal context, we have the flexibility of determining which values of the longitudinal history of the patient will contribute to the specification of the prediction rule. That is, we could define a composite prediction rule that is based not only on the last available measurement but on the last two or last three measurements of a patient. Furthermore, it could be of relevance to consider different threshold values for each of these measurements. For instance, we could define as success the event that the pre-last CD4 cell count is at most $c$ and the last one at most $0.5c$, indicating that a 50% decrease is strongly indicative for death.

Under this setting we define sensitivity and specificity as $$\Pr \bigl\{ {\cal S}_i(t, k, c) \mid T_i^* > t, T_i^* \in (t, t + \Delta t] \bigr\},$$ and $$\Pr \bigl\{ {\cal F}_i(t, k, c) \mid T_i^* > t, T_i^* > t + \Delta t \bigr\},$$ respectively, where we term ${\cal S}_i(t, k, c) = \{y_i(s) \leq c_s; k \leq s \leq t\}$ a success (i.e., occurrence of the event of interest) and ${\cal F}_i(t, k, c) = \{y_i(s) > c_s; k \leq s \leq t\}$ a failure, $T_i^*$ denotes the time-to-event, and $\Delta t$ the length of the medically relevant time window (specified by argument dt). The cut values for the marker $c$ are specified by the cc, min.cc and max.cc arguments.

Two types of composite prediction rules can be defined depending on the value of the diffType argument: absolute prediction rules, in which there is an absolute difference between the cut values for successive measurements, and relative prediction rules, in which there is a relative difference between the cut values for successive measurements of the marker. The lag values for these differences are defined by the abs.diff and rel.diff arguments.

The estimation of the above defined probabilities is achieved with a Monte Carlo scheme similar to the one described in survfitJM. The number of Monte Carlo samples is defined by the M argument, and the number of burn-in iterations for the Metropolis-Hastings algorithm by the burn.in argument. More details can be found in Rizopoulos (2010a).
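
To make the composite prediction rules more concrete, here is a minimal base R sketch. The helper composite_success() is hypothetical and not part of JM. It assumes that the cut value for each lagged measurement is c + diff under an absolute rule and c * diff under a relative rule, which agrees with the defaults abs.diff = 0 and rel.diff = 1 leaving the cut value unchanged, but is not confirmed against the package source. The marker values are made up.

```r
# Hypothetical helper (not part of JM): evaluates a composite prediction
# rule on the last length(diffs) measurements of one subject.
composite_success <- function(y, c0, diffs, type = c("absolute", "relative")) {
  type <- match.arg(type)
  # Assumption: per-lag cut values are c0 + diffs (absolute rule) or
  # c0 * diffs (relative rule); diffs of 0 or 1 respectively leave c0 as is.
  cuts <- if (type == "absolute") c0 + diffs else c0 * diffs
  last <- tail(y, length(diffs))
  # Success: every considered measurement is at or below its cut value.
  length(last) == length(diffs) && all(last <= cuts)
}

cd4 <- c(14, 11, 9, 4)  # made-up marker history, oldest measurement first

# Simple rule on the last measurement only: 4 <= 10
r1 <- composite_success(cd4, c0 = 10, diffs = 0)
# Pre-last at most c and last at most 0.5c: 9 <= 10 and 4 <= 5
r2 <- composite_success(cd4, c0 = 10, diffs = c(1, 0.5), type = "relative")
# Same rule with a lower c: 9 <= 8 does not hold
r3 <- composite_success(cd4, c0 = 8, diffs = c(1, 0.5), type = "relative")
c(r1, r2, r3)
```

Under these assumptions, the second call mirrors the 50% decrease rule described above.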

References

Heagerty, P. and Zheng, Y. (2005). Survival model predictive accuracy and ROC curves. Biometrics 61, 92--105.

Rizopoulos, D. (2010a). Dynamic predictions and prospective accuracy in joint models for longitudinal and time-to-event data. Biometrics, accepted.

Rizopoulos, D. (2010b). JM: An R package for the joint modelling of longitudinal and time-to-event data. Journal of Statistical Software 35 (9), 1--33. http://www.jstatsoft.org/v35/i09/

Zheng, Y. and Heagerty, P. (2007). Prospective accuracy for longitudinal markers. Biometrics 63, 332--341.

Examples

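# JM depends on nlme and survival, which provide lme(), coxph() and Surv()
library(JM)

# linear mixed model for the square root of the CD4 cell count
fitLME <- lme(sqrt(CD4) ~ obstime * (drug + AZT + prevOI + gender), 
    random = ~ obstime | patient, data = aids)
# Cox model for time to death; x = TRUE keeps the design matrix in the fit
fitSURV <- coxph(Surv(Time, death) ~ drug + AZT + prevOI + gender, 
    data = aids.id, x = TRUE)
# joint model linking the two submodels
fit.aids <- jointModel(fitLME, fitSURV, timeVar = "obstime", 
    method = "piecewise-PH-GH")

# the following will take some time to execute...
# covariate profile and measurement times taken from patient 7
ND <- aids[aids$patient == "7", ]
roc <- rocJM(fit.aids, dt = c(2, 4, 8), ND, idVar = "patient")
roc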