rocJM: Predictive Accuracy Measures for Longitudinal Markers under a Joint Modelling Framework

Description

It computes sensitivity, specificity, ROC and AUC measures for joint models.

Usage

rocJM(object, dt, data, idVar = "id", directionSmaller = NULL, cc = NULL, min.cc = NULL,
  max.cc = NULL, optThr = c("sens*spec", "youden"), 
  diffType = c("absolute", "relative"), abs.diff = 0, rel.diff = 1, 
  M = 300, burn.in = 100, scale = 1.6)

Value

An object of class rocJM is a list with components,

MCresults: a list of length the number of distinct cases in data. Each component of this list is again a list with four components the estimated Sensitivity Sens and its standard error seSens, and the estimated Specificity Spec and its standard error seSpec. All these four components are matrices with rows corresponding to the different dt values and columns corresponding to the different cc values.
AUCs: a numeric vector of estimated areas under the ROC curves for the different values of dt.
optThr: a numeric vector with the optimal threshold values for the markers for the different dt under the choice made in argument optThr.
times: a list of length the number of distinct cases in data with components numeric vectors of the time points at which longitudinal measurements are supposed to be taken.
dt: a copy of the dt argument.
M: a copy of the M argument.
diffType: a copy of the diffType argument.
abs.diff: a copy of the abs.diff argument.
rel.diff: a copy of the rel.diff argument.
cc: a copy of the cc argument.
min.cc: a copy of the min.cc argument.
max.cc: a copy of the max.cc argument.
success.rate: a numeric matrix with the success rates of the Metropolis-Hastings algorithm described above.

Arguments

object: an object inheriting from class jointModel.
dt: a numeric vector indicating the lengths of the time intervals of primary interest within which we want to distinguish between subjects who died within the intervals from subjects who survived longer than that.
data: a data frame that contains the baseline covariates for the longitudinal and survival submodels, including a case identifier (i.e., the variable denoted by the argument idVar), the time points on which longitudinal measurements are assumed to be taken (this should have the same name as in the argument timeVar of jointModel).
idVar: the name of the variable in data that identifies the different generic subjects to be considered.
directionSmaller: logical; if TRUE, then smaller values for the longitudinal outcome are associated with higher risk for an event.
cc: a numeric vector of threshold values for the longitudinal marker; if NULL, this is computed using a regular sequence based on percentiles of the observed marker values.
min.cc: the start of the regular sequence for the threshold values for the longitudinal marker; see argument cc above.
max.cc: the end of the regular sequence for the threshold values for the longitudinal marker; see argument cc above.
optThr: character string defining how the optimal threshold is to be computed. The default chooses the cut-point for the marker that maximizes the product of sensitivity and specificity. Option "youden" chooses the cut-point that maximizes Youden's index that equals sensitivity + specificity - 1.
diffType: character string defining the type of prediction rule. See Details.
abs.diff: a numeric vector of absolute differences in the definition of composite prediction rules.
rel.diff: a numeric vector of relative differences in the definition of composite prediction rules.
M: a numeric scalar denoting the number of Monte Carlo samples.
burn.in: a numeric scalar denoting the iterations to discard.
scale: a numeric scalar that controls the acceptance rate of the Metropolis-Hastings algorithm. See Details.

Author

Dimitris Rizopoulos d.rizopoulos@erasmusmc.nl

Details

(Note: the following contain some math formulas, which are better viewed in the pdf version of the manual accessible at https://cran.r-project.org/package=JM.)

Assume that we have collected longitudinal measurements $Y_i(t) = \{y_i(s); 0 \leq s \leq t\}$ up to time point $t$ for subject $i$. We are interested in events occurring in the medically relevant time frame $(t, t + \Delta t]$ within which the physician can take an action to improve the survival chance of the patient. Using an appropriate function of the marker history $Y_i(t)$, we can define a prediction rule to discriminate between patients of high and low risk for an event. For instance, for in HIV infected patients, we could consider values of CD4 cell count smaller than a specific threshold as predictive for death. Since we are in a longitudinal context, we have the flexibility of determining which values of the longitudinal history of the patient will contribute to the specification of the prediction rule. That is, we could define a composite prediction rule that is not based only on the last available measurement but rather on the last two or last three measurements of a patient. Furthermore, it could be of relevance to consider different threshold values for each of these measurements, for instance, we could define as success the event that the pre-last CD4 cell count is $c$ and the last one $0.5c$, indicating that a 50% decrease is strongly indicative for death. Under this setting we define sensitivity and specificity as, $$Pr \bigl \{ {\cal S}_i(t, k, c) \mid T_i^* > t, T_i^* \in (t, t + \Delta t] \bigr \},$$ and $$Pr \bigl \{ {\cal F}_i(t, k, c) \mid T_i^* > t, T_i^* > t + \Delta t \bigr \},$$ respectively, where we term ${\cal S}_i(t, k, c) = \{y_i(s) \leq c_s; k \leq s \leq t\}$ as success (i.e., occurrence of the event of interest), and ${\cal F}_i(t, k, c) = \{y_i(s) > c_s; k \leq s \leq t\}$ as a failure, $T_i^*$ denotes the time-to-event, and $\Delta t$ the length of the medically relevant time window (specified by argument dt). The cut values for the marker $c$ are specified by the cc, min.cc and max.cc arguments. Two types of composite prediction rules can be defined depending on the value of the diffType argument. Absolute prediction rules in which, between successive measurements there is an absolute difference of between the cut values, and relative prediction rules in which there is a relative difference between successive measurements of the marker. The lag values for these differences are defined by the abs.diff and rel.diff arguments. Some illustrative examples:

Ex1:: keeping the defaults we define a simple rule that is only based on the last available marker measurement.
Ex2:: to define a prediction rule that is based on the last two available measurements using the same cut values (e.g., if a patient had two successive measurements below a medically relevant threshold), we need to set abs.diff = c(0, 0).
Ex3:: to define a prediction rule that is based on the last two available measurements using a drop of 5 units between the cut values (e.g., the pre-last measurement is $c$ and the last $c-5$), we need to set abs.diff = c(0, -5).
Ex4:: to define a prediction rule that is based on the last two available measurements using a drop of 20% units between the cut values (e.g., the pre-last measurement is $c$ and the last $0.8c$), we need to set diffType = "relative" and rel.diff = c(0, 0.8).

The estimation of the above defined probabilities is achieved with a Monte Carlo scheme similar to the one described in survfitJM. The number of Monte Carlo samples is defined by the M argument, and the burn-in iterations for the Metropolis-Hastings algorithm using the burn.in argument.

More details can be found in Rizopoulos (2011).

References

Heagerty, P. and Zheng, Y. (2005). Survival model predictive accuracy and ROC curves. Biometrics 61, 92--105.

Rizopoulos, D. (2012) Joint Models for Longitudinal and Time-to-Event Data: with Applications in R. Boca Raton: Chapman and Hall/CRC.

Rizopoulos, D. (2011). Dynamic predictions and prospective accuracy in joint models for longitudinal and time-to-event data. Biometrics 67, 819--829.

Rizopoulos, D. (2010) JM: An R package for the joint modelling of longitudinal and time-to-event data. Journal of Statistical Software 35 (9), 1--33. tools:::Rd_expr_doi("10.18637/jss.v035.i09")

Zheng, Y. and Heagerty, P. (2007). Prospective accuracy for longitudinal markers. Biometrics 63, 332--341.

Examples

Run this code

if (FALSE) {
fitLME <- lme(sqrt(CD4) ~ obstime * (drug + AZT + prevOI + gender), 
    random = ~ obstime | patient, data = aids)
fitSURV <- coxph(Surv(Time, death) ~ drug + AZT + prevOI + gender, 
    data = aids.id, x = TRUE)
fit.aids <- jointModel(fitLME, fitSURV, timeVar = "obstime", 
    method = "piecewise-PH-aGH")

# the following will take some time to execute...
ND <- aids[aids$patient == "7", ]
roc <- rocJM(fit.aids, dt = c(2, 4, 8), ND, idVar = "patient")
roc
}

Run the code above in your browser using DataLab