JM (version 1.4-8)

prederrJM: Prediction Errors for Joint Models

Description

Using the available longitudinal information up to a starting time point, this function computes an estimate of the prediction error of survival at a horizon time point based on joint models.

Usage

prederrJM(object, newdata, Tstart, Thoriz, …)

# S3 method for jointModel
prederrJM(object, newdata, Tstart, Thoriz,
    lossFun = c("absolute", "square"), interval = FALSE, idVar = "id",
    simulate = FALSE, M = 100, …)

Arguments

object

an object inheriting from class jointModel.

newdata

a data frame that contains the longitudinal and covariate information for the subjects for which prediction of survival probabilities is required. The names of the variables in this data frame must be the same as in the data frames that were used to fit the linear mixed effects model (using lme()) and the survival model (using coxph()) that were supplied as the first two arguments of jointModel. In addition, this data frame should contain a variable that identifies the different subjects (see also argument idVar).

Tstart

numeric scalar denoting the time point up to which longitudinal information is to be used to derive predictions.

Thoriz

numeric scalar denoting the time point for which a prediction of the survival status is of interest; Thoriz must be later than Tstart.

lossFun

one of the options "absolute" (default) or "square", or a user-specified loss function. As the names suggest, when lossFun = "absolute" the loss function is \(L(x) = |x|\), whereas when lossFun = "square" the loss function is \(L(x) = x^2\). If a user-specified function is supplied, it should have a single argument and be vectorized.
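As an illustration of what a user-specified loss might look like, the sketch below defines a Huber-type loss. The name huberLoss and the threshold 0.5 are hypothetical choices made for this example, not part of JM; the only requirements taken from the description above are that the function has a single argument and is vectorized.

```r
# Hypothetical user-specified loss: quadratic near zero, linear in the tails.
# It has a single argument and works element-wise on a numeric vector.
huberLoss <- function(x) {
  k <- 0.5  # threshold between the quadratic and linear parts (assumed value)
  ifelse(abs(x) <= k, 0.5 * x^2, k * (abs(x) - 0.5 * k))
}

# quick check that the function is vectorized
huberLoss(c(-1, -0.2, 0, 0.3, 1))

# With a fitted joint model such as jointFit from the Examples section,
# the function would be passed as:
# prederrJM(jointFit, pbc2, Tstart = 5, Thoriz = 10, lossFun = huberLoss)
```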

interval

logical; if TRUE the weighted prediction error in the interval [Tstart, Thoriz] is calculated, while if FALSE the prediction error at time Thoriz is calculated using the longitudinal information up to time Tstart.

idVar

the name of the variable in newdata that identifies the different subjects.

simulate

logical; if TRUE, a Monte Carlo approach is used to estimate survival probabilities. If FALSE, a first order estimator is used instead. See survfitJM for more details.

M

a numeric scalar denoting the number of Monte Carlo samples; see survfitJM for more details.

…

additional arguments; currently none is used.

Value

A list of class prederrJM with components:

prederr

a numeric scalar denoting the estimated prediction error.

nr

a numeric scalar denoting the number of subjects at risk at time Tstart.

Tstart

a copy of the Tstart argument.

Thoriz

a copy of the Thoriz argument.

interval

a copy of the interval argument.

classObject

the class of object.

nameObject

the name of object.

lossFun

a copy of the lossFun argument.

Details

Based on a fitted joint model (represented by object) and using the data supplied in argument newdata, this function computes the following estimate of the prediction error:
$$\begin{aligned} PE(u | t) = \{R(t)\}^{-1} \sum_{i: T_i \geq t} \Big( & I(T_i \geq u) L\{1 - Pr(T_i > u | T_i > t, \tilde{y}_i(t), x_i)\} \\ & + \delta_i I(T_i < u) L\{0 - Pr(T_i > u | T_i > t, \tilde{y}_i(t), x_i)\} \\ & + (1 - \delta_i) I(T_i < u) \big[ S_i(u \mid T_i, \tilde{y}_i(t)) L\{1 - Pr(T_i > u | T_i > t, \tilde{y}_i(t), x_i)\} \\ & \qquad + \{1 - S_i(u \mid T_i, \tilde{y}_i(t))\} L\{0 - Pr(T_i > u | T_i > t, \tilde{y}_i(t), x_i)\} \big] \Big), \end{aligned}$$
where \(t\) is the starting time point Tstart up to which the longitudinal information is used, \(u > t\) is the horizon time point Thoriz, \(R(t)\) denotes the number of subjects at risk at time \(t\), \(\tilde{y}_i(t) = \{y_i(s), 0 \leq s \leq t\}\) denotes the available longitudinal measurements up to time \(t\), \(T_i\) denotes the observed event time for subject \(i\), and \(\delta_i\) is the event indicator. Function \(L(.)\) is the loss function that can be the absolute value (i.e., \(L(x) = |x|\)), the squared value (i.e., \(L(x) = x^2\)), or a user-specified function. The probabilities \(Pr(T_i > u | T_i > t, \tilde{y}_i(t), x_i)\) are calculated by survfitJM.
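To make the three cases in the formula concrete, here is a minimal base R sketch that evaluates \(PE(u | t)\) for a handful of subjects. It is not the code used inside prederrJM; the function name peSketch and all numbers are made up for illustration. In particular, the vector piU stands in for the probabilities \(Pr(T_i > u | T_i > t, \tilde{y}_i(t), x_i)\) that survfitJM would supply, and SU stands in for \(S_i(u \mid T_i, \tilde{y}_i(t))\), which is needed only for subjects censored before \(u\).

```r
# Illustrative only: evaluates the PE(u | t) formula on toy inputs.
peSketch <- function(Ti, delta, piU, SU, t, u, L = abs) {
  keep <- Ti >= t                        # subjects at risk at time t
  Ti <- Ti[keep]; delta <- delta[keep]
  piU <- piU[keep]; SU <- SU[keep]

  alive <- Ti >= u                       # still event-free at the horizon
  event <- delta == 1 & Ti < u           # event observed before the horizon
  cens  <- delta == 0 & Ti < u           # censored before the horizon

  loss <- numeric(length(Ti))
  loss[alive] <- L(1 - piU[alive])
  loss[event] <- L(0 - piU[event])
  loss[cens]  <- SU[cens] * L(1 - piU[cens]) +
    (1 - SU[cens]) * L(0 - piU[cens])

  sum(loss) / length(Ti)                 # divide by R(t)
}

# toy data (assumed values): five subjects, Tstart = 5, Thoriz = 10
Ti    <- c(2, 6, 8, 12, 7)               # observed event or censoring times
delta <- c(1, 1, 0, 0, 1)                # 1 = event, 0 = censored
piU   <- c(0.9, 0.4, 0.7, 0.8, 0.5)      # stand-in predicted survival at u
SU    <- c(NA, NA, 0.6, NA, NA)          # stand-in S_i(u | T_i, ...), censored only

peAbs <- peSketch(Ti, delta, piU, SU, t = 5, u = 10)
peSq  <- peSketch(Ti, delta, piU, SU, t = 5, u = 10, L = function(x) x^2)
```

The first subject drops out because the event time 2 precedes Tstart, so the sum is divided by the four subjects still at risk.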

When interval is set to TRUE, function prederrJM computes the integrated prediction error in the interval \((t, u) =\) (Tstart, Thoriz), defined as $$IPE(u | t) = \sum_{i: t \leq T_i \leq u} w_i(T_i) PE(T_i | t),$$ where $$w_i(T_i) = \frac{\delta_i G(T_i) / G(t)}{\sum_{j: t \leq T_j \leq u} \delta_j G(T_j) / G(t)},$$ with \(G(.)\) denoting the Kaplan-Meier estimator of the censoring time distribution.
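The weighting step can be sketched in a few lines of base R. The values below are invented for illustration: G holds assumed Kaplan-Meier estimates of the censoring distribution at each \(T_i\), Gt is the assumed value \(G(t)\), and pe holds assumed values of \(PE(T_i | t)\). None of them come from a fitted model, and this is not the internal code of prederrJM.

```r
# toy inputs (assumed values) for subjects with t <= T_i <= u
delta <- c(1, 0, 1, 1)               # event indicators
G     <- c(0.95, 0.90, 0.90, 0.80)   # stand-in G(T_i)
Gt    <- 0.98                        # stand-in G(t)
pe    <- c(0.30, 0.32, 0.35, 0.40)   # stand-in PE(T_i | t)

# weights as defined above: censored subjects get weight zero
raw <- delta * G / Gt
w   <- raw / sum(raw)

# integrated prediction error as the weighted sum of PE(T_i | t)
ipe <- sum(w * pe)

# G(t) is a common factor, so it cancels in the normalized weights
wNoGt <- (delta * G) / sum(delta * G)
```

Because the weights are normalized, only the relative sizes of \(\delta_i G(T_i)\) matter, which is what the comparison of w and wNoGt shows.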

References

Henderson, R., Diggle, P. and Dobson, A. (2002). Identification and efficacy of longitudinal markers for survival. Biostatistics 3, 33--50.

Rizopoulos, D. (2012). Joint Models for Longitudinal and Time-to-Event Data: with Applications in R. Boca Raton: Chapman and Hall/CRC.

Rizopoulos, D. (2011). Dynamic predictions and prospective accuracy in joint models for longitudinal and time-to-event data. Biometrics 67, 819--829.

Rizopoulos, D., Murawska, M., Andrinopoulou, E.-R., Lesaffre, E. and Takkenberg, J. (2013). Dynamic predictions with time-dependent covariates in survival analysis: A comparison between joint modeling and landmarking. under preparation.

See Also

survfitJM, aucJM, dynCJM, jointModel

Examples

# NOT RUN {
# load JM, which provides jointModel(), prederrJM() and the pbc2 data
library("JM")

# we construct the composite event indicator (transplantation or death)
pbc2$status2 <- as.numeric(pbc2$status != "alive")
pbc2.id$status2 <- as.numeric(pbc2.id$status != "alive")

# we fit the joint model using splines for the subject-specific 
# longitudinal trajectories and a spline-approximated baseline
# risk function
lmeFit <- lme(log(serBilir) ~ ns(year, 3),
    random = list(id = pdDiag(form = ~ ns(year, 3))), data = pbc2)
survFit <- coxph(Surv(years, status2) ~ drug, data = pbc2.id, x = TRUE)
jointFit <- jointModel(lmeFit, survFit, timeVar = "year", 
    method = "piecewise-PH-aGH")

# prediction error at year 10 using longitudinal data up to year 5 
prederrJM(jointFit, pbc2, Tstart = 5, Thoriz = 10)
# integrated prediction error over the interval from year 5 to year 6.5
prederrJM(jointFit, pbc2, Tstart = 5, Thoriz = 6.5, interval = TRUE)
# }