synthpop (version 1.9-0)

syn.survctree: Synthesis of survival time by classification and regression trees (CART)

Description

Generates synthetic event indicator and time-to-event data using classification and regression trees (with or without bootstrap).

Usage

syn.survctree(y, yevent, x, xp, proper = FALSE, minbucket = 5, ...)

Value

A list with the following components:

syn.time

a vector of length k with synthetic time values.

syn.event

a vector of length k with synthetic event indicator values.

fit

the fitted model, which is an object of class ctree.object.

Arguments

y

a vector of length n with original time data.

yevent

a vector of length n with original event indicator data.

x

a matrix (n x p) of original covariates.

xp

a matrix (k x p) of synthesised covariates.

proper

logical; for proper synthesis (proper = TRUE) the CART model is fitted to a bootstrapped sample of the original data. The default is FALSE.

minbucket

the minimum number of observations in any terminal node (default 5). See ctree_control for details.

...

additional parameters passed to ctree.

Details

The procedure for synthesis by a CART model is as follows:

  1. For each row of xp, find the terminal node of the fitted tree that it falls into.

  2. Randomly draw a donor from the members of the node and take the observed value of yevent and y from that draw as the synthetic values.
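The two steps above can be sketched in base R. This is only an illustration of the donor draw, not the package's implementation: the function node_of() is a hypothetical stand-in for a fitted tree (a single split on a made-up covariate, age), and all data are simulated.

```r
set.seed(1)

# Toy original data: one covariate, survival time, event indicator.
n      <- 20
x      <- data.frame(age = round(seq(41, 88, length.out = n)))
y      <- round(rexp(n, rate = 1 / 100)) + 1
yevent <- rbinom(n, 1, 0.7)

# Synthesised covariates for k new records.
k  <- 5
xp <- data.frame(age = round(runif(k, 40, 90)))

# Hypothetical stand-in for a fitted tree: one split at age 65.
# A real fitted tree would supply the terminal node IDs instead.
node_of <- function(d) ifelse(d$age < 65, 1L, 2L)

nodes_obs <- node_of(x)    # terminal node of each original record
nodes_syn <- node_of(xp)   # step 1: terminal node of each row of xp

# Step 2: for each synthetic row, draw one donor at random from the
# original records that sit in the same terminal node.
donor <- vapply(nodes_syn, function(nd) {
  members <- which(nodes_obs == nd)
  members[sample.int(length(members), 1)]
}, integer(1))

# Time and event indicator come from the same donor,
# so each synthetic (time, event) pair was actually observed together.
syn.time  <- y[donor]
syn.event <- yevent[donor]
```

Drawing the time and the event indicator from one donor, rather than sampling them separately, is what keeps the relationship between censoring and follow-up time intact within each node.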

The function is used in syn() to generate survival times by setting the relevant elements of method in syn() to "survctree". Additional parameters of the ctree function, e.g. minbucket, can be supplied to syn() with the prefix survctree., for example survctree.minbucket.

Where the survival variable is censored, this information must be supplied to syn() as a named list (event) that gives, for each survival time variable, the name of its event indicator variable. An event variable can be a numeric variable with values 1/0 (1 = event), a logical variable (TRUE = event) or a factor with 2 levels (level 2 = event). The event variable(s) will be synthesised along with the survival time(s).
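As a small illustration of the three accepted encodings, the base R lines below express the same event indicator in each form. The vector and label names are made up for this sketch and are not part of the package.

```r
# Hypothetical event indicator for five records: 1 = event, 0 = censored.
death_num <- c(1, 0, 1, 1, 0)

# Logical encoding: TRUE = event.
death_lgl <- death_num == 1

# Factor encoding: the second level is taken to be the event,
# so the level order matters.
death_fac <- factor(death_num, levels = c(0, 1),
                    labels = c("censored", "event"))

# All three encodings flag the same records as events.
events_from_factor <- as.integer(death_fac) == 2
```

With a factor, it is worth checking levels() before synthesis, since a reversed level order would swap events and censored records.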

See Also

syn, syn.ctree

Examples

### This example uses the data set 'mgus2' from the survival package.
### It has a follow-up time variable 'futime' and an event indicator 'death'.
library(synthpop)
library(survival)

### first exclude the 'id' variable and run a dummy synthesis to get 
### a method vector
ods <- mgus2[-1]
s0 <- syn(ods)

### create a new method vector with 'survctree' for 'futime' and create 
### an event list for it; the name of each list element must correspond to 
### the name of the follow-up variable for which the event indicator
### needs to be specified.
meth <- s0$method
meth[names(meth) == "futime"] <- "survctree"
evlist <- list(futime = "death")

s1 <- syn(ods, method = meth, event = evlist)

### evaluate outputs
## compare selected variables
compare(s1, ods, vars = c("futime", "death", "sex", "creat"))

## compare original and synthetic follow-up time by event indicator
multi.compare(s1, ods, var = "futime", by = "death")

## compare survival curves for original and synthetic data
par(mfrow = c(2,1))
plot(survfit(Surv(futime, death) ~ sex, data = ods), 
     col = 1:2, xlim = c(0,450), main = "Original data")
legend("topright", levels(ods$sex), col = 1:2, lwd = 1, bty = "n")
plot(survfit(Surv(futime, death) ~ sex, data = s1$syn), 
     col = 1:2, xlim = c(0,450), main = "Synthetic data")
