ate: Compute the Average Treatment Effects Via

Description

Use the g-formula/IPTW/double robust estimator to estimate the average treatment effect based on Cox regression with or without competing risks.

Usage

ate(
  event,
  treatment,
  censor = NULL,
  data,
  formula,
  estimator = NULL,
  strata = NULL,
  contrasts = NULL,
  times,
  cause = NA,
  landmark,
  se = TRUE,
  iid = FALSE,
  known.nuisance = FALSE,
  band = FALSE,
  B = 0,
  seed,
  handler = "foreach",
  mc.cores = 1,
  cl = NULL,
  verbose = TRUE,
  ...
)

Arguments

event

Outcome model which describes how event risk depends on treatment and covariates. The object carry its own call and have a predictRisk method. See examples.

treatment

Treatment model which describes how treatment depends on covariates. The object must be a glm object (logistic regression) or the name of the treatment variable. See examples.

censor

Censoring model which describes how censoring depends on treatment and covariates. The object must be a coxph or cph object. See examples.

data

[data.frame or data.table] Data set in which to evaluate risk predictions based on the outcome model

formula

For analyses with time-dependent covariates, the response formula. See examples.

estimator

[character] The type of estimator used to compute the average treatment effect. Can be "G-formula", "IPTW", or "AIPTW". When using estimator="G-formula", a model for the outcome should be provided (argument event). When using estimator="IPTW", a model for the treatment should be provided (argument treatment), as well as for the censoring (if any, argument censor). When using estimator="AIPTW" (double robust estimator), a model for the outcome and the treatment should be provided (argument event and treatment), as well as for the censoring (if any, argument censor).

strata

[character] Strata variable on which to compute the average risk. Incompatible with treatment. Experimental.

contrasts

[character] The levels of the treatment variable to be compared.

times

[numeric vector] Time points at which to evaluate average treatment effects.

cause

[integer/character] the cause of interest.

landmark

for models with time-dependent covariates the landmark time(s) of evaluation. In this case, argument time may only be one value and for the prediction of risks it is assumed that that the covariates do not change between landmark and landmark+time.

[logical] If TRUE compute and add the standard errors to the output.

iid

[logical] If TRUE compute and add the influence function to the output.

known.nuisance

[logical] If FALSE the uncertainty related to the estimation of the nuisance parameters is ignored. This greatly simplifies computations but requires to use a double robust estimator and to assumes that all event, treatment, and censoring models are valid to obtain consistent standard errors.

band

[logical] If TRUE compute and add the quantiles for the confidence bands to the output.

[integer, >0] the number of bootstrap replications used to compute the confidence intervals. If it equals 0, then the influence function is used to compute Wald-type confidence intervals/bands.

seed

[integer, >0] sed number used to generate seeds for bootstrap and to achieve reproducible results.

handler

[character] Parallel handler for bootstrap. either "foreach", "mclapply", "snow" or "multicore". if "foreach" use doparallel to create a cluster.

mc.cores

[integer, >0] The number of cores to use, i.e., the upper limit for the number of child processes that run simultaneously. Passed to parallel::mclapply or doparallel::registerdoparallel. The option is initialized from environment variable mc_cores if set.

A parallel socket cluster used to perform cluster calculation in parallel. Output by parallel::makeCluster. The packages necessary to run the computations (e.g. riskRegression) must already be loaded on each worker.

verbose

[logical] If TRUE inform about estimated run time. "minimal" requires less memory but can only estimate the standard for the difference between treatment effects (and not for the ratio).

...

passed to predictRisk

Examples

Run this code

# NOT RUN {
library(survival)
library(rms)
library(prodlim)
set.seed(10)

#### Survival settings  ####
#### ATE with Cox model ####

## generate data
n <- 100
dtS <- sampleData(n, outcome="survival")
dtS$time <- round(dtS$time,1)
dtS$X1 <- factor(rbinom(n, prob = c(0.3,0.4) , size = 2), labels = paste0("T",0:2))

## estimate the Cox model
fit <- cph(formula = Surv(time,event)~ X1+X2,data=dtS,y=TRUE,x=TRUE)

## compute the ATE at times 5, 6, 7, and 8 using X1 as the treatment variable
# }
# NOT RUN {
## only point estimate (argument se = FALSE)
ateFit1a <- ate(fit, data = dtS, treatment = "X1", times = 5:8,
               se = FALSE)

## standard error / confidence intervals computed using the influence function
## (argument se = TRUE and B = 0)
ateFit1b <- ate(fit, data = dtS, treatment = "X1", times = 5:8,
               se = TRUE, B = 0)

## same as before with in addition the confidence bands for the ATE
## (argument band = TRUE)
ateFit1c <- ate(fit, data = dtS, treatment = "X1", times = 5:8,
               se = TRUE, band = TRUE, B = 0)

## standard error / confidence intervals computed using 100 boostrap samples
## (argument se = TRUE and B = 100) 
ateFit1d <- ate(fit, data = dtS, treatment = "X1",
                times = 5:8, se = TRUE, B = 100)
## NOTE: for real applications 100 bootstrap samples is not enougth 

## same but using 2 cpus for generating and analyzing the boostrap samples
## (parallel computation, argument mc.cores = 2) 
ateFit1e <- ate(fit, data = dtS, treatment = "X1",
                times = 5:8, se = TRUE, B = 100, mc.cores = 2)
# }
# NOT RUN {
#### Survival settings without censoring ####
#### ATE with glm                        ####

## generate data
n <- 100
dtB <- sampleData(n, outcome="binary")
dtB[, X2 := as.numeric(X2)]

## estimate a logistic regression model
fit <- glm(formula = Y ~ X1+X2, data=dtB, family = "binomial")

## compute the ATE using X1 as the treatment variable
## only point estimate (argument se = FALSE)
ateFit1a <- ate(fit, data = dtB, treatment = "X1", se = FALSE)

# }
# NOT RUN {
## standard error / confidence intervals computed using the influence function
ateFit1b <- ate(fit, data = dtB, treatment = "X1",
               times = 5, ## just for having a nice output not used in computations
               se = TRUE, B = 0)

## standard error / confidence intervals computed using 100 boostrap samples
ateFit1d <- ate(fit, data = dtB, treatment = "X1",
                times = 5, se = TRUE, B = 100)

## using the lava package
ateLava <- estimate(fit, function(p, data){
a <- p["(Intercept)"] ; b <- p["X11"] ; c <- p["X2"] ;
R.X11 <- expit(a + b + c * data[["X2"]])
R.X10 <- expit(a + c * data[["X2"]])
list(risk0=R.X10,risk1=R.X11,riskdiff=R.X11-R.X10)},
average=TRUE)
ateLava

ateFit1b$meanRisk
# }
# NOT RUN {
#### Competing risks settings               ####
#### ATE with cause specific Cox regression ####

# }
# NOT RUN {
## generate data
n <- 500
set.seed(10)
dt <- sampleData(n, outcome="competing.risks")
dt$time <- round(dt$time,1)
dt$X1 <- factor(rbinom(n, prob = c(0.2,0.3) , size = 2), labels = paste0("T",0:2))

## estimate cause specific Cox model
fitCR <-  CSC(Hist(time,event)~ X1+X8,data=dt,cause=1)

## compute the ATE at times 10, 15, 20 using X1 as the treatment variable
ateFit2a <- ate(fitCR, data = dt, treatment = "X1", times = c(10,15,20),
                cause = 1, se = FALSE)

## standard error / confidence intervals computed using the influence function
## (argument se = TRUE and B = 0)
ateFit2b <- ate(fitCR, data = dt, treatment = "X1", times = c(10,15,20),
                cause = 1, se = TRUE, B = 0)

## same as before with in addition the confidence bands for the ATE
## (argument band = TRUE)
ateFit2c <- ate(fitCR, data = dt, treatment = "X1", times = c(10,15,20), 
               cause = 1, se = TRUE, band = TRUE, B = 0)

## standard error / confidence intervals computed using 100 boostrap samples
## (argument se = TRUE and B = 100) 
ateFit2d <- ate(fitCR, data = dt, treatment = "X1", times = c(10,15,20), 
                cause = 1, se = TRUE, B = 100)
## NOTE: for real applications 100 bootstrap samples is not enougth 

## same but using 2 cpus for generating and analyzing the boostrap samples
## (parallel computation, argument mc.cores = 2) 
ateFit2e <- ate(fitCR, data = dt, treatment = "X1", times = c(10,15,20), 
                cause = 1, se = TRUE, B = 100, mc.cores = 2)
# }
# NOT RUN {
#### time-dependent covariates ###
# }
# NOT RUN {
library(survival)
fit <- coxph(Surv(time, status) ~ celltype+karno + age + trt, veteran)
vet2 <- survSplit(Surv(time, status) ~., veteran,
                       cut=c(60, 120), episode ="timegroup")
fitTD <- coxph(Surv(tstart, time, status) ~ celltype+karno + age + trt,
               data= vet2,x=1)
set.seed(16)
resVet <- ate(fitTD,formula=Hist(entry=tstart,time=time,event=status)~1,
          data = vet2, treatment = "celltype", contrasts = NULL,
        times=5,verbose=1,
        landmark = c(0,30,60,90), cause = 1, B = 10, se = 1,
        band = FALSE, mc.cores=1)
resVet
# }
# NOT RUN {
# }
# NOT RUN {
set.seed(137)
d=sampleDataTD(127)
library(survival)
d[,status:=1*(event==1)]
d[,X3:=as.factor(X3)]
## ignore competing risks
cox1TD <- coxph(Surv(start,time, status,type="counting") ~ X3+X5+X6+X8,
                data=d, x = TRUE)
resTD1 <- ate(cox1TD,formula=Hist(entry=start,time=time,event=status)~1,
        data = d, treatment = "X3", contrasts = NULL,
        times=.5,verbose=1,
        landmark = c(0,0.5,1), B = 20, se = 1,
        band = FALSE, mc.cores=1)
resTD1
## account for competing risks
cscTD <- CSC(Hist(time=time, event=event,entry=start) ~ X3+X5+X6+X8, data=d)
set.seed(16)
resTD <- ate(cscTD,formula=Hist(entry=start,time=time,event=event)~1,
        data = d, treatment = "X3", contrasts = NULL,
        times=.5,verbose=1,
        landmark = c(0,0.5,1), cause = 1, B = 20, se = 1,
        band = FALSE, mc.cores=1)
resTD
# }

Run the code above in your browser using DataLab

Description

Usage

Arguments

See Also

Examples