speffSurv: Semiparametric efficient estimation and testing for a two-sample treatment effect with a right-censored time-to-event endpoint

Description

speffSurv conducts estimation and testing of the treatment effect in a two-group randomized clinical trial with a right-censored time-to-event endpoint. It improves efficiency by leveraging baseline predictors of the endpoint.

Usage

speffSurv(formula, data, force.in=NULL, nvmax=9,
          method=c("exhaustive", "forward", "backward"), 
          optimal=c("cp", "bic", "rsq"), trt.id, 
          conf.level=0.95, fixed=FALSE)

Arguments

formula

a formula object with the response variable on the left of the ~ operator and the linear predictor on the right. The response is a survival object of class Surv. The linear predictor specifies baseline variables that are considered for inclusion by the automated procedure for selecting the best models predicting the endpoint. Interactions and variable transformations might also be considered.

data

a data frame in which to interpret the variables named in the formula and trt.id.

force.in

a vector of indices to columns of the design matrix that should be included in each regression model.

nvmax

the maximum number of covariates considered for inclusion in a model. The default is 9.

method

specifies the type of search technique used in the model selection procedure carried out by the regsubsets function. "exhaustive" (default) performs the all-subsets selection, whereas "forward" and "backward" execute a forward or backward step-wise selection, respectively.

optimal

specifies the optimization criterion for model selection. The default is "cp", Mallow's Cp, which is equivalent to AIC. The other options are "bic" for BIC and "rsq" for R-squared.

trt.id

a character string specifying the name of the treatment indicator which can be a character or a numeric vector. The control and treatment group is defined by the alphanumeric order of labels used in the treatment indicator.

conf.level

the confidence level to be used for confidence intervals reported by summary.speffSurv.

fixed

logical value; if FALSE (default), automated selection procedure is used for predicting the endpoint. Otherwise, all baseline variables specified in the formula are used.

Value

speffSurv returns an object of class "speffSurv" which can be processed by summary.speffSurv to obtain or print a summary of the results. An object of class "speffSurv" is a list containing the following components:

beta

a numeric vector with estimates of the treatment effect from the unadjusted proportional hazards model and the semiparametric efficient model using baseline covariates, respectively.

varbeta

a numeric vector of variance estimates for the treatment effect estimates in beta.

formula

a list with components rndSpace and censSpace containing formula objects for the optimal selected linear regression models that characterize the optimal elements in the randomization and censoring space, respectively. Set to NULL if fixed=TRUE.

fixed

a logical value; if TRUE, the efficient estimator utilizes all baseline covariates specified in the formula. Otherwise, the automated selection procedure is used to identify covariates that ensure optimality.

conf.level

confidence level of the confidence intervals reported by summary.speffSurv.

method

search technique employed in the model selection procedure.

number of subjects in each treatment group.

Details

The treatment effect is represented by the (unadjusted) log hazard ratio for the treatment versus control group. The estimate of the treatment effect using the (unadjusted) proportional hazards model is included in the output.

Using the automated model selection procedure performed by regsubsets, two optimal linear regression models are developed to characterize the influence function of an estimator that is more efficient than the maximum partial likelihood estimator. The "efficient" influence function is searched in the space of influence functions that determine all regular and asymptotically linear estimators for the treatment effect (for definitions see, for example, Tsiatis, 2006). The space of influence functions has three components: the estimation space that characterizes all regular and asymptotically linear estimators that do not use baseline covariates. The other two subspaces, the randomization and censoring space, use baseline covariates to improve the efficiency in the estimation of the treatment effect (Lu, 2008). The automated model selection procedure is used to identify functions in the randomization and censoring space that satisfy a prespecified optimality criterion and that lead to efficiency gain by using baseline predictors of the outcome.

The user has the option to avoid the automated variable selection and, instead, use all variables specified in the formula for the estimation of the treatment effect. This is achieved by setting fixed=TRUE.

speffSurv does not allow missing values in the data.

References

Lu X, Tsiatis AA. (2008), "Improving the efficiency of the log-rank test using auxiliary covariates.", Biometrika, 95:679--694.

Tsiatis AA. (2006), Semiparametric Theory and Missing Data., New York: Springer.

Examples

Run this code

# NOT RUN {
str(ACTG175)

data <- na.omit(ACTG175[ACTG175$arms==0 | ACTG175$arms==1, ])
data <- data[1:100, ]

### efficiency-improved estimation of log hazard ratio using
### baseline covariates
### 'fit1' coerces the use of all specified baseline covariates;
### automated selection procedure is skipped
fit1 <- speffSurv(Surv(days,cens) ~ cd40+cd80+age, 
                  data=data, trt.id="arms", fixed=TRUE)

# }
# NOT RUN {
fit2 <- speffSurv(Surv(days,cens) ~ cd40+cd80+age+wtkg+drugs+karnof+z30+
preanti+symptom, data=data, trt.id="arms")
# }

Run the code above in your browser using DataLab