SurvTest: Independent Two- and K-Sample Tests for Censored Data

Description

Testing the equality of survival distributions in two or more independent groups.

Usage

## S3 method for class 'formula':
surv_test(formula, data, subset = NULL,
    weights = NULL, ...)
## S3 method for class 'IndependenceProblem':
surv_test(object, 
    ties.method = c("logrank", "HL", "average-scores"), ...)

Arguments

formula

a formula of the form Surv(time, event) ~ x | block where time is a positive numeric variable denoting the survival time and event is a logical being TRUE when the event of interest was obse

data

an optional data frame containing the variables in the model formula.

subset

an optional vector specifying a subset of observations to be used.

weights

an optional formula of the form ~ w defining integer-valued weights for the observations.

object

an object of class IndependenceProblem.

ties.method

a character specifying the way ties are handled in the definition of the logrank scores, see below.

...

further arguments to be passed to or from methods.

Value

An object inheriting from class IndependenceTest-class with methods show, statistic, expectation, covariance and pvalue. The null distribution can be inspected by pperm, dperm, qperm and support methods.

Details

The null hypothesis tested is that the survival distributions are equal across the groups induced by x.

The test implemented here is based on the classical logrank test, reformulated as a linear rank test. There are several ways of dealing with ties; three methods are implemented here. The first one (ties.method = "logrank") is described, for example, in Kalbfleisch & Prentice (2002, page 221f) or in Callaert (2003) and leads to coefficients

$$a_i = \delta_i - \sum_{j: X_j \le X_i} \delta_j / (n - |\{k: X_k < X_j\}|)$$

for a linear rank statistic $T = \sum_{i = 1}^n a_i U_i$ (in two-sample situations where $U_i = 0$ or $U_i = 1$ denotes the groups) with survival times $X_i$ and censoring indicator $\delta_i = 0$ for censored observations. For further details, see Kalbfleisch & Prentice (2002).

The second method (ties.method = "HL") is described in Hothorn & Lausen (2003), where the coefficients

$$a_i = \delta_i - \sum_{j: X_j \le X_i} \delta_j / (n - |\{k: X_k \le X_j\}| + 1)$$

are suggested. Finally, average scores (ties.method = "average-scores"), as used for example in StatXact, are offered as well.

Note, however, that the test statistics will differ from the results of survdiff since the conditional variance is not identical to the variance estimate used by the classical logrank test.
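
The two score formulas in this section can be evaluated literally in base R. The following is a minimal sketch, not the package's implementation: the helper name logrank_scores is hypothetical, and the sign or scaling convention of the package's own logrank_trafo may differ. The data are the survival times and groups from the Callaert (2003) example used in the Examples section.

```r
## Hypothetical helper (not part of the package): evaluates the
## coefficient formulas from the Details section term by term.
logrank_scores <- function(time, event, method = c("logrank", "HL")) {
    method <- match.arg(method)
    n <- length(time)
    ## denominator attached to each observation j
    denom <- sapply(seq_len(n), function(j) {
        if (method == "logrank")
            n - sum(time < time[j])         # n - |{k: X_k < X_j}|
        else
            n - sum(time <= time[j]) + 1    # n - |{k: X_k <= X_j}| + 1
    })
    ## a_i = delta_i - sum over {j: X_j <= X_i} of delta_j / denom_j
    sapply(seq_len(n), function(i) {
        j <- time <= time[i]
        event[i] - sum(event[j] / denom[j])
    })
}

## survival times, event indicators and groups from the Callaert example
time  <- c(1, 1, 5, 6, 6, 6, 6, 2, 2, 2, 3, 4, 4, 5, 5)
event <- rep(1, 15)
U     <- c(rep(0, 7), rep(1, 8))

a_logrank <- logrank_scores(time, event, "logrank")
a_HL      <- logrank_scores(time, event, "HL")

## linear rank statistic T = sum_i a_i * U_i for the "logrank" scores
T_stat <- sum(a_logrank * U)
```

With the "logrank" definition the coefficients sum to zero even in the presence of ties, because the denominator equals the number of observations still at risk at $X_j$. Without ties, the two definitions coincide.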

References

John D. Kalbfleisch & Ross L. Prentice (2002), The Statistical Analysis of Failure Time Data (2nd edition). John Wiley & Sons, Hoboken, New Jersey.

Herman Callaert (2003), Comparing Statistical Software Packages: The Case of the Logrank Test in StatXact. The American Statistician 57, 214--217.

Torsten Hothorn & Berthold Lausen (2003), On the Exact Distribution of Maximally Selected Rank Statistics. Computational Statistics & Data Analysis 43, 121--137.

Examples

  library("coin")      # surv_test, logrank_trafo, exact
  library("survival")  # Surv, survdiff

  ### asymptotic tests for carcinoma data
  surv_test(Surv(time, event) ~ stadium, data = ocarcinoma)
  survdiff(Surv(time, event) ~ stadium, data = ocarcinoma)

  ### example data given in Callaert (2003)
  exdata <- data.frame(time = c(1, 1, 5, 6, 6, 6, 6, 2, 2, 2, 3, 4, 4, 5, 5),
                       event = rep(TRUE, 15),
                       group = factor(c(rep(0, 7), rep(1, 8))))
  ### p = 0.0523
  survdiff(Surv(time, event) ~ group, data = exdata)
  ### p = 0.0505
  surv_test(Surv(time, event) ~ group, data = exdata, 
            distribution = exact())
  ### p = 0.0468
  surv_test(Surv(time, event) ~ group, data = exdata,
            distribution = exact(), ties.method = "average-scores")

  ### lung cancer example from StatXact
  `lungcancer` <- structure(list(time = c(257, 476, 355, 1779, 355, 191, 
          563, 242, 285, 16, 16, 16, 257, 16), 
      event = c(0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1), 
      group = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 1L, 
          1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("control", "newdrug"), 
          class = "factor")), 
      .Names = c("time", "event", "group"), 
      row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", 
                    "10", "11", "12", "13", "14"), 
      class = "data.frame")

  ### StatXact 6 manual, page 414
  logrank_trafo(Surv(lungcancer$time, lungcancer$event),
                ties.method = "average-scores")

  ### StatXact 6 manual, page 415
  surv_test(Surv(time, event) ~ group, data = lungcancer,
            ties.method = "average-scores", distribution = exact())
