MXM (version 0.9.7)

Conditional independence tests for survival data

Description

The main task of this test is to provide a p-value PVALUE for the null hypothesis: feature 'X' is independent from 'TARGET' given a conditioning set CS. The test can be based on Cox (semi-parametric) regression, on Weibull (parametric) regression, or on exponential (parametric) regression.
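One common way to test this kind of null hypothesis is a likelihood-ratio comparison of two nested regression models: one with the conditioning set only, and one with the conditioning set plus X. The sketch below illustrates the idea with coxph from the survival package on simulated data. It is an illustration under stated assumptions, not a statement of how censIndCR computes its statistic internally; the variable names (x1, x2, dat, lr_stat, lr_pvalue) are made up for the example.

```r
library(survival)

set.seed(1)
n <- 200
x1 <- rnorm(n)                          # conditioning variable (plays the role of CS)
x2 <- rnorm(n)                          # candidate feature X, unrelated to the times
time <- rexp(n, rate = exp(0.5 * x1))   # simulated times depend on x1 only
status <- rbinom(n, 1, 0.7)             # 1 = event observed, 0 = censored
dat <- data.frame(time, status, x1, x2)

# null model: conditioning set only; alternative model: conditioning set plus X
fit0 <- coxph(Surv(time, status) ~ x1, data = dat)
fit1 <- coxph(Surv(time, status) ~ x1 + x2, data = dat)

# likelihood-ratio statistic from the final partial log-likelihoods
lr_stat <- 2 * (tail(fit1$loglik, 1) - tail(fit0$loglik, 1))
df <- length(coef(fit1)) - length(coef(fit0))
lr_pvalue <- pchisq(lr_stat, df = df, lower.tail = FALSE)

lr_stat
lr_pvalue
```

A small p-value would suggest that X carries information about the target beyond what the conditioning set already explains.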

Usage

censIndCR(target, dataset, xIndex, csIndex, wei = NULL, dataInfo = NULL, univariateModels = NULL, hash = FALSE, stat_hash = NULL, pvalue_hash = NULL, robust = FALSE)
censIndWR(target, dataset, xIndex, csIndex, wei = NULL, dataInfo = NULL, univariateModels = NULL, hash = FALSE, stat_hash = NULL, pvalue_hash = NULL, robust = FALSE)
censIndER(target, dataset, xIndex, csIndex, wei = NULL, dataInfo = NULL, univariateModels = NULL, hash = FALSE, stat_hash = NULL, pvalue_hash = NULL, robust = FALSE)

Arguments

target
A Survival object (class Surv from package survival) containing the time to event data (time) and the status indicator vector (event). See the Surv documentation for more information.
dataset
A numeric matrix, or a data frame in case of categorical predictors (factors), containing the variables for performing the test. Rows are samples and columns are features.
xIndex
The index of the variable whose association with the target we want to test.
csIndex
The indices of the variables to condition on.
wei
A vector of weights to be used for weighted regression. The default value is NULL.
dataInfo
A list object with information on the structure of the data. Default value is NULL.
univariateModels
Fast alternative to the hash object for univariate tests. A list with vectors "pvalues" (p-values), "stats" (statistics) and "flags" (flag = TRUE if the test was successful) representing the univariate association of each variable with the target. Default value is NULL.
hash
A boolean variable which indicates whether (TRUE) or not (FALSE) to use the hash-based implementation of the statistics of SES. Default value is FALSE. If TRUE you have to specify the stat_hash argument and the pvalue_hash argument.
stat_hash
A hash object (hash package required) which contains the cached generated statistics of a SES run in the current dataset, using the current test.
pvalue_hash
A hash object (hash package required) which contains the cached generated p-values of a SES run in the current dataset, using the current test.
robust
A boolean variable which indicates whether (TRUE) or not (FALSE) to use a robustified version of Cox regression. Currently the robust version is not available for this test. Note that Cox and Weibull regressions offer robust (sandwich) estimation of the standard error of the coefficients, but not robust estimation of the parameters.
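The distinction drawn in the note on robust can be seen directly in the survival package. The sketch below, on simulated data with made-up variable names, fits the same Cox model with and without the robust = TRUE argument of coxph: the coefficient estimates are the same, and only the variance estimate changes. It calls survival directly and does not involve the robust argument of the functions documented here.

```r
library(survival)

set.seed(2)
n <- 200
x <- rnorm(n)
time <- rexp(n, rate = exp(0.5 * x))
status <- rbinom(n, 1, 0.7)             # 1 = event observed, 0 = censored
dat <- data.frame(time, status, x)

fit_naive  <- coxph(Surv(time, status) ~ x, data = dat)
fit_robust <- coxph(Surv(time, status) ~ x, data = dat, robust = TRUE)

# the coefficient estimates are the same in both fits
coef(fit_naive)
coef(fit_robust)

# only the standard errors differ: model-based versus sandwich
se_naive  <- sqrt(diag(fit_naive$var))
se_robust <- sqrt(diag(fit_robust$var))
se_naive
se_robust
```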

Value

A list including:

Details

censIndCR uses the Cox (semi-parametric) regression, censIndWR the Weibull (parametric) regression, and censIndER the exponential (parametric) regression, which is a special case of the Weibull regression (when the shape parameter is 1).
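The relationship between the Weibull and exponential models can be checked with survreg from the survival package. In survreg the Weibull shape parameter equals 1 / scale, so fixing scale = 1 in a Weibull fit should reproduce the exponential fit. The sketch below uses simulated data and made-up variable names; it illustrates the special case and is not taken from the MXM source.

```r
library(survival)

set.seed(3)
n <- 300
x <- rnorm(n)
time <- rexp(n, rate = exp(0.5 * x))
status <- rbinom(n, 1, 0.7)             # 1 = event observed, 0 = censored
dat <- data.frame(time, status, x)

# Weibull regression with the scale (and hence the shape) estimated from the data
fit_wei <- survreg(Surv(time, status) ~ x, data = dat, dist = "weibull")
shape_hat <- 1 / fit_wei$scale

# exponential regression, and a Weibull regression with the scale fixed at 1
fit_exp  <- survreg(Surv(time, status) ~ x, data = dat, dist = "exponential")
fit_wei1 <- survreg(Surv(time, status) ~ x, data = dat, dist = "weibull", scale = 1)

shape_hat
coef(fit_exp)
coef(fit_wei1)   # should agree with the exponential coefficients
```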

If hash = TRUE, censIndCR, censIndWR and censIndER require the arguments 'stat_hash' and 'pvalue_hash' for the hash-based implementation of the statistical test. These hash objects are produced or updated by each run of SES (if hash == TRUE) and they can be reused to speed up subsequent runs of the current statistical test. If "SESoutput" is the output of a SES run, then these objects can be retrieved with SESoutput@hashObject$stat_hash and SESoutput@hashObject$pvalue_hash.

Important: Use these arguments only with the same dataset that was used at initialization.
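The caching idea behind the hash arguments can be sketched in base R, with an environment standing in for a hash object. Everything in this sketch is hypothetical: the helper name cached_cox_test, the key format, and the use of a likelihood-ratio test are assumptions made for illustration and are not MXM's implementation. It also shows why a cache must only be reused with the same dataset: the key records the variable indices, not the data.

```r
library(survival)

# hypothetical helper (not part of MXM): caches test results by (xIndex, csIndex)
cached_cox_test <- function(target, dataset, xIndex, csIndex, cache) {
  # hypothetical key format: tested variable, then the sorted conditioning set
  key <- paste(xIndex, paste(sort(csIndex), collapse = ","), sep = "|")
  if (exists(key, envir = cache, inherits = FALSE)) {
    return(get(key, envir = cache))    # cache hit: no model is refitted
  }
  # this sketch assumes a non-empty conditioning set
  fit0 <- coxph(target ~ ., data = dataset[, csIndex, drop = FALSE])
  fit1 <- coxph(target ~ ., data = dataset[, c(csIndex, xIndex), drop = FALSE])
  stat <- 2 * (tail(fit1$loglik, 1) - tail(fit0$loglik, 1))
  res <- list(stat = stat, pvalue = pchisq(stat, df = 1, lower.tail = FALSE))
  assign(key, res, envir = cache)
  res
}

set.seed(4)
n <- 200
dataset <- as.data.frame(matrix(rnorm(n * 5), nrow = n, ncol = 5))
target <- Surv(rexp(n, rate = exp(0.5 * dataset[, 1])), rbinom(n, 1, 0.7))

cache <- new.env()
r1 <- cached_cox_test(target, dataset, xIndex = 3, csIndex = c(1, 2), cache)
r2 <- cached_cox_test(target, dataset, xIndex = 3, csIndex = c(2, 1), cache)  # served from the cache
```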

For all the available conditional independence tests that are currently included in the package, please see "?CondIndTests".

References

Lagani, V. and Tsamardinos, I. (2010). Structure-based variable selection for survival data. Bioinformatics, 26(15), 1887-1894.

Cox, D.R. (1972). Regression models and life-tables. J. R. Stat. Soc., 34, 187-220.

See Also

SES, censIndWR, testIndFisher, gSquare, testIndLogistic, Surv, anova, CondIndTests

Examples

# load the packages used below
library(survival)
library(MXM)

# create a simulated survival dataset
n <- 1000
dataset <- matrix(runif(n * 20, 1, 100), nrow = n, ncol = 20)
dataset <- as.data.frame(dataset)
timeToEvent <- numeric(n)
event <- numeric(n)
ca <- numeric(n)
for (i in seq_len(n)) {
  # the time to event depends on variables 1, 10 and 15 plus uniform noise
  timeToEvent[i] <- dataset[i, 1] + 0.5 * dataset[i, 10] + 2 * dataset[i, 15] + runif(1, 0, 1)
  # status indicator: 1 = event observed, 0 = censored
  event[i] <- sample(c(0, 1), 1)
  ca[i] <- runif(1, 0, timeToEvent[i] - 0.5)
  # censored samples are observed for less than their true time to event
  if (event[i] == 0) {
    timeToEvent[i] <- timeToEvent[i] - ca[i]
  }
}

# build the Surv object used as the target
target <- Surv(time = timeToEvent, event = event)

# run the censIndCR conditional independence test
res <- censIndCR(target, dataset, xIndex = 12, csIndex = c(5, 7, 4))
res

# run the SES algorithm with each of the three conditional independence
# tests for a survival target
ses1 <- SES(target, dataset, max_k = 1, threshold = 0.05, test = "censIndCR")
ses2 <- SES(target, dataset, max_k = 1, threshold = 0.05, test = "censIndWR")
ses3 <- SES(target, dataset, max_k = 1, threshold = 0.05, test = "censIndER")
