
ClassifyR (version 1.6.2)

runTests: Reproducibly Do Resampling or Leave Out and Cross Validation

Description

Runs classification schemes such as 100 resamples of 5-fold cross-validation or leave-one-out cross-validation. Parallel processing is available through the BiocParallel package.

Usage

## S4 method for signature 'matrix'
runTests(expression, classes, ...)

## S4 method for signature 'ExpressionSet'
runTests(expression, datasetName, classificationName,
         validation = c("bootstrap", "leaveOut"), bootMode = c("fold", "split"),
         resamples = 100, percent = 25, folds = 5, leave = 2, seed,
         parallelParams = bpparam(),
         params = list(SelectParams(), TrainParams(), PredictParams()),
         verbose = 1)

Arguments

expression
Either a matrix or ExpressionSet containing the training data. For a matrix, the rows are features, and the columns are samples.
classes
A vector with one element per column of expression, specifying the class that each sample belongs to.
datasetName
A name associated with the dataset used.
classificationName
A name associated with the classification.
validation
"bootstrap" for repeated resampling or "leaveOut" for leaving all combinations of k samples as test samples.
bootMode
Character. Either "fold" or "split". If "fold", then the samples are split into folds and in each iteration one is used as the test set. If "split", the samples are split into two groups. One is used as the training set, the other is the test set.
resamples
Relevant when repeated resampling is used. The number of times to do sampling with replacement.
percent
Used when bootstrap resampling with split method is chosen. The percentage of samples to be in the test set.
folds
Relevant when repeated resampling is used with fold mode. The number of folds to break each resampling into. Each fold is used once as the test set.
leave
Relevant when leave k out validation is used. The number of samples to leave for testing.
seed
The random number generator used for repeated resampling will use this seed, if it is provided. Allows reproducibility of repeated usage on the same input data.
parallelParams
An object of class MulticoreParam or SnowParam.
params
A list of objects of class TransformParams, SelectParams, TrainParams, or PredictParams. Their order in the list determines the order in which the stages of classification are done.
...
Unused variables from the matrix method passed to the ExpressionSet method.
verbose
A number between 0 and 3 for the amount of progress messages to give. A higher number will produce more messages.
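
The validation, bootMode, resamples, percent, folds and leave arguments together decide which samples are held out in each iteration. The following base R sketch illustrates that bookkeeping as the argument descriptions above read; it is not ClassifyR's internal code, and the sample count and variable names are made up for illustration.

```r
# Illustrative only: base R index bookkeeping, not ClassifyR's implementation.
nSamples  <- 20  # hypothetical number of samples (columns of expression)
resamples <- 3
folds     <- 5
percent   <- 25
leave     <- 2

set.seed(1)  # plays the role of the seed argument

# validation = "bootstrap": draw sample indices with replacement, once per resample.
resampled <- lapply(seq_len(resamples),
                    function(i) sample(nSamples, replace = TRUE))

# bootMode = "fold": break each resampling into folds; each fold is the test set once.
foldLabels   <- rep(seq_len(folds), length.out = nSamples)
foldTestSets <- lapply(resampled, function(indices) split(indices, foldLabels))

# bootMode = "split": `percent` percent of each resampling becomes the test set.
# (Assumption: the test set is taken from the end of the resampled indices.)
nTest     <- round(nSamples * percent / 100)
splitSets <- lapply(resampled, function(indices)
  list(train = head(indices, nSamples - nTest), test = tail(indices, nTest)))

# validation = "leaveOut": every combination of `leave` samples is a test set.
leaveOutTestSets <- combn(nSamples, leave, simplify = FALSE)

# Number of train/test iterations implied by each scheme.
nFoldIterations     <- length(unlist(foldTestSets, recursive = FALSE))  # resamples * folds
nLeaveOutIterations <- length(leaveOutTestSets)                         # choose(nSamples, leave)
```

The leave-k-out count grows combinatorially with the number of samples, so larger values of leave can be far more expensive than repeated resampling.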

Value

If the predictor function made a single prediction, then an object of class ClassifyResult. If the predictor function made a set of predictions, then a list of such objects.
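
Because the return value is either one object or a list of them, downstream code may want to treat both cases the same way. A minimal sketch, assuming a hypothetical helper named asResultList (not part of ClassifyR) and a stand-in S4 class MockResult used here in place of a real ClassifyResult:

```r
library(methods)

# Hypothetical helper: wrap a lone result in a list so both cases can be looped over.
asResultList <- function(result) {
  if (is.list(result)) result else list(result)
}

# Stand-in S4 class for demonstration only; a real call would return ClassifyResult.
setClass("MockResult", representation(name = "character"))
single   <- new("MockResult", name = "one prediction")
multiple <- list(new("MockResult", name = "variety A"),
                 new("MockResult", name = "variety B"))

singleAsList   <- asResultList(single)    # list of length 1
multipleAsList <- asResultList(multiple)  # returned unchanged

# With ClassifyR loaded, usage would look like:
# results <- asResultList(runTests(...))
```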

Examples

  if(require(curatedOvarianData) && require(sparsediscrim))
  {
    data(TCGA_eset)
    badOutcome <- which(pData(TCGA_eset)[, "vital_status"] == "deceased" & pData(TCGA_eset)[, "days_to_death"] <= 365)
    goodOutcome <- which(pData(TCGA_eset)[, "vital_status"] == "living" & pData(TCGA_eset)[, "days_to_death"] >= 365 * 5)
    TCGA_eset <- TCGA_eset[, c(badOutcome, goodOutcome)]
    classes <- factor(rep(c("Poor", "Good"), c(length(badOutcome), length(goodOutcome))))
    pData(TCGA_eset)[, "class"] <- classes
    runTests(TCGA_eset, "Ovarian Cancer", "Differential Expression", resamples = 2, folds = 2)
  }
