CMA (version 1.30.0)

weighted.mcr: Tuning / Selection bias correction

Description

Performs subsampling for several classifiers, or for a single classifier with different tuning parameter values or numbers of selected genes. Finally, a specific procedure is applied to correct for the tuning or selection bias that arises from optimally selecting classifiers or tuning parameters.

Usage

weighted.mcr(classifiers, parameters, nbgenes, sel.method, X, y, portion, niter=100, shrinkage=F)

Arguments

classifiers
A character vector of the CMA classifiers that shall be used. If the same classifier shall be used with different tuning parameters, it must appear several times in this vector.
parameters
A character vector containing the tuning parameter values corresponding to the classification methods in classifiers. Must have the same length as classifiers.
nbgenes
A numeric vector indicating how many variables shall be selected by sel.method for the corresponding classifier. Must have the same length as classifiers.
sel.method
The CMA method (given as a string) that shall be applied for variable selection. If this parameter is set to 'none', no variable selection is performed.
X
The matrix of gene expression data. Rows correspond to observations, columns to variables.
y
Class labels. Can be one of the following:
  • A numeric vector.
  • A factor.

WARNING: The class labels will be re-coded to range from 0 to K-1, where K is the total number of different classes in the learning set.

portion
A numeric value giving the portion (fraction) of observations that will be used for training the classifiers in each subsampling iteration.
niter
The number of subsampling iterations.
shrinkage
A logical value indicating whether shrinkage (WMCS) shall be applied.
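
As a rough illustration of what y and portion mean, the following base R sketch re-codes class labels to 0, ..., K-1 and draws one training subsample. It does not use CMA and is not the package's internal code; the toy labels, the variable names (y_coded, learn_idx, test_idx) and the use of floor() for rounding are assumptions made for this example.

```r
# Illustrative only: base R, not CMA internals.
set.seed(1)  # make the random subsample reproducible

# Toy class labels (made up for this sketch)
y <- factor(c("ALL", "AML", "ALL", "AML", "ALL",
              "ALL", "AML", "ALL", "AML", "ALL"))
portion <- 0.8

# Re-code class labels to 0, ..., K-1 (K = number of distinct classes)
y_coded <- as.integer(factor(y)) - 1L

# Draw one subsample: a 'portion' share of the observations for training
# (rounding with floor() is an assumption), the remainder for testing
n <- length(y_coded)
learn_idx <- sample(seq_len(n), size = floor(portion * n))
test_idx <- setdiff(seq_len(n), learn_idx)
```

With niter iterations, a split like this would be drawn niter times and each classifier evaluated on every split.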

Value

A wmcr.result object, which provides the corrected and uncorrected misclassification rates of the best classifier as well as the weights and misclassification rates for all classifiers used in the subsampling approach.

Details

The algorithm tries to avoid the additional computational cost of a nested cross-validation. It estimates the corrected misclassification rate of the best classifier as a weighted mean of the misclassification rates of all classifiers included in the subsampling approach.
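
The idea of a weighted mean of error rates can be sketched in base R as follows. The error rates and weights below are made-up numbers and the variable names are hypothetical; weighted.mcr computes its own weights as described in the reference, so this only shows how such a weighted mean relates to the uncorrected minimum.

```r
# Hypothetical subsampling error rates for three classifiers (made-up numbers)
mcr <- c(knn_k1 = 0.10, knn_k3 = 0.08, knn_k5 = 0.12)

# Hypothetical non-negative weights summing to 1 (assumption for this sketch;
# weighted.mcr derives its own weights)
w <- c(0.3, 0.5, 0.2)

# Uncorrected estimate: error rate of the apparently best classifier
uncorrected <- min(mcr)

# Corrected estimate: weighted mean over all classifiers
corrected <- weighted.mean(mcr, w)
```

Because the weighted mean also includes classifiers with higher error rates, the corrected value in this sketch cannot fall below the uncorrected minimum, which is the direction in which a selection-bias correction is expected to move.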

References

Bernau, Ch., Augustin, Th. and Boulesteix, A.-L. (2011): Correcting the optimally selected resampling-based error rate: A smooth analytical alternative to nested cross-validation. Department of Statistics: Technical Reports, Nr. 105.

See Also

wmc, classification, GeneSelection, tune, evaluation

Examples

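# Load the package and the Golub leukemia data
library(CMA)
data(golub)
X <- as.matrix(golub[, -1])   # expression matrix: rows = observations
y <- golub[, 1]               # class labels

# Inputs: kNN with seven values of k, 50 selected genes for each
classifiers <- rep('knnCMA', 7)
nbgenes <- rep(50, 7)
parameters <- c('k=1', 'k=3', 'k=5', 'k=7', 'k=9', 'k=11', 'k=13')
portion <- 0.8
niter <- 100
sel.method <- 't.test'

# Function call
wmcr <- weighted.mcr(classifiers = classifiers, parameters = parameters,
                     nbgenes = nbgenes, sel.method = sel.method,
                     X = X, y = y, portion = portion, niter = niter)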
