This function calculates feature weights using the HUM (hypervolume under manifold) criterion and is used to rank features in decreasing order of their HUM values. HUM values extend AUC values to more than two classes.
It can handle only numerical values.
It computes a HUM value and returns a “List” object consisting of the HUM value and the best permutation of class labels in the “seq” vector. This “seq” vector can be passed to the function CalculateHUM_ROC to calculate the coordinates of the 2D or 3D ROC.
This function is used internally to perform the classification with feature selection using the function “classifier.loop” with argument “HUM” for feature selection.
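To make the link between HUM and AUC concrete, the following base-R sketch estimates the two-class case by brute force, as the fraction of (class 1, class 2) pairs in which the feature value is correctly ordered. The helper name `auc_pairs` and the toy values are hypothetical, ties are ignored, and this is an illustration rather than the package's internal implementation.

```r
# Hypothetical helper: empirical AUC as the proportion of correctly
# ordered pairs (one value from each class).
auc_pairs <- function(x, y) {
  # outer() builds the full length(x) x length(y) comparison matrix
  mean(outer(x, y, "<"))
}

low  <- c(1.0, 2.0, 3.0)   # toy feature values, class 1
high <- c(2.5, 4.0, 5.0)   # toy feature values, class 2

auc_low_high <- auc_pairs(low, high)  # ordering "class 1 < class 2"
auc_high_low <- auc_pairs(high, low)  # reversed ordering
```

Without ties the two orderings sum to one, which is why only the better of the two is of interest when ranking features.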
CalculateHUM_seq(data,indexF,indexClass,indexLabel)
data: a dataset, given as a matrix of feature values for several cases, with an additional column containing the class labels. Class labels can be numerical or character values. The maximal number of classes is ten. The indexClass argument determines the column with class labels.
indexF: a numeric or character vector containing the column numbers or column names of the analyzed features.
indexClass: a numeric or character value containing the column number or column name of the class labels.
indexLabel: a character vector containing the names of the class labels selected for the analysis.
The data can contain a reasonable number of missing values, which must first be preprocessed with one of the imputing methods in the function input_miss.
The returned list consists of the following fields:
a list of HUM values for the specified number of analyzed features
a list of vectors, each containing the sequence of class labels
The main job of this function is to compute the maximal HUM value over all possible permutations of the class labels selected for analysis. See the “Value” section of this page for more details. Before returning, it calls the CalcGene function to calculate the HUM value for each feature (object).
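The search over label permutations described above can be sketched in base R for three classes. The helper `hum3`, the class names and the toy values are hypothetical. The sketch estimates HUM for one ordering as the fraction of correctly ordered triples and keeps the best ordering. It ignores ties and is not the package's own algorithm.

```r
# Hypothetical helper: empirical three-class HUM for one class ordering,
# i.e. the fraction of triples (a, b, c) with a < b < c.
hum3 <- function(a, b, c) {
  ab <- outer(a, b, "<") * 1   # numeric 0/1 matrix, length(a) x length(b)
  bc <- outer(b, c, "<") * 1   # numeric 0/1 matrix, length(b) x length(c)
  # (ab %*% bc)[i, k] counts the b values lying strictly between a[i] and c[k]
  sum(ab %*% bc) / (length(a) * length(b) * length(c))
}

# Toy feature values for three classes
values <- list(A = c(1, 2, 3.5), B = c(3, 4), C = c(5, 6))

# All 3! = 6 orderings of the class labels
perms <- list(c("A", "B", "C"), c("A", "C", "B"), c("B", "A", "C"),
              c("B", "C", "A"), c("C", "A", "B"), c("C", "B", "A"))

hum <- sapply(perms, function(p) {
  hum3(values[[p[1]]], values[[p[2]]], values[[p[3]]])
})

best     <- which.max(hum)
best_hum <- hum[best]      # maximal HUM value over all orderings
best_seq <- perms[[best]]  # ordering of class labels that achieves it
```

The number of orderings grows factorially with the number of classes, which is consistent with the slowdown noted for datasets with many class labels.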
Data can be provided in matrix form, where the rows correspond to cases with feature values and a class label. The columns contain the values of individual features, and a separate column contains the class labels. The maximal number of class labels is 10. The computational efficiency of the function decreases for data with more than 1000 cases and more than 6 class labels. To use all the functions of the package, the class label must be placed in the last column of the dataset. The class label column must be defined as a factor.
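As a minimal sketch of this layout, assuming the data are held in a data frame (the column names and values below are hypothetical), the class label can be moved to the last column and converted to a factor as follows:

```r
# Toy data frame: two numeric features and a class label stored as text
df <- data.frame(
  label = c("ALL", "AML", "ALL", "AML"),
  f1    = c(0.2, 1.5, 0.4, 1.1),
  f2    = c(3.1, 2.2, 2.9, 2.5),
  stringsAsFactors = FALSE
)

# Move the class label to the last column
df <- df[, c(setdiff(names(df), "label"), "label")]

# Define the class label as a factor
df$label <- as.factor(df$label)

indexClass <- ncol(df)                   # column holding the class labels
class_levels <- levels(df[, indexClass]) # labels available for indexLabel
```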
Li, J. and Fine, J. P. (2008): ROC Analysis with Multiple Tests and Multiple Classes: methodology and its application in microarray studies. Biostatistics. 9 (3): 566-576.
Novoselova, N., Della Beffa, C., Wang, J., Li, J., Pessler, F. and Klawonn, F. (2014): HUM Calculator and HUM package for R: easy-to-use software tools for multicategory receiver operating characteristic analysis. Bioinformatics. 30 (11): 1635-1636. doi:10.1093/bioinformatics/btu086.
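library(Biocomb)  # assumed to be the package providing leukemia72 and CalculateHUM_seq
data(leukemia72)
# Basic example
# The class label (last column) must be a factor
leukemia72[, ncol(leukemia72)] <- as.factor(leukemia72[, ncol(leukemia72)])
xdata <- leukemia72
indexF <- 1:2                          # analyze the first two features
indexClass <- ncol(xdata)              # class labels are in the last column
label <- levels(xdata[, indexClass])
indexLabel <- label[1:2]               # restrict the analysis to the first two classes
out <- CalculateHUM_seq(xdata, indexF, indexClass, indexLabel)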