Biocomb (version 0.4)

CalculateHUM_seq: Calculate HUM value

Description

This function calculates feature weights using the HUM (hypervolume under manifold) criterion and is used to rank features in decreasing order of HUM value. HUM values extend AUC values to more than two classes. The function handles only numerical values. It computes the HUM value and returns a “List” object consisting of the HUM value and the best permutation of class labels in the “seq” vector. This “seq” vector can be passed to the function CalculateHUM_ROC to calculate the coordinates of the 2D or 3D ROC. The function is also used internally by “classifier.loop” to perform classification with feature selection when the feature selection argument is “HUM”.

Usage

CalculateHUM_seq(data,indexF,indexClass,indexLabel)

Arguments

data

a dataset: a matrix of feature values for several cases, with an additional column containing the class labels. Class labels can be numerical or character values. The maximal number of classes is ten. The argument indexClass identifies the column with class labels.

indexF

a numeric or character vector, containing the column numbers or column names of the analyzed features.

indexClass

a numeric or character value, containing the column number or column name of the class labels.

indexLabel

a character vector containing the class labels selected for the analysis.

Value

The data may contain a reasonable number of missing values, which must first be preprocessed with one of the imputing methods in the function input_miss. The returned list consists of the following fields:

HUM

a list of HUM values, one for each analyzed feature

seq

a list of vectors, each containing the sequence of class labels

Details

The main job of this function is to compute the maximal HUM value over all possible permutations of the class labels selected for analysis. See the “Value” section of this page for more details. Before returning, it calls the CalcGene function to calculate the HUM value for each feature (object).
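The idea of maximizing over label permutations can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper names hum_for_order, all_perms and max_hum are hypothetical, the HUM for a given class order is computed by brute force as the fraction of tuples (one case per class) whose feature values are strictly increasing, and ties are simply counted as not increasing.

```r
# Brute-force HUM of one feature for a given class order (illustrative only).
hum_for_order <- function(x, y, ord) {
  groups <- lapply(ord, function(cl) x[y == cl])  # feature values per class
  grid <- expand.grid(groups)                     # one value from each class
  increasing <- apply(grid, 1, function(v) all(diff(v) > 0))
  mean(increasing)
}

# All permutations of a vector, returned as a list of vectors.
all_perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in all_perms(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
  }
  out
}

# Maximal HUM over all label permutations, plus the permutation reaching it.
max_hum <- function(x, y, labels) {
  perms <- all_perms(labels)
  hums <- vapply(perms, function(ord) hum_for_order(x, y, ord), numeric(1))
  best <- which.max(hums)
  list(HUM = hums[best], seq = perms[[best]])
}

# Toy data: class "B" has the lowest values, then "A", then "C".
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)
y <- factor(rep(c("B", "A", "C"), each = 3))
res <- max_hum(x, y, levels(y))
```

The brute-force enumeration grows quickly with the number of cases and classes, which is consistent with the efficiency note in this section; it is meant only to make the definition concrete on small data.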

Data can be provided in matrix form, where the rows correspond to cases with feature values and a class label. The columns contain the values of individual features, and a separate column contains the class labels. The maximal number of class labels is 10. The computational efficiency of the function decreases for datasets with more than 1000 cases and more than 6 class labels. To use all the functions of the package, the class label must be placed in the last column of the dataset. The class label column must be defined as a factor.
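The layout requirements above can be sketched in base R as follows. The data frame toy and its column names are made up for illustration; the sketch only prepares the arguments and does not call the package.

```r
# Hypothetical toy dataset with the class column in the middle.
toy <- data.frame(
  f1 = c(0.2, 1.5, 3.1, 0.4, 1.7, 2.9),
  class = c("a", "b", "c", "a", "b", "c"),
  f2 = c(5.0, 4.1, 2.2, 5.3, 3.9, 2.0),
  stringsAsFactors = FALSE
)

# Move the class label to the last column and define it as a factor.
toy <- toy[, c(setdiff(names(toy), "class"), "class")]
toy$class <- as.factor(toy$class)

# Values that would be passed as indexF, indexClass and indexLabel.
indexF <- c("f1", "f2")                       # features to analyze
indexClass <- ncol(toy)                       # class labels in the last column
indexLabel <- levels(toy[, indexClass])[1:2]  # first two class labels
```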

References

Li, J. and Fine, J. P. (2008): ROC Analysis with Multiple Tests and Multiple Classes: methodology and its application in microarray studies. Biostatistics. 9 (3): 566-576.

Novoselova, N., Della Beffa, C., Wang, J., Li, J., Pessler, F. and Klawonn, F. (2014): HUM Calculator and HUM package for R: easy-to-use software tools for multicategory receiver operating characteristic analysis. Bioinformatics. 30 (11): 1635-1636. doi:10.1093/bioinformatics/btu086.

See Also

CalculateHUM_Ex, CalculateHUM_ROC

Examples

# NOT RUN {
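library(Biocomb)
data(leukemia72)
# Basic example
# The class label must be a factor
leukemia72[, ncol(leukemia72)] <- as.factor(leukemia72[, ncol(leukemia72)])

xdata <- leukemia72
indexF <- 1:2                          # analyze the first two features
indexClass <- ncol(xdata)              # class labels are in the last column
label <- levels(xdata[, indexClass])
indexLabel <- label[1:2]               # use the first two class labels

out <- CalculateHUM_seq(xdata, indexF, indexClass, indexLabel)
out$HUM  # HUM values for the analyzed features
out$seq  # sequence of class labels for each feature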

# }
