pauclog: Calculates the p-values

Description

This auxiliary function calculates the logarithm of the p-values of a statistical significance test for the difference between samples from two classes, based on the AUC value of each input feature. It takes as input the AUC values calculated by the function compute.aucs. It is meaningful only for two-class problems. The result is a numeric vector with the logarithm of the p-value for each feature.

Usage

pauclog(auc,n=100,n.plus=0.5,labels=numeric(),pos=numeric())

Arguments

auc

a numeric vector of AUC values.

n

the total number of observations for the test.

n.plus

the number of cases in the sample with the positive class.

labels

the factor with the class labels.

pos

a numeric vector with the level of the positive class for each feature.

Value

The returned data consists of the following:

pauclog

a numeric vector with the logarithm of p-value for each feature

Details

The main job of this auxiliary function is to calculate the logarithm of the p-values of a statistical significance test comparing two samples, defined by the negative and positive class labels (a two-class problem). See the “Value” section of this page for more details.
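This page does not say which test pauclog uses internally. As an illustration only, the sketch below shows one standard way to turn an AUC value into a log p-value: the normal approximation to the Mann-Whitney U statistic, using the fact that the AUC equals U divided by the product of the two class sizes. The function name log_p_from_auc and its argument n_pos (the count of positive cases) are hypothetical and are not part of the package.

```r
# Hypothetical helper, NOT the package's implementation: log p-value of a
# two-sided test of AUC = 0.5, via the normal approximation to the
# Mann-Whitney U statistic (assumes no ties).
log_p_from_auc <- function(auc, n, n_pos) {
  n_neg <- n - n_pos
  # Under the null hypothesis the AUC has mean 0.5 and this standard deviation.
  sd0 <- sqrt((n + 1) / (12 * n_pos * n_neg))
  z <- abs(auc - 0.5) / sd0
  # log(2 * P(Z > z)), computed on the log scale to avoid underflow
  log(2) + pnorm(z, lower.tail = FALSE, log.p = TRUE)
}

# Vectorised over features: one log p-value per AUC value
auc <- c(0.5, 0.65, 0.9)
log_p <- log_p_from_auc(auc, n = 100, n_pos = 50)
```

Working on the log scale keeps very small p-values distinguishable, which is presumably why the function returns logarithms rather than raw p-values. In this sketch an AUC of 0.5 gives a log p-value of about 0, and values further from 0.5 give more negative results.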

References

David J. Hand and Robert J. Till (2001). A Simple Generalisation of the Area Under the ROC Curve for Multiple Class Classification Problems. Machine Learning 45(2), pp. 171–186.

Examples

# NOT RUN {
# load the package and the example data set
library(Biocomb)
data(datasetF6)

# the class label (last column) must be a factor
datasetF6[,ncol(datasetF6)]<-as.factor(datasetF6[,ncol(datasetF6)])

# compute the AUC value of each feature
auc.val<-compute.aucs(dattable=datasetF6)
vauc<-auc.val[,"AUC"]
val<-levels(datasetF6[,ncol(datasetF6)])

if(length(val)==2)
{
	 # two-class problem: pass the class labels and the positive class
	 pos<-auc.val[,"Positive class"]
	 paucv<-pauclog(auc=vauc,labels=datasetF6[,ncol(datasetF6)],pos=pos)
}else{
	 # otherwise pass n and n.plus explicitly
	 num.size<-100
	 num.prop<-0.5
	 paucv<-pauclog(auc=vauc,n=num.size,n.plus=num.prop)
}
# }
