
kebabs (version 1.2.3)

evaluatePrediction: Evaluate Prediction

Description

Evaluate the performance of a prediction on a test set based on given labels for binary classification.

Usage

evaluatePrediction(prediction, label, allLabels = NULL, decValues = NULL,
                   print = TRUE, confmatrix = TRUE, numPrecision = 3,
                   numPosNegTrainSamples = numeric(0))

Arguments

prediction
prediction results as returned by predict for predictionType="response".
label
label vector of the same length as parameter 'prediction'.
allLabels
vector containing each occurring label exactly once. This parameter is required only if the label vector is numeric. Default=NULL
decValues
numeric vector containing decision values for the predictions as returned by the predict method with predictionType set to decision. This parameter is needed for the determination of the AUC value which is currently only supported for binary classification. Default=NULL
print
This parameter indicates whether performance values should be printed or returned as a data frame without printing (for details see below). Default=TRUE
confmatrix
When set to TRUE, a confusion matrix is printed. The rows correspond to predictions, the columns to the true labels. Default=TRUE
numPrecision
minimum number of digits to the right of the decimal point. Values between 0 and 20 are allowed. Default=3
numPosNegTrainSamples
optional integer vector with two values giving the number of positive and negative training samples. When this parameter is set the balancedness of the training set is reported. Default=numeric(0)

Value

When the parameter 'print' is set to FALSE, the function returns a data frame containing the prediction performance values (for details see below).

Details

For binary classification this function computes the performance measures accuracy, balanced accuracy, sensitivity, specificity, precision, and the Matthews correlation coefficient (MCC). If decision values are passed in the parameter decValues, the function additionally determines the AUC. When the number of positive and negative training samples is passed to the function, it also reports the balancedness of the training set. The performance results are either printed by the routine directly or returned in a data frame. The columns of the data frame are:

column name    performance measure
-----------    --------------------------------------------------------
TP             true positives
FP             false positives
FN             false negatives
TN             true negatives
ACC            accuracy
BAL_ACC        balanced accuracy
SENS           sensitivity
SPEC           specificity
PREC           precision
MAT_CC         Matthews correlation coefficient
AUC            area under the ROC curve
PBAL           prediction balancedness (fraction of positive samples)
TBAL           training balancedness (fraction of positive samples)
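The measures listed above follow directly from the confusion-matrix counts. As an illustration of the standard formulas (a sketch in base R with made-up counts, not output of evaluatePrediction itself):

```r
## hypothetical confusion-matrix counts for a binary classifier
TP <- 40; FP <- 10; FN <- 5; TN <- 45

ACC     <- (TP + TN) / (TP + FP + FN + TN)   # accuracy: 0.85
SENS    <- TP / (TP + FN)                    # sensitivity (true positive rate)
SPEC    <- TN / (TN + FP)                    # specificity (true negative rate)
BAL_ACC <- (SENS + SPEC) / 2                 # balanced accuracy
PREC    <- TP / (TP + FP)                    # precision: 0.8
MAT_CC  <- (TP * TN - FP * FN) /
           sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))
```

Balanced accuracy and MCC are less misleading than plain accuracy on unbalanced test sets, which is why the function reports them alongside ACC.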

References

J. Palme, S. Hochreiter, and U. Bodenhofer (2015) KeBABS: an R package for kernel-based analysis of biological sequences. Bioinformatics (accepted). DOI: 10.1093/bioinformatics/btv176.

http://www.bioinf.jku.at/software/kebabs

See Also

predict, kbsvm

Examples

## set seed for random generator, included here only to make results
## reproducible for this example
set.seed(456)
## load transcription factor binding site data
data(TFBS)
enhancerFB
## select 70% of the samples for training and the rest for test
train <- sample(1:length(enhancerFB), length(enhancerFB) * 0.7)
test <- c(1:length(enhancerFB))[-train]
## create the kernel object for gappy pair kernel with normalization
gappy <- gappyPairKernel(k=1, m=3)
## show details of kernel object
gappy

## run training with explicit representation
model <- kbsvm(x=enhancerFB[train], y=yFB[train], kernel=gappy,
               pkg="LiblineaR", svm="C-svc", cost=80, explicit="yes",
               featureWeights="no")

## predict the test sequences
pred <- predict(model, enhancerFB[test])

## print prediction performance
evaluatePrediction(pred, yFB[test], allLabels=unique(yFB))

## Not run:
## print prediction performance including AUC
## additionally determine decision values
preddec <- predict(model, enhancerFB[test], predictionType="decision")
evaluatePrediction(pred, yFB[test], allLabels=unique(yFB),
                   decValues=preddec)

## print prediction performance including training set balance
trainPosNeg <- c(length(which(yFB[train] == 1)),
                 length(which(yFB[train] == -1)))
evaluatePrediction(pred, yFB[test], allLabels=unique(yFB),
                   numPosNegTrainSamples=trainPosNeg)

## or get prediction performance as data frame
perf <- evaluatePrediction(pred, yFB[test], allLabels=unique(yFB),
                           print=FALSE)

## show performance values in data frame
perf
## End(Not run)
