classificationMetrics(trues, preds, metrics = NULL, benMtrx = NULL,
  allCls = unique(c(levels(as.factor(trues)), levels(as.factor(preds)))),
  posClass = allCls[1], beta = 1)
The currently available classification performance metrics are:
"acc": accuracy, calculated as sum(I(t_i == p_i))/n, where I() is an indicator function that returns 1 if its argument is true and 0 otherwise. Note that "acc" is a value in the interval [0,1], with 1 corresponding to all predictions being correct.
"err": the error rate, calculated as 1 - "acc"
"totU": this is a metric that takes into consideration not only
the fact that the predictions are correct or not, but also the costs or
benefits of these predictions. As mentioned above it assumes that the
user provides a fully specified cost/benefit matrix though parameter benMtrx
, with
benefits corresponding to correct predictions, i.e. where t_i ==
p_i, while costs correspond to erroneous predictions. These matrices are C x C square matrices, where C is the
number of possible values of the nominal target variable (i.e. the
number of classes). The entry benMtrx[x, y] represents the utility (a cost if x != y) of the model predicting x for a true value of y. The diagonal of these matrices corresponds to the
correct predictions (t_i == p_i) and should have positive values
(benefits). The positions outside of the diagonal correspond to
prediction errors and should have negative values (costs). The "totU"
measures the total Utility (sum of the costs and benefits) of the
predictions of a classification model. It is calculated as
sum(CB[p_i,t_j] * CM[p_i,t_j) where CB is a cost/benefit matrix and CM
is a confusion matrix.
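The following is a minimal sketch of that calculation done by hand in base R. The vectors trues and preds and the matrix cb are purely illustrative (they are not objects provided by the package); the point is only that "totU" is the element-wise product of the cost/benefit matrix with the confusion matrix, summed over all cells:

## Hand computation of "totU" as sum(CB * CM); all objects here are illustrative.
trues <- c("a", "a", "b", "b", "b", "a")
preds <- c("a", "b", "b", "b", "a", "a")
cm <- table(preds, trues)                # confusion matrix (predictions in rows)
cb <- matrix(c(2, -1, -1, 1), 2, 2,      # benefits on the diagonal, costs elsewhere
             dimnames = list(c("a", "b"), c("a", "b")))
sum(cb * cm)                             # the total utility of the predictions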
"fpr": false positives rate, is a metric applicable to two classes tasks that measures the proportion of times the model forecasted a positive class when it should not and it is given by FP/N
"fnr": false negatives rate, is a metric applicable to two classes tasks that measures the proportion of times the model forecasted a negative class when it should not, and it is given by FN/P
"tpr": true positives rate, is a metric applicable to two classes tasks that measures the proportion of times the model forecasted a positive class for the positive test cases, and it is given by TP/P
"tnr": true negatives rate, is a metric applicable to two classes tasks that measures the proportion of times the model forecasted a negative class for the negative test cases, and it is given by TN/N
"rec": recall, it is equal to the true positive rate ("tpr")
"sens": sensitivity, it is equal to the true positive rate ("tpr")
"spec": specificity, it is equal to the true negative rate ("tnr")
"prec": precision, it is a metric applicable to two classes tasks that measures the proportion of times the model forecasted a positive class and it was correct, and it is given by TP/(TP+FP)
"ppv": predicted positive value, it is equal to the precision ("prec")
"fdr": false discovery rate, it is a metric applicable to two classes tasks that is given by FP/(TP+FP)
"npv": negative predicted value, it is a metric applicable to two classes tasks that is given by TN/(TN+FN)
"for": false omission rate, it is a metric applicable to two classes tasks that is given by FN/(TN+FN)
"plr": positive likelihood ratio, it is a metric applicable to two classes tasks that is given by "tpr"/"fpr" "nlr": negative likelihood ratio, it is a metric applicable to two classes tasks that is given by "fnr"/"tnr" "dor": diagnostic odds ratio, it is a metric applicable to two classes tasks that is given by "plr"/"nlr" "rpp": rate of positive predictions, it is a metric applicable to two classes tasks that measures the proportion of times the model forecasted a positive class, and it is given by (TP+FP)/N
"lift": lift, it is a metric applicable to two classes tasks and it is given by TP/P/(TP+FP) or equivalently TP/(P*TP+P*FP)
"F": the F-nmeasure, it is a metric applicable to two classes tasks that considers both the values of precision and recall weighed by a parameter Beta (defaults to 1 corresponding to equal weights to both), and it is given by (1+Beta^2)*("prec" * "rec") / ( (Beta^2 * "prec") + "rec")
"microF": micro-averaged F-measure, it is equal to accuracy ("acc")
"macroF": macro-averaged F-measure, it is the average of the F-measure scores calculated by making the positive class each of the possible class values in turn
"macroRec": macro-averaged recall, it is the average recall by making the positive class each of the possible class values in turn
"macroPrec": macro-averaged precision, it is the average precision by making the positive class each of the possible class values in turn
See also: regressionMetrics
## Not run:
# library(DMwR)  # provides rpartXse()
# ## Calculating several statistics of a classification tree on the Iris data
# data(iris)
# idx <- sample(1:nrow(iris),100)
# train <- iris[idx,]
# test <- iris[-idx,]
# tree <- rpartXse(Species ~ .,train)
# preds <- predict(tree,test,type='class')
# ## Calculate the error rate
# classificationMetrics(test$Species,preds,"err")
# ## Calculate all the available metrics
# classificationMetrics(test$Species,preds)
# ## Now calculate the total utility of the predictions
# cbM <- matrix(c(10,-20,-20,-20,20,-10,-20,-10,20),3,3)  # benefits on the diagonal, costs elsewhere
# classificationMetrics(test$Species,preds,"totU",cbM)
# ## End(Not run)
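Since the iris example involves three classes, a short additional sketch of a two-class call may also be useful. The vectors trues and preds below are made up for illustration (they are not part of the package's own examples); the call only uses arguments that appear in the usage line above (metrics, posClass and beta):

## Illustrative two-class call; trues and preds are made-up vectors.
trues <- c("pos", "neg", "neg", "pos", "pos", "neg", "pos", "neg")
preds <- c("pos", "neg", "pos", "pos", "neg", "neg", "pos", "neg")
## Precision, recall and the F-measure, taking "pos" as the positive class and
## beta = 0.5 (which weights precision more heavily than recall).
classificationMetrics(trues, preds, metrics = c("prec", "rec", "F"),
                      posClass = "pos", beta = 0.5)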