accuracy_sifter: Accuracy calculation as defined in Engelhardt et al. (2011)
Description
Uses SIFTER's 2011 definition of accuracy, where a protein is tagged as
accurately predicted if its highest-ranked predicted function matches one of
its observed functions.
Usage
accuracy_sifter(pred, lab, tol = 1e-10, highlight = "", ...)
# S3 method for aphylo_estimates
accuracy_sifter(pred, lab, tol = 1e-10, highlight = "", ...)
# S3 method for default
accuracy_sifter(pred, lab, tol = 1e-10, highlight = "", nine_na = TRUE, ...)
Value
A data frame with Ntip() rows and four variables. The variables are:
Gene: Label of the gene
Predicted: The assigned gene function.
Observed: The true set of gene functions.
Accuracy: The measurement of accuracy according to Engelhardt et al. (2011).
Arguments
pred
A matrix of predictions, or an aphylo_estimates object.
lab
A matrix of labels (0, 1, NA, or 9 if nine_na = TRUE).
tol
Numeric scalar. Predictions within tol of the maximum score
will be tagged as the prediction made by the model (see Details).
highlight
Pattern passed to sprintf() and used to highlight
predicted functions that match the observed ones.
...
Further arguments passed to the method. In the case of aphylo_estimates,
the arguments are passed to predict.aphylo_estimates().
nine_na
Logical scalar. When TRUE, entries of lab equal to 9 are treated as NA.
Details
The analysis is done at the protein level. For each protein, the function
compares the YES annotations of that protein with those predicted by the model.
The algorithm selects the predicted annotations as those that are within
tol of the maximum score.
This algorithm doesn't take into account NOT annotations (0s), which are
excluded from the analysis.
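The selection rule described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper name sifter_accuracy_sketch is hypothetical, and the sketch assumes that pred has no missing values, that "within tol" means an absolute difference smaller than tol, and that a protein counts as accurately predicted when at least one top-scoring function is among its YES annotations.

```r
# Hypothetical helper, not part of aphylo. Assumes pred has no NAs and that
# pred and lab have the same dimensions (proteins in rows, functions in columns).
sifter_accuracy_sketch <- function(pred, lab, tol = 1e-10, nine_na = TRUE) {
  # Optionally recode 9 as missing
  if (nine_na) lab[which(lab == 9)] <- NA

  vapply(seq_len(nrow(pred)), function(i) {
    # Predicted functions: scores within tol of the row maximum
    top <- which(abs(pred[i, ] - max(pred[i, ])) < tol)
    # Observed YES annotations; 0s (NOT) and NAs are ignored
    yes <- which(lab[i, ] == 1)
    # Proteins without any YES annotation cannot be scored
    if (length(yes) == 0L) return(NA)
    any(top %in% yes)
  }, logical(1))
}

# Toy data: three proteins, two functions
pred <- matrix(c(0.9, 0.1,
                 0.2, 0.8,
                 0.6, 0.4), ncol = 2, byrow = TRUE,
               dimnames = list(c("g1", "g2", "g3"), c("f1", "f2")))
lab <- matrix(c(1, 0,
                1, 0,
                9, 0), ncol = 2, byrow = TRUE,
              dimnames = dimnames(pred))

acc <- sifter_accuracy_sketch(pred, lab)
mean(acc, na.rm = TRUE)  # share of scored proteins predicted accurately
```

In this toy example, g1 is scored as accurate, g2 is not (its top prediction is f2 while only f1 is observed), and g3 is left unscored because its only non-zero label is 9.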