coords: Coordinates of a ROC curve

Description

This function returns the coordinates of the ROC curve at one or more specified points.

Usage

coords(...)
# S3 method for roc
coords(roc, x, input=c("threshold", "specificity",
"sensitivity"), ret=c("threshold", "specificity", "sensitivity"),
as.list=FALSE, drop=TRUE, best.method=c("youden", "closest.topleft"),
best.weights=c(1, 0.5), ...)
# S3 method for smooth.roc
coords(smooth.roc, x, input=c("specificity",
"sensitivity"), ret=c("specificity", "sensitivity"), as.list=FALSE,
drop=TRUE, best.method=c("youden", "closest.topleft"), 
best.weights=c(1, 0.5), ...)

Arguments

roc, smooth.roc

a “roc” object from the roc function, or a “smooth.roc” object from the smooth function.

x

the coordinates to look for. Numeric (if so, their meaning is defined by the input argument) or one of “all” (all the points of the ROC curve), “local maximas” (the local maximas of the ROC curve) or “best” (the point with the best sum of sensitivity and specificity).

input

If x is numeric, the kind of input coordinate (x). One of “threshold”, “specificity” or “sensitivity”. Can be shortened (for example to “thr”, “sens” and “spec”, or even to “t”, “se” and “sp”). Note that “threshold” is not allowed in coords.smooth.roc, and that the argument is ignored when x is a character.

ret

The coordinates to return. One or more of “threshold”, “specificity”, “sensitivity”, “accuracy”, “tn” (true negative count), “tp” (true positive count), “fn” (false negative count), “fp” (false positive count), “npv” (negative predictive value), “ppv” (positive predictive value), “precision”, “recall”. “1-specificity”, “1-sensitivity”, “1-accuracy”, “1-npv” and “1-ppv” are recognized as well, and must be written exactly this way even for ROC curves with percent=TRUE (for instance “100-ppv” is never accepted). Values can be shortened (for example to “thr”, “sens” and “spec”, or even to “se”, “sp” or “1-np”). Note that “threshold” is not allowed in coords.smooth.roc. In addition, “npe” is replaced by “1-npv” and “ppe” by “1-ppv” (but they cannot be shortened).

as.list

Whether the returned object must be a list. If FALSE (default), a named numeric vector is returned.

drop

If TRUE, the result is coerced to the lowest possible dimension, as per Extract. With FALSE, if x is of length 1, the object returned will have the same format as if x were of length > 1.

best.method

if x="best", the method to determine the best threshold. See details in the ‘Best thresholds’ section.

best.weights

if x="best", the weights to determine the best threshold. See details in the ‘Best thresholds’ section.

…

further arguments passed to or from other methods. Ignored.
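
The count and predictive-value measures accepted by ret can be related to sensitivity and specificity through the group sizes. The following is a minimal sketch in base R using the standard definitions; the input values and variable names are hypothetical and are not taken from the package source.

```r
# Hypothetical inputs: one point of a ROC curve (as proportions)
# and the number of cases and controls.
sensitivity <- 0.8
specificity <- 0.9
n_cases     <- 50
n_controls  <- 100

tp <- sensitivity * n_cases        # true positive count
fn <- n_cases - tp                 # false negative count
tn <- specificity * n_controls     # true negative count
fp <- n_controls - tn              # false positive count

accuracy  <- (tp + tn) / (n_cases + n_controls)
ppv       <- tp / (tp + fp)        # positive predictive value
npv       <- tn / (tn + fn)        # negative predictive value
precision <- ppv                   # precision is another name for ppv
recall    <- sensitivity           # recall is another name for sensitivity

round(c(tp = tp, fn = fn, tn = tn, fp = fp,
        accuracy = accuracy, ppv = ppv, npv = npv), 3)
```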

Value

The return value depends on the length of x and on the as.list and drop arguments.

as.list=TRUE

length(x) == 1: a list of the length of, in the order of, and named after, ret.
length(x) > 1 or drop == FALSE: a list of the length of, and named after, x. Each element of this list is a list of the length of, in the order of, and named after, ret.

as.list=FALSE

length(x) == 1: a numeric vector of the length of, in the order of, and named after, ret.
length(x) > 1 or drop == FALSE: a numeric matrix with one row for each ret and one column for each x.

In all cases if input="specificity" or input="sensitivity" and interpolation was required, threshold is returned as NA.

Note that if giving a character as x (“all”, “local maximas” or “best”), you cannot predict the dimension of the return value unless drop=FALSE. Even “best” may return more than one value (for example if the ROC curve is below the identity line, both extreme points).

coords may also return NULL when a partial area is defined but no point of the ROC curve falls within the region.
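
Because drop follows the semantics of Extract, the difference between the two forms can be illustrated with plain matrix indexing in base R. This is only a sketch: the matrix below is a hypothetical stand-in for a result with one row per ret value and one column per x, not actual output of coords.

```r
# Hypothetical result matrix: one row per ret value, one column per x.
m <- matrix(c(0.55, 0.8, 0.6), nrow = 3,
            dimnames = list(c("threshold", "specificity", "sensitivity"),
                            "0.55"))

v <- m[, 1]                # drop = TRUE (default): named numeric vector
k <- m[, 1, drop = FALSE]  # drop = FALSE: still a 3 x 1 matrix

is.null(dim(v))  # the vector has no dimensions
dim(k)           # the matrix keeps both dimensions
```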

Best thresholds

If x="best", the best.method argument controls how the optimal threshold is determined.

“youden”

Youden's J statistic (Youden, 1950) is employed. The optimal cut-off is the threshold that maximizes the distance to the identity (diagonal) line. Can be shortened to “y”.

The optimality criterion is: $$max(sensitivities + specificities)$$

“closest.topleft”

The optimal threshold is the point closest to the top-left corner of the plot, which corresponds to perfect sensitivity and specificity. Can be shortened to “c” or “t”.

The optimality criterion is: $$min((1 - sensitivities)^2 + (1- specificities)^2)$$

In addition, weights can be supplied if false positive and false negative predictions are not equivalent: pass a numeric vector of length 2 to the best.weights argument. The elements define, in order:

the relative cost of a false negative classification (as compared with a false positive classification);
the prevalence, or the proportion of cases in the population ($\frac{n_{cases}}{n_{controls}+n_{cases}}$).

The optimality criteria are modified as proposed by Perkins and Schisterman:

“youden”: $$max(sensitivities + r * specificities)$$
“closest.topleft”: $$min((1 - sensitivities)^2 + r * (1- specificities)^2)$$

with

$$r = \frac{1 - prevalence}{cost * prevalence}$$

By default, prevalence is 0.5 and cost is 1 so that no weight is applied in effect.

Note that several thresholds might be equally optimal.
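
The two criteria and the weight r can be sketched in base R as follows. This is an illustration of the formulas above, not the package's implementation: the helper names (weight_r, best_youden, best_topleft) and the ROC points are hypothetical.

```r
# Hypothetical ROC points (proportions, not percent).
thresholds    <- c(0.1, 0.2, 0.3, 0.4, 0.5)
sensitivities <- c(1.00, 0.90, 0.80, 0.50, 0.20)
specificities <- c(0.20, 0.50, 0.80, 0.90, 1.00)

# Weight r built from the two elements of best.weights: cost, prevalence.
weight_r <- function(cost = 1, prevalence = 0.5) {
  (1 - prevalence) / (cost * prevalence)
}

# "youden": maximise sensitivities + r * specificities.
best_youden <- function(se, sp, r = 1) {
  score <- se + r * sp
  which(score == max(score))  # several indices if thresholds are tied
}

# "closest.topleft": minimise (1 - se)^2 + r * (1 - sp)^2.
best_topleft <- function(se, sp, r = 1) {
  d <- (1 - se)^2 + r * (1 - sp)^2
  which(d == min(d))          # several indices if thresholds are tied
}

r_default <- weight_r()                          # no weighting in effect
r_costly  <- weight_r(cost = 5, prevalence = 0.5)

# Threshold chosen without weights, then with costly false negatives.
thresholds[best_youden(sensitivities, specificities, r_default)]
thresholds[best_youden(sensitivities, specificities, r_costly)]
thresholds[best_topleft(sensitivities, specificities, r_default)]
```

With these illustrative points, raising the cost of a false negative lowers r, so the weighted Youden criterion moves towards a threshold with higher sensitivity.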

Details

This function takes a “roc” or “smooth.roc” object as first argument, on which the coordinates will be determined. The coordinates are defined by the x and input arguments. “threshold” coordinates cannot be determined in a smoothed ROC.

If input="threshold", the coordinates for the threshold are reported, even if the exact threshold does not define the ROC curve. The following convenience characters are allowed: “all”, “local maximas” and “best”. They will return all the thresholds, only the thresholds defining local maximas (upper angles of the ROC curve), or only the threshold(s) corresponding to the best sum of sensitivity and specificity, respectively. Note that “best” can return more than one threshold. If x is a character, the coordinates are limited to the thresholds within the partial AUC if it has been defined, and not necessarily to the whole curve.

For input="specificity" and input="sensitivity", the function checks if the specificity or sensitivity is one of the points of the ROC curve (in roc$sensitivities or roc$specificities). More than one point may match (in step curves), in which case only the coordinates of the upper-left-most point are returned. Otherwise, the sensitivity and specificity of the point are interpolated and NA is returned as threshold.
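
The lookup described above can be sketched in base R. This is a simplified illustration under stated assumptions, not the package's code: the ROC points and the helper coords_at_specificity are hypothetical, and linear interpolation between the two neighbouring points (via approx) is assumed.

```r
# Hypothetical ROC points (proportions); illustrative values only.
specificities <- c(0.0, 0.5, 0.8, 1.0)
sensitivities <- c(1.0, 0.9, 0.6, 0.0)
thresholds    <- c(-Inf, 0.3, 0.6, Inf)

# Return the coordinates at a requested specificity.
coords_at_specificity <- function(sp_target) {
  hit <- which(specificities == sp_target)
  if (length(hit) > 0) {
    # Exact match: keep the matching point with the highest sensitivity.
    i <- hit[which.max(sensitivities[hit])]
    return(c(threshold   = thresholds[i],
             specificity = specificities[i],
             sensitivity = sensitivities[i]))
  }
  # No exact match: interpolate sensitivity (assumed linear);
  # no threshold corresponds to this point, so NA is returned.
  se <- approx(specificities, sensitivities, xout = sp_target)$y
  c(threshold = NA_real_, specificity = sp_target, sensitivity = se)
}

on_curve     <- coords_at_specificity(0.8)   # an existing point
interpolated <- coords_at_specificity(0.65)  # between two points
on_curve
interpolated
```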

The coords function in this package is a generic, but it might be superseded by functions in other packages such as colorspace or spatstat if they are loaded after pROC. In this case, call the coords.roc or coords.smooth.roc functions directly.

References

Neil J. Perkins, Enrique F. Schisterman (2006) “The Inconsistency of "Optimal" Cutpoints Obtained using Two Criteria based on the Receiver Operating Characteristic Curve”. American Journal of Epidemiology 163(7), 670-675. DOI: 10.1093/aje/kwj063.

Xavier Robin, Natacha Turck, Alexandre Hainard, et al. (2011) “pROC: an open-source package for R and S+ to analyze and compare ROC curves”. BMC Bioinformatics, 12, 77. DOI: 10.1186/1471-2105-12-77.

W. J. Youden (1950) “Index for rating diagnostic tests”. Cancer, 3, 32-35. DOI: 10.1002/1097-0142(1950)3:1<32::AID-CNCR2820030106>3.0.CO;2-3.

Examples

# NOT RUN {
library(pROC)
data(aSAH)

# Build a roc object:
rocobj <- roc(aSAH$outcome, aSAH$s100b)

coords(rocobj, 0.55)
coords(rocobj, 0.9, "specificity", as.list=TRUE)
coords(rocobj, 0.5, "se", ret="se")
# fully qualified but identical:
coords(roc=rocobj, x=0.5, input="sensitivity", ret="sensitivity")

# Compare with drop=FALSE
coords(rocobj, 0.55, drop=FALSE)
coords(rocobj, 0.9, "specificity", as.list=TRUE, drop=FALSE)

# Same in percent
rocobj <- roc(aSAH$outcome, aSAH$s100b, percent=TRUE)

coords(rocobj, 0.55)
coords(rocobj, 90, "specificity", as.list=TRUE)
coords(rocobj, x=50, input="sensitivity", ret=c("sen", "spec"))

# And with a smoothed ROC curve
coords(smooth(rocobj), 90, "specificity")
coords(smooth(rocobj), 90, "specificity", drop=FALSE)
coords(smooth(rocobj), 90, "specificity", as.list=TRUE)
coords(smooth(rocobj), 90, "specificity", as.list=TRUE, drop=FALSE)

# Get the sensitivities for all thresholds
sensitivities <- coords(rocobj, rocobj$thresholds, "thr", "se")
# This is equivalent to taking sensitivities from rocobj directly
stopifnot(all.equal(as.vector(rocobj$sensitivities), as.vector(sensitivities)))
# You could also write:
sensitivities <- coords(rocobj, "all", ret="se")
stopifnot(all.equal(as.vector(rocobj$sensitivities), as.vector(sensitivities)))

# Get the best threshold
coords(rocobj, "b", ret="t")

# Get the best threshold according to different methods
rocobj <- roc(aSAH$outcome, aSAH$ndka, percent=TRUE)
coords(rocobj, "b", ret="t", best.method="youden") # default
coords(rocobj, "b", ret="t", best.method="closest.topleft")
# and with different weights
coords(rocobj, "b", ret="t", best.method="youden", best.weights=c(50, 0.2))
coords(rocobj, "b", ret="t", best.method="closest.topleft", best.weights=c(5, 0.2))
# and plot them
plot(rocobj, print.thres="best", print.thres.best.method="youden")
plot(rocobj, print.thres="best", print.thres.best.method="closest.topleft")
plot(rocobj, print.thres="best", print.thres.best.method="youden",
                                 print.thres.best.weights=c(50, 0.2)) 
plot(rocobj, print.thres="best", print.thres.best.method="closest.topleft",
                                 print.thres.best.weights=c(5, 0.2)) 

# Return more values:
coords(rocobj, "best", ret=c("threshold", "specificity", "sensitivity", "accuracy",
                           "tn", "tp", "fn", "fp", "npv", "ppv", "1-specificity",
                           "1-sensitivity", "1-accuracy", "1-npv", "1-ppv",
                           "precision", "recall"))
coords(smooth(rocobj), "best", ret=c("threshold", "specificity", "sensitivity", "accuracy",
                           "tn", "tp", "fn", "fp", "npv", "ppv", "1-specificity",
                           "1-sensitivity", "1-accuracy", "1-npv", "1-ppv",
                           "precision", "recall"))
coords(smooth(rocobj), 0.5, ret=c("threshold", "specificity", "sensitivity", "accuracy",
                           "tn", "tp", "fn", "fp", "npv", "ppv", "1-specificity",
                           "1-sensitivity", "1-accuracy", "1-npv", "1-ppv",
                           "precision", "recall"))
                           
# You can use coords to plot for instance a sensitivity + specificity vs. cut-off diagram

plot(specificity + sensitivity ~ threshold, t(coords(rocobj, seq(0, 1, 0.01))), type = "l")

# Plot the Precision-Recall curve
plot(precision ~ recall, t(coords(rocobj, "all", ret = c("recall", "precision"))), type="l")

# }
