Produces up to five plots (selectable by "which"; four by default) from the
results of a call to roc, including the ROC curve itself.
# S3 method for roc
plot(x,
     which = c(1:3,5),
     group = "Combined",
     prior = NULL,
     show.stats = TRUE,
     abline.col = "grey",
     abline.lty = "dashed",
     inGroup.col = "red",
     outGroup.col = "blue",
     lty = "solid",
     caption = c("ROC curve", "Dissimilarity profiles",
                 "TPF - FPF vs Dissimilarity",
                 "Likelihood ratios"),
     legend = "topright",
     ask = prod(par("mfcol")) < length(which) && dev.interactive(),
     ...)
One or more plots, drawn on the current device.
x: an object of class "roc".
which: numeric vector; which aspects of the "roc" object to plot. If a subset of the plots is required, specify a subset of the numbers 1:5.
group: character vector of length 1 giving the name of the group to plot.
prior: numeric vector of length 2 (e.g. c(0.5, 0.5)) specifying the prior probabilities of analogue and no-analogue. Used to generate the posterior probability of analogue using Bayes factors in plot 5 (which = 5).
show.stats: logical; should concise summary statistics of the ROC analysis be displayed on the plot?
abline.col: character string or numeric value; the colour used to draw vertical lines at the optimal dissimilarity on the plots.
abline.lty: line type for the indicator of the optimal ROC dissimilarity threshold. See par for the allowed line types.
inGroup.col: character string or numeric value; the colour used to draw the density curve for the dissimilarities between sites in the same group.
outGroup.col: character string or numeric value; the colour used to draw the density curve for the dissimilarities between sites not in the same group.
lty: vector of line types of at most length 2 (elements past the second in longer vectors are ignored). The first element of lty will be used where a single line is drawn on a plot. Where two lines are drawn (for analogue and non-analogue cases), the first element pertains to the analogue group and the second element to the non-analogue group. See par for the allowed line types.
caption: vector of character strings containing the captions to appear above each plot.
legend: character; position of legends drawn on plots. See the Details section in legend for keywords that can be specified.
ask: logical; if TRUE, the user is asked before each plot, see par(ask=.).
...: graphical arguments passed to other graphics functions.
Gavin L. Simpson. Code borrows heavily from plot.lm.
This plotting function is modelled closely on plot.lm
and many of the conventions and defaults for that function are
replicated here.
First, some definitions:
TPF: True Positive Fraction, also known as sensitivity.
TNF: True Negative Fraction, also known as specificity.
FPF: False Positive Fraction; the complement of TNF, calculated as 1 - TNF. This is often referred to as 1 - specificity. A false positive is also known as a type I error.
FNF: False Negative Fraction; the complement of TPF, calculated as 1 - TPF. A false negative is also known as a type II error.
AUC: The Area Under the ROC Curve.
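As a minimal sketch of these definitions, the four fractions can be computed by hand for a single dissimilarity threshold. The data, the threshold, and the variable names below are hypothetical and chosen only for illustration; this is not the code used by roc.

```r
# Hypothetical toy data (not from the package): dissimilarities and
# whether each comparison truly involves samples from the same group.
d        <- c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)
in_group <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE)

# Assumed decision rule: call a comparison an analogue (a "positive")
# when its dissimilarity is at or below the threshold.
threshold <- 0.45
positive  <- d <= threshold

TPF <- sum(positive & in_group) / sum(in_group)      # sensitivity
TNF <- sum(!positive & !in_group) / sum(!in_group)   # specificity
FPF <- 1 - TNF                                       # type I error rate
FNF <- 1 - TPF                                       # type II error rate

print(c(TPF = TPF, TNF = TNF, FPF = FPF, FNF = FNF))
```

With these toy values, three of the four in-group comparisons and four of the five out-group comparisons are classified correctly.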
The "ROC curve" plot (which = 1) draws the ROC curve itself as
a plot of the True Positive Fraction against the False Positive
Fraction. A diagonal 1:1 line represents no ability for the
dissimilarity coefficient to differentiate between groups. The AUC
statistic may also be displayed (see argument "show.stats"
above).
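To make the construction of the curve concrete, the sketch below evaluates TPF and FPF over a grid of dissimilarity thresholds and approximates the AUC with the trapezoidal rule. The toy data, the threshold grid, and the trapezoidal estimate are assumptions made for illustration; roc may compute these quantities differently.

```r
# Hypothetical toy data (not from the package).
d        <- c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)
in_group <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE)

# Candidate thresholds placed midway between the observed dissimilarities.
thresholds <- seq(0.05, 0.95, by = 0.1)

# Fraction of in-group (TPF) and out-group (FPF) dissimilarities that
# fall at or below each threshold.
TPF <- sapply(thresholds, function(t) mean(d[in_group] <= t))
FPF <- sapply(thresholds, function(t) mean(d[!in_group] <= t))

# Trapezoidal approximation of the area under the (FPF, TPF) points.
AUC <- sum(diff(FPF) * (head(TPF, -1) + tail(TPF, -1)) / 2)
print(AUC)
```

Plotting TPF on the y-axis against FPF on the x-axis, with a 1:1 reference line, gives a curve of the kind described above.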
The "Dissimilarity profiles" plot (which = 2) draws the density
functions of the dissimilarity values (d) for the correctly
assigned samples and the incorrectly assigned samples. A dissimilarity
coefficient that distinguishes the sample groupings well will
have density functions for the correctly and incorrectly assigned
samples that overlap little. Conversely, a poorly discriminating
dissimilarity coefficient will have density profiles for the two
assignments that overlap considerably. The point where the two curves
cross is the optimal dissimilarity or critical value, d'. In
principle, this is the point where the difference between TPF and FPF
is maximal. In practice, the value of d at which the difference
between TPF and FPF is maximal will not necessarily be the
same as the value of d' where the density profiles cross. This
is because the ROC curve has been estimated at discrete points
d, which may not include exactly the optimal d', but
which should be close to this value if the ROC curve is not sampled on
too coarse an interval.
The "TPF - FPF vs Dissimilarity" plot (which = 3) draws the difference between the ROC curve and the 1:1 line as a function of d. The point where the ROC curve is farthest from the 1:1 line is the point at which TPF - FPF is maximal (for a smooth ROC curve, this is where its slope equals 1). This is the optimal value for d, as discussed above.
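A rough sketch of how the optimal dissimilarity could be located on a discrete grid: compute TPF - FPF at each candidate threshold and take the threshold with the largest difference. The data, grid, and names (gap, optimal_d) are hypothetical and not taken from the package.

```r
# Hypothetical toy data (not from the package).
d        <- c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)
in_group <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE)
thresholds <- seq(0.05, 0.95, by = 0.1)

TPF <- sapply(thresholds, function(t) mean(d[in_group] <= t))
FPF <- sapply(thresholds, function(t) mean(d[!in_group] <= t))

# Vertical distance between the ROC curve and the 1:1 line.
gap <- TPF - FPF

# Threshold at which that distance is largest (first one if tied).
best      <- which.max(gap)
optimal_d <- thresholds[best]

print(c(optimal_d = optimal_d, max_gap = gap[best]))
```

Because the search is restricted to the grid, the result is only as precise as the spacing of the thresholds, which mirrors the caveat about sampling the ROC curve on too coarse an interval.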
The "Likelihood ratios" plot (which = 4) draws two definitions of the slope of the ROC curve as the likelihood functions LR(+) and LR(-). LR(+) is the likelihood ratio of a positive test result, that the value of d assigns the sample to the group it belongs to. LR(-) is the likelihood ratio of a negative test result, that the value of d assigns the sample to the wrong group.
LR(+) is defined as \(LR(+) = TPF / FPF\) (or sensitivity / (1 - specificity)), and LR(-) is defined as \(LR(-) = FNF / TNF\) (or (1 - sensitivity) / specificity), following Henderson (1993).
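The two ratios can be worked through for a single threshold as follows. This is an illustrative calculation on hypothetical data using the sensitivity and specificity forms of the definitions, not the package's implementation.

```r
# Hypothetical toy data (not from the package).
d        <- c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)
in_group <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE)
threshold <- 0.45

TPF <- mean(d[in_group] <= threshold)    # sensitivity
FPF <- mean(d[!in_group] <= threshold)   # 1 - specificity
TNF <- 1 - FPF                           # specificity
FNF <- 1 - TPF                           # 1 - sensitivity

LR_pos <- TPF / FPF   # sensitivity / (1 - specificity)
LR_neg <- FNF / TNF   # (1 - sensitivity) / specificity

print(c(LR_pos = LR_pos, LR_neg = LR_neg))
```

Note that at thresholds where FPF is zero, LR(+) is infinite (or undefined when TPF is also zero), so a plot over many thresholds may need to handle those values.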
The "Probability of analogue" plot (which = 5) draws the posterior probability of analogue given a dissimilarity. For given values of the dissimilarity, the LR(+) likelihood ratio is multiplied by the prior odds of analogue to give the posterior odds, which are then converted to a probability.
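The conversion from a likelihood ratio to a posterior probability can be sketched with Bayes' rule in odds form. The helper posterior_prob below is hypothetical and written only for this illustration; its prior argument follows the length-2 convention of the prior argument (analogue, no-analogue).

```r
# Hypothetical helper (not part of the package): posterior probability of
# analogue from an LR(+) value and a length-2 vector of prior probabilities.
posterior_prob <- function(LR_pos, prior) {
  prior_odds <- prior[1] / prior[2]    # odds of analogue before seeing d
  post_odds  <- LR_pos * prior_odds    # Bayes' rule in odds form
  post_odds / (1 + post_odds)          # convert odds back to a probability
}

# Equal priors: the posterior depends on the likelihood ratio alone.
p_equal <- posterior_prob(3.75, prior = c(0.5, 0.5))

# A more sceptical prior pulls the posterior probability down.
p_sceptical <- posterior_prob(3.75, prior = c(0.2, 0.8))

print(c(equal = p_equal, sceptical = p_sceptical))
```

An LR(+) of 1 leaves the prior unchanged, which is a quick check that the conversion behaves as expected.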
Brown, C.D. and Davis, H.T. (2006) Receiver operating characteristics curves and related decision measures: A tutorial. Chemometrics and Intelligent Laboratory Systems 80, 24–38.
Gavin, D.G., Oswald, W.W., Wahl, E.R. and Williams, J.W. (2003) A statistical approach to evaluating distance metrics and analog assignments for pollen records. Quaternary Research 60, 356–367.
Henderson, A.R. (1993) Assessing test accuracy and its clinical consequences: a primer for receiver operating characteristic curve analysis. Annals of Clinical Biochemistry 30, 834–846.
roc for a complete example.