In many fields, decisions and outcomes are categorical even though the underlying phenomena are probably continuous. For example, students are accepted to graduate school or not, and they finish or not. X-rays are read as showing cancer or not. Outcomes of such decisions are usually labeled as Valid Positives, Valid Negatives, False Positives and False Negatives. In hypothesis testing, False Positives are known as Type I errors, while False Negatives are Type II errors. The relationship between these four cells depends upon the correlation between the decision rule and the outcome as well as the level of evidence needed for a decision (the criterion). Signal Detection Theory and Decision Theory have a number of related measures of performance: Accuracy (VP + VN), Sensitivity (VP/(VP + FN)), Specificity (VN/(FP + VN)), d prime (d'), and the area under the Receiver Operating Characteristic curve (AUC). More generally, these are examples of correlations based upon dichotomous data. AUC addresses some of these questions.
AUC(t=NULL,BR=NULL,SR=NULL,Phi=NULL,VP=NULL,labels=NULL,plot="b",zero=TRUE,correct=.5,
col=c("blue","red"))
Phi coefficient of the two by two table
Tetrachoric (latent) coefficient inferred from the two by two table
Biserial correlation of continuous state of world with decision
The observed input (as a check)
Observed values/ total number of observations
prob / rowSums(prob)
percentage of True Positives + True Negatives
VP/(VP + FN)
VN/(FP + VN)
difference of True Positives versus True Negatives
ratio of ordinates at the decision point
t: A 4 x 1 vector or a 2 x 2 table of TP, FP, FN, TN values (see below). May be counts or proportions.
BR: Base Rate of successful outcomes or actual symptom (if t is not specified)
SR: Selection Rate for candidates or diagnoses (if t is not specified)
Phi: The Phi correlation coefficient between the predictor and the outcome variable (if t is not specified)
VP: The number of Valid Positives (selected applicants who succeed; correct diagnoses) (if t and Phi are not specified)
labels: Names of variables 1 and 2
plot: "b" (both), "d" (decision theory), "a" (auc), or "n" (neither)
zero: If TRUE, then the noise distribution is centered at zero
correct: Cell values of 0 are replaced with correct. (See tetrachoric for a discussion of why this is needed.)
col: The color choice for the VP and FP. Defaults to c("blue","red") but could be c("grey","black") to avoid colors
William Revelle
The problem of making binary decisions about the state of the world is ubiquitous. We see this in Null Hypothesis Significance Testing (NHST), medical diagnoses, and selection for occupations. Whether the setting is called NHST, Signal Detection Theory, clinical assessment, or college admissions, all of these domains share the same two x two decision task.
Although the underlying phenomena are probably continuous, a typical decision or diagnostic situation makes dichotomous decisions: Accept or Reject, correctly identified, incorrectly identified. In Signal Detection Theory, the world has two states: Noise versus Signal + Noise. The decision is whether there is a signal or not.
In diagnoses, it is whether to diagnose an illness or not given some noisy signal (e.g., an X-Ray, a set of diagnostic tests).
In college admissions, we accept some students and reject others. Four to five years later we observe who "succeeds" or graduates.
All of these decisions lead to four cells based upon a two x two categorization. Given the true state of the world is Positive or Negative, and a rater assigns positive or negative ratings, then the resulting two by two table has True (Valid) Positives and True (Valid) Negatives on the diagonal and False Positives and False Negatives off the diagonal.
When expressed as percentages of the total, then Base Rates (BR) depend upon the state of the world, but Selection Ratios (SR) are under the control of the person making the decision and affect the number of False Positives and the number of Valid Positives.
Given a two x two table of counts or percentages:
 | Decide + | Decide - | Row total
True + | Valid Positive | False Negative | Base Rate
True - | False Positive | Valid Negative | 1 - Base Rate
Unfortunately, although this way of categorizing the data is typical in assessment (e.g., Wiggins 1973), and everything is expressed as percentages of the total, in some decision papers, VP are expressed as the ratio of VP to total positive decisions (e.g., Wickens, 1984). This requires dividing through by the column totals (and represented as VP* and FP* in the table below).
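The two ways of expressing the cells can be checked in base R. This is a minimal sketch rather than the internals of AUC: it assumes the vector input is ordered TP, FP, FN, TN as in the description of t, uses the Wiggins (p 249) counts from the examples, and all object names are illustrative.

```r
# Counts in the order TP, FP, FN, TN (Wiggins p 249 example)
counts <- c(49, 40, 79, 336)

# matrix() fills by column, so rows are the true state and columns the decision
tab <- matrix(counts, nrow = 2,
              dimnames = list(c("True +", "True -"), c("Decide +", "Decide -")))

prob <- tab / sum(tab)                           # cells as proportions of the total
base_rate <- rowSums(prob)[["True +"]]           # proportion that is truly positive
selection_ratio <- colSums(prob)[["Decide +"]]   # proportion decided positive

# Dividing each column by its total gives the starred quantities
by_decision <- sweep(prob, 2, colSums(prob), "/")
vp_star <- by_decision["True +", "Decide +"]     # VP/(VP + FP)
fp_star <- by_decision["True -", "Decide +"]     # FP/(VP + FP)
```

Because VP* and FP* share the same column total, they sum to 1 by construction.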
The relationships implied by these data can be summarized as a phi or tetrachoric correlation between the raters and the world, or as a decision process with several alternative measures. If we make the assumption that the two dimensions are continuous and were artificially dichotomised, then the tetrachoric correlation is an estimate of the continuous correlation between these two latent dimensions. If we think of the data as truly representing two states (e.g., vaccinated or not vaccinated, dead or alive), then the phi coefficient is more appropriate.
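For the phi coefficient, the calculation from the four cells is short enough to write out. The sketch below uses only base R, assumes the TP, FP, FN, TN ordering of the vector input, and checks the closed form against the Pearson correlation of the equivalent 0/1 data. It does not attempt the tetrachoric correlation, which requires the latent normal model.

```r
counts <- c(49, 40, 79, 336)   # TP, FP, FN, TN (Wiggins p 249 example)
VP <- counts[1]; FP <- counts[2]; FN <- counts[3]; VN <- counts[4]

# Phi from the cell counts: cross-product difference over the
# square root of the product of the four marginal totals
phi_2x2 <- (VP * VN - FP * FN) /
  sqrt((VP + FN) * (FP + VN) * (VP + FP) * (FN + VN))

# The same value as a Pearson correlation of 0/1 codes, one element per case.
# Cell order TP, FP, FN, TN corresponds to truth 1,0,1,0 and decision 1,1,0,0.
truth    <- rep(c(1, 0, 1, 0), times = counts)
decision <- rep(c(1, 1, 0, 0), times = counts)
phi_cor  <- cor(truth, decision)
```

The two values agree, which is the sense in which phi is simply a product moment correlation applied to dichotomous data.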
The alternative measures of the decision process are Sensitivity, Specificity, Accuracy, Area Under the Curve, and d' (d prime). These measures may be defined as
Measure | Definition
Sensitivity | VP/(VP + FN)
Specificity | VN/(FP + VN)
Accuracy | VP + VN
VP* | VP/(VP + FP)
FP* | FP/(VP + FP)
d' | z(VP*) - z(FP*)
d' | sqrt(2) z(AUC)
beta | prob(X/S)/prob(X/N)
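The first three definitions can be applied directly to the two Metz tables used in the examples. The helper below is hypothetical (it is not part of any package) and assumes the TP, FP, FN, TN ordering of the vector input.

```r
# Hypothetical helper: Sensitivity, Specificity and Accuracy from TP, FP, FN, TN
decision_measures <- function(counts) {
  p <- counts / sum(counts)                 # work with proportions of the total
  VP <- p[1]; FP <- p[2]; FN <- p[3]; VN <- p[4]
  c(Sensitivity = VP / (VP + FN),
    Specificity = VN / (FP + VN),
    Accuracy    = VP + VN)
}

metz1 <- decision_measures(c(140, 60, 100, 900))
metz2 <- decision_measures(c(80, 120, 40, 960))
round(rbind(metz1, metz2), 3)
```

The two tables have the same Accuracy but different Sensitivity and Specificity, which is why Accuracy alone is an incomplete summary of a decision rule.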
Although only one point is observed, we can form a graphical display of VP versus FP as a smooth curve as a function of the decision criterion. The smooth curve assumes normality, whereas the other display is merely the two line segments joining the points (0,0), (FP,VP), and (1,1). The resulting correlation between the inferred continuous state of the world and the dichotomous decision process is a biserial correlation.
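The two displays imply two different areas under the curve, which can be compared in base R. This sketch makes assumptions that are not confirmed by the text: the VP and FP rates are taken conditional on the true state (the usual ROC convention), and the smooth curve is the equal-variance normal model. It may differ in detail from what AUC computes and plots.

```r
counts <- c(140, 60, 100, 900)                # TP, FP, FN, TN (Metz example)
hit <- counts[1] / (counts[1] + counts[3])    # VP rate among true positives
fa  <- counts[2] / (counts[2] + counts[4])    # FP rate among true negatives

# Two line segments through (0,0), (fa, hit), (1,1): area by the trapezoid rule
auc_segments <- (1 + hit - fa) / 2

# Equal-variance normal model (an assumption): the curve through the observed
# point is hit(f) = pnorm(d + qnorm(f)), where d separates the two distributions
d   <- qnorm(hit) - qnorm(fa)
roc <- function(f) pnorm(d + qnorm(f))
auc_smooth <- integrate(roc, lower = 0, upper = 1)$value

# Closed form for the same model; inverting it gives d = sqrt(2) * qnorm(AUC)
auc_closed <- pnorm(d / sqrt(2))
```

The numerically integrated area agrees with the closed form, and the smooth curve encloses more area than the two line segments because it bows above them.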
When using table input, the values can be counts and thus greater than 1 or merely probabilities which should add up to 1. Base Rates and Selection Ratios are proportions and thus less than 1.
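When BR, SR and Phi are supplied instead of a table, the four cells follow from the definition of phi for proportions. The sketch below shows that algebra in base R with the values from the Wiggins (p 251) example. It is an illustration of the relationship, not necessarily the way AUC builds its table, and the object names are illustrative.

```r
BR <- 0.05; SR <- 0.254; Phi <- 0.317    # Wiggins p 251 example values

# phi = (VP - BR * SR) / sqrt(BR * (1 - BR) * SR * (1 - SR)), solved for VP
VP <- BR * SR + Phi * sqrt(BR * (1 - BR) * SR * (1 - SR))
FN <- BR - VP              # true positives that were not selected
FP <- SR - VP              # selected cases that are not true positives
VN <- 1 - VP - FP - FN     # remainder of the table

cells <- c(VP = VP, FP = FP, FN = FN, VN = VN)

# Recover phi from the reconstructed cells as a check
phi_check <- (VP * VN - FP * FN) / sqrt(BR * (1 - BR) * SR * (1 - SR))
```

Not every combination of BR, SR and Phi is attainable: if any reconstructed cell is negative, the requested Phi is outside the range the margins allow.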
Metz, C.E. (1978) Basic principles of ROC analysis. Seminars in Nuclear Medicine, 8, 283-298.
Wiggins, Jerry S. (1973) Personality and Prediction: Principles of Personality Assessment. Addison-Wesley.
Wickens, Christopher D. (1984) Engineering Psychology and Human Performance. Merrill.
phi, phi2tetra, Yule, Yule.inv, Yule2phi, tetrachoric and polychoric, comorbidity
AUC(c(30,20,20,30)) #specify the table input
AUC(c(140,60,100,900)) #Metz example with colors
AUC(c(140,60,100,900),col=c("grey","black")) #Metz example 1 no colors
AUC(c(80,120,40, 960)) #Metz example 2 Note how the accuracies are the same but d's differ
AUC(c(49,40,79,336)) #Wiggins p 249
AUC(BR=.05,SR=.254,Phi = .317) #Wiggins 251 extreme Base Rates