MRIaggr (version 1.1.5)

calcAUPRC: Area under the PR curve

Description

Compute the area under the precision-recall (PR) curve by numerical integration.

Usage

calcAUPRC(x, y, subdivisions = 10000, performance = NULL, ci = TRUE, alpha = 0.05, 
          method = "Kronrod", reltol = .Machine$double.eps^0.25)

Arguments

x
the biomarker values. numeric vector. REQUIRED.
y
the class labels. numeric vector, character vector or logical vector. REQUIRED.
subdivisions
the maximum number of subintervals used for the integration. Positive integer. Only used if method="integrate".
performance
an object of class performance can be supplied instead of arguments x and y.
ci
should the confidence interval be computed? logical.
alpha
the type 1 error rate. numeric.
method
the integration method used to compute the area under the curve. Any of "integrate", "Kronrod", "Richardson", "Clenshaw", "Simpson" or "Romberg".
reltol
the relative accuracy requested. Positive numeric.

Value

  • If ci=FALSE, a numeric between 0 and 1. If ci=TRUE, a numeric vector of length 3 containing the point estimate, the lower bound and the upper bound of the confidence interval.

Details

This function requires the ROCR package to be installed. The numerical integration of the precision over the recall values can be performed either with the integrate function of the stats package (if method="integrate") or with the integral function of the pracma package. In the latter case, the method argument defines the integration procedure (see the documentation of integral for more details).

The confidence interval is computed using the first-order delta method and the logistic transformation:

$$CI(AUPRC) = \left[ \frac{e^{\mu_\eta - 1.96 \tau}}{1+ e^{\mu_\eta - 1.96 \tau}} \; ; \; \frac{e^{\mu_\eta + 1.96 \tau}}{1+ e^{\mu_\eta + 1.96 \tau}} \right]$$

$$\mu_\eta = \mathrm{logit}(\widehat{AUPRC})$$

$$\tau = \frac{1}{\sqrt{n \, \widehat{AUPRC} \, (1-\widehat{AUPRC})}}$$

where 1.96 is the standard normal quantile for a 95% interval. See section 3.2 of (Boyd, 2013) for more details.

ARGUMENTS: y must have exactly two levels. If performance is set to NULL, x and y are used to form the performance object.
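
The interval formula above can be sketched in a few lines of base R. This is only an illustration, not the package's internal code: the helper name auprc_logit_ci is hypothetical, n is passed in by the caller because this page does not state what it counts, and replacing the fixed 1.96 with qnorm(1 - alpha / 2) is an assumption about how the alpha argument would enter.

```r
# Hypothetical helper (not part of MRIaggr): logit-scale interval for an AUPRC.
# auprc : point estimate, strictly between 0 and 1
# n     : sample-size term appearing in tau (assumed to be supplied by the caller)
# alpha : type 1 error rate; alpha = 0.05 gives the 1.96 quantile of the formula
auprc_logit_ci <- function(auprc, n, alpha = 0.05) {
  stopifnot(auprc > 0, auprc < 1, n > 0, alpha > 0, alpha < 1)
  mu_eta <- qlogis(auprc)                    # logit of the point estimate
  tau <- 1 / sqrt(n * auprc * (1 - auprc))   # delta-method standard error on the logit scale
  z <- qnorm(1 - alpha / 2)                  # standard normal quantile
  c(estimate = auprc,
    lower = plogis(mu_eta - z * tau),        # back-transform with the logistic function
    upper = plogis(mu_eta + z * tau))
}

ci <- auprc_logit_ci(auprc = 0.8, n = 100)
print(ci)
```

Because the bounds are built on the logit scale and then mapped back with plogis, they always stay inside (0, 1), which a symmetric interval on the original scale would not guarantee.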

References

Kendrick Boyd, Kevin H. Eng, and C. David Page. Area Under the Precision-Recall Curve: Point Estimates and Confidence Intervals. Machine Learning and Knowledge Discovery in Databases, 2013:451-466.

Examples

data(MRIaggr.Pat1_red, package = "MRIaggr")

## select parameter and binary outcome
cartoT2 <- selectContrast(MRIaggr.Pat1_red, param = "T2_FLAIR_t2", format = "vector")
cartoMASK <- selectContrast(MRIaggr.Pat1_red, param = "MASK_T2_FLAIR_t2", format = "vector")

## compute AUPRC
T2.AUPRC <- calcAUPRC(x = cartoT2, y = cartoMASK)

## compute AUC
if (require(pROC)) {
  T2.AUC <- auc(roc(cartoMASK ~ cartoT2))
}

## display
multiplot(MRIaggr.Pat1_red, param = "T2_FLAIR_t2", num = 1,
          index1 = list(coords = "MASK_T2_FLAIR_t2", outline = TRUE))

#### 2- with simulated data ####
n0 <- 1000
n1 <- c(10,100,1000)
for (iter_n in seq_along(n1)) {
  X <- c(rnorm(n0,0),rnorm(n1[iter_n],2))
  Y <- c(rep(0,n0),rep(1,n1[iter_n]))
  print(calcAUPRC(X,Y))
}

## alternative way using a performance object
perfXY <- ROCR::performance(ROCR::prediction(X,Y), x.measure = "rec", measure = "prec")
calcAUPRC(performance = perfXY, subdivisions = 10000)
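
To make the idea of integrating precision over recall concrete without ROCR or pracma, here is a minimal base-R sketch. It uses the trapezoidal rule, not any of the integration methods listed for calcAUPRC, so its result will generally differ from the function's output. The helper name pr_auc_trapezoid is hypothetical, y is assumed to be coded 0/1, and the curve is extended to recall 0 with the first observed precision, which is a choice made for this sketch only.

```r
# Hypothetical helper (not part of MRIaggr): trapezoidal area under the PR curve.
# x : numeric scores, higher values meaning "more likely positive"
# y : class labels coded 0/1 (assumption of this sketch)
pr_auc_trapezoid <- function(x, y) {
  stopifnot(length(x) == length(y), all(y %in% c(0, 1)), any(y == 1), any(y == 0))
  ord <- order(x, decreasing = TRUE)
  pos <- as.logical(y[ord])
  tp <- cumsum(pos)                                 # true positives at each cutoff
  fp <- cumsum(!pos)                                # false positives at each cutoff
  keep <- !duplicated(x[ord], fromLast = TRUE)      # one point per distinct score (handles ties)
  recall <- tp[keep] / sum(pos)
  precision <- tp[keep] / (tp[keep] + fp[keep])
  # extend the curve to recall = 0 using the first observed precision
  recall <- c(0, recall)
  precision <- c(precision[1], precision)
  sum(diff(recall) * (head(precision, -1) + tail(precision, -1)) / 2)
}

set.seed(1)
X_sim <- c(rnorm(1000, 0), rnorm(100, 2))
Y_sim <- c(rep(0, 1000), rep(1, 100))
area_sim <- pr_auc_trapezoid(X_sim, Y_sim)
print(area_sim)
```

A perfectly separated sample, such as scores c(0.9, 0.8, 0.2, 0.1) with labels c(1, 1, 0, 0), gives an area of 1 under this sketch, which is a quick sanity check on the cumulative counts.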
