cutpointr (version 1.1.2)

maximize_metric: Optimize a metric function in binary classification

Description

Given a function for computing a metric in metric_func, these functions maximize or minimize that metric by selecting an optimal cutpoint. The metric function should accept the following inputs:

  • tp: vector of number of true positives

  • fp: vector of number of false positives

  • tn: vector of number of true negatives

  • fn: vector of number of false negatives
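As an illustration of the interface described above, the following minimal sketch defines a metric function in base R. The name my_accuracy and the toy count vectors are hypothetical and not part of the package. The trailing ... argument and the plain numeric return value are assumptions about what such a function would need in order to receive further arguments; the package documentation for metric functions is the authority on the exact requirements.

```r
# Hypothetical metric function: fraction of correctly classified observations.
# It takes the four count vectors and returns one metric value per cutpoint.
my_accuracy <- function(tp, fp, tn, fn, ...) {
  (tp + tn) / (tp + fp + tn + fn)
}

# Toy counts for six candidate cutpoints (made-up values for illustration)
tp <- c(0, 1, 2, 2, 3, 3)
fp <- c(0, 0, 0, 1, 2, 3)
tn <- c(3, 3, 3, 2, 1, 0)
fn <- c(3, 2, 1, 1, 0, 0)

acc <- my_accuracy(tp, fp, tn, fn)
best_index <- which.max(acc)  # position of the cutpoint with the highest metric
```

Because the function is vectorized over its inputs, a single call evaluates the metric at every candidate cutpoint.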

Usage

maximize_metric(
  data,
  x,
  class,
  metric_func = youden,
  pos_class = NULL,
  neg_class = NULL,
  direction,
  tol_metric,
  use_midpoints,
  ...
)

minimize_metric(
  data,
  x,
  class,
  metric_func = youden,
  pos_class = NULL,
  neg_class = NULL,
  direction,
  tol_metric,
  use_midpoints,
  ...
)

Arguments

data

A data frame or tibble in which the columns that are given in x and class can be found.

x

(character) The variable name to be used for classification, e.g. predictions or test values.

class

(character) The variable name indicating class membership.

metric_func

(function) A function that computes the metric to be maximized (maximize_metric) or minimized (minimize_metric). See Description for the inputs it should accept.

pos_class

The value of class that indicates the positive class.

neg_class

The value of class that indicates the negative class.

direction

(character) Use ">=" or "<=" to select whether an x value >= or <= the cutoff predicts the positive class.

tol_metric

All cutpoints will be returned that lead to a metric value in the interval [m_max - tol_metric, m_max + tol_metric] where m_max is the maximum achievable metric value. This can be used to return multiple decent cutpoints and to avoid floating-point problems.
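The tolerance interval can be sketched in base R as follows. This is an illustration of the selection rule as stated above, not the package's implementation; the helper select_within_tol and the toy values are hypothetical.

```r
# Hypothetical helper: keep every cutpoint whose metric value lies within
# tol_metric of the maximum achievable metric value m_max.
select_within_tol <- function(cutpoints, metric, tol_metric) {
  m_max <- max(metric)
  # The upper bound m_max + tol_metric holds automatically because no
  # metric value exceeds m_max, so only the lower bound needs checking.
  cutpoints[metric >= m_max - tol_metric]
}

# Toy values: two cutpoints are tied up to floating-point noise
cutpoints <- c(5, 4, 3, 2, 1)
metric    <- c(0.70, 0.80 + 1e-12, 0.80, 0.78, 0.60)

tight <- select_within_tol(cutpoints, metric, tol_metric = 1e-9)
loose <- select_within_tol(cutpoints, metric, tol_metric = 0.05)
```

With the small tolerance only the two numerically tied cutpoints are kept, whereas the larger tolerance also admits a cutpoint whose metric value is slightly lower.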

use_midpoints

(logical) If TRUE (default FALSE), the returned optimal cutpoint will be the mean of the optimal cutpoint and the next highest observation (for direction = ">=") or the next lowest observation (for direction = "<="), which avoids biasing the optimal cutpoint.

...

Further arguments that will be passed to metric_func.

Value

A tibble with the columns optimal_cutpoint, the corresponding metric value, and roc_curve. The roc_curve column is a nested tibble that contains all possible cutoffs together with the corresponding numbers of true and false positives and negatives and the corresponding metric values.

Details

The above inputs are arrived at by using all unique values in x, Inf, or -Inf as possible cutpoints for classifying the variable in class.
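The following base R sketch illustrates how such counts can be derived from candidate cutpoints. It is not the package's internal code. The toy data are made up, and adding Inf (rather than -Inf) for direction ">=" is an assumption made here so that one candidate classifies no observation as positive.

```r
# Toy data (hypothetical values, for illustration only)
x         <- c(1, 2, 2, 3, 4, 5)
class     <- c("no", "no", "yes", "no", "yes", "yes")
pos_class <- "yes"

# Candidate cutpoints: all unique x values plus Inf (assumed for ">=")
cutpoints <- sort(unique(c(x, Inf)), decreasing = TRUE)
is_pos    <- class == pos_class

# For each cutpoint, predict positive when x >= cutpoint and count outcomes
counts <- t(sapply(cutpoints, function(cp) {
  pred_pos <- x >= cp
  c(tp = sum(pred_pos & is_pos),
    fp = sum(pred_pos & !is_pos),
    tn = sum(!pred_pos & !is_pos),
    fn = sum(!pred_pos & is_pos))
}))

# One row per candidate cutpoint; the columns tp, fp, tn, fn could then be
# passed as vectors to a metric function
confusion <- data.frame(cutpoint = cutpoints, counts)
```

Each row of confusion sums to the number of observations, since every observation falls into exactly one of the four cells at every cutpoint.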

See Also

Other method functions: maximize_boot_metric(), maximize_gam_metric(), maximize_loess_metric(), maximize_spline_metric(), oc_manual(), oc_mean(), oc_median(), oc_youden_kernel(), oc_youden_normal()

Examples

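# NOT RUN {
library(cutpointr)

# Maximize accuracy over the candidate cutpoints of dsi
cutpointr(suicide, dsi, suicide, method = maximize_metric, metric = accuracy)
# Minimize the absolute difference between sensitivity and specificity
cutpointr(suicide, dsi, suicide, method = minimize_metric, metric = abs_d_sens_spec)
# }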