
cutpointr (version 1.1.2)

maximize_loess_metric: Optimize a metric function in binary classification after LOESS smoothing

Description

Given a function for computing a metric in metric_func, these functions smooth the function of metric value per cutpoint using LOESS, then maximize or minimize the metric by selecting an optimal cutpoint. For further details on the LOESS smoothing see ?fANCOVA::loess.as. The metric function should accept the following inputs:

  • tp: vector of number of true positives

  • fp: vector of number of false positives

  • tn: vector of number of true negatives

  • fn: vector of number of false negatives
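As an illustration of this interface, a custom metric might look like the following minimal sketch. The function name my_youden is hypothetical and the counts are made up. Returning a one-column matrix named after the metric is an assumption about the convention the package's built-in metrics follow, so check the package documentation before relying on it.

```r
# Hypothetical custom metric: Youden's J = sensitivity + specificity - 1.
# Accepts vectors of confusion-matrix counts, one element per cutpoint.
my_youden <- function(tp, fp, tn, fn, ...) {
  sens <- tp / (tp + fn)
  spec <- tn / (tn + fp)
  res <- matrix(sens + spec - 1, ncol = 1)
  colnames(res) <- "my_youden"  # assumed naming convention
  res
}

# Made-up counts for three candidate cutpoints
my_youden(tp = c(10, 8, 5), fp = c(6, 2, 0), tn = c(4, 8, 10), fn = c(0, 2, 5))
```

Because the inputs are vectors, the function is evaluated once for all cutpoints rather than once per cutpoint.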

Usage

maximize_loess_metric(
  data,
  x,
  class,
  metric_func = youden,
  pos_class = NULL,
  neg_class = NULL,
  direction,
  criterion = "aicc",
  degree = 1,
  family = "symmetric",
  user.span = NULL,
  tol_metric,
  use_midpoints,
  ...
)

minimize_loess_metric(
  data,
  x,
  class,
  metric_func = youden,
  pos_class = NULL,
  neg_class = NULL,
  direction,
  criterion = "aicc",
  degree = 1,
  family = "symmetric",
  user.span = NULL,
  tol_metric,
  use_midpoints,
  ...
)

Arguments

data

A data frame or tibble in which the columns that are given in x and class can be found.

x

(character) The variable name to be used for classification, e.g. predictions or test values.

class

(character) The variable name indicating class membership.

metric_func

(function) A function that computes a metric to be maximized or minimized. See description.

pos_class

The value of class that indicates the positive class.

neg_class

The value of class that indicates the negative class.

direction

(character) Use ">=" or "<=" to select whether an x value >= or <= the cutoff predicts the positive class.

criterion

The criterion for automatic smoothing parameter selection: "aicc" denotes the bias-corrected AIC criterion, "gcv" denotes generalized cross-validation.

degree

The degree of the local polynomials to be used. It can be 0, 1 or 2.

family

If "gaussian", fitting is by least squares; if "symmetric", a redescending M-estimator is used with Tukey's biweight function.

user.span

The user-defined parameter that controls the degree of smoothing.

tol_metric

All cutpoints will be returned that lead to a metric value in the interval [m_max - tol_metric, m_max + tol_metric] where m_max is the maximum achievable metric value. This can be used to return multiple decent cutpoints and to avoid floating-point problems.

use_midpoints

(logical) If TRUE (default FALSE), the returned optimal cutpoint will be the mean of the optimal cutpoint and the next highest observation (for direction = ">=") or the next lowest observation (for direction = "<="), which avoids biasing the optimal cutpoint.

...

Further arguments that will be passed to metric_func or the loess smoother.
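The smoothing-related arguments can be pictured with base R alone. The sketch below is not the package's implementation: it uses stats::loess with a fixed span (as if user.span were supplied) instead of the automatic span selection of fANCOVA::loess.as, and the metric values are simulated. It smooths a noisy metric curve over the cutpoints and then keeps every cutpoint whose smoothed metric lies within tol_metric of the maximum.

```r
set.seed(1)
# Simulated metric values per cutpoint (illustrative only, not real data)
cutpoints <- seq(0, 10, by = 0.25)
metric <- 0.8 - 0.02 * (cutpoints - 4)^2 + rnorm(length(cutpoints), sd = 0.03)

# Fixed-span LOESS fit; degree and family mirror the documented defaults
fit <- stats::loess(metric ~ cutpoints, span = 0.7, degree = 1,
                    family = "symmetric")
smoothed <- stats::predict(fit)

# Keep every cutpoint whose smoothed metric is within tol_metric of the maximum
tol_metric <- 0.01
m_max <- max(smoothed)
optimal <- cutpoints[smoothed >= m_max - tol_metric]
optimal
```

A larger span gives a flatter smoothed curve, so more cutpoints tend to fall inside the tolerance band; tol_metric = 0 would keep only the cutpoints at the exact maximum.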

Value

A tibble with the columns optimal_cutpoint, the corresponding metric value and roc_curve, a nested tibble that includes all possible cutoffs and the corresponding numbers of true and false positives / negatives and all corresponding metric values.

Details

The above inputs are arrived at by using all unique values in x, Inf, and -Inf as possible cutpoints for classifying the variable in class.
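How such count vectors arise can be sketched in base R. This is an illustrative reconstruction under stated assumptions, not the package's internal code: the helper name count_confusion is hypothetical, the toy data are made up, and direction = ">=" is assumed.

```r
# Hypothetical helper: confusion counts per candidate cutpoint,
# assuming direction = ">=" (x >= cutpoint predicts the positive class).
count_confusion <- function(x, class, pos_class) {
  # Candidate cutpoints: all unique values of x plus Inf and -Inf
  cutpoints <- sort(unique(c(x, Inf, -Inf)), decreasing = TRUE)
  is_pos <- class == pos_class
  counts <- t(sapply(cutpoints, function(cp) {
    pred_pos <- x >= cp
    c(tp = sum(pred_pos & is_pos), fp = sum(pred_pos & !is_pos),
      tn = sum(!pred_pos & !is_pos), fn = sum(!pred_pos & is_pos))
  }))
  data.frame(cutpoint = cutpoints, counts)
}

# Toy data (made up for illustration)
x <- c(1, 2, 2, 3, 5)
class <- c("no", "no", "yes", "yes", "yes")
count_confusion(x, class, pos_class = "yes")
```

The tp, fp, tn and fn columns of the result are vectors of the kind passed to metric_func, with one element per candidate cutpoint.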

See Also

Other method functions: maximize_boot_metric(), maximize_gam_metric(), maximize_metric(), maximize_spline_metric(), oc_manual(), oc_mean(), oc_median(), oc_youden_kernel(), oc_youden_normal()

Examples

# NOT RUN {
library(cutpointr)
# Maximize LOESS-smoothed accuracy, with gender as the subgroup variable
oc <- cutpointr(suicide, dsi, suicide, gender, method = maximize_loess_metric,
                criterion = "aicc", family = "symmetric", degree = 2,
                user.span = 0.7, metric = accuracy)
plot_metric(oc)
# Minimize LOESS-smoothed misclassification cost (false negatives cost 10 times more)
oc <- cutpointr(suicide, dsi, suicide, gender, method = minimize_loess_metric,
                criterion = "aicc", family = "symmetric", degree = 2,
                user.span = 0.7, metric = misclassification_cost,
                cost_fp = 1, cost_fn = 10)
plot_metric(oc)
# }
