This function is equivalent to cutpointr but takes only quoted arguments for x, class and subgroup. This was useful before cutpointr supported tidyeval.
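Quoted arguments are most useful when the column names are only known as strings, for example when they are read from a configuration or iterated over in a loop. The following is a minimal sketch of that situation using the suicide data shipped with the package. The variable names predictor and outcome are illustrative, and the comparison assumes both functions return a result with an optimal_cutpoint column.

```r
library(cutpointr)
data(suicide)

# Column names held as strings (illustrative variable names)
predictor <- "dsi"
outcome <- "suicide"

# Quoted interface: pass the strings directly
oc_quoted <- cutpointr_(suicide, predictor, outcome)

# Unquoted interface: the same analysis with bare column names
oc_unquoted <- cutpointr(suicide, dsi, suicide)

# Both calls are expected to select the same cutpoint
oc_quoted$optimal_cutpoint
oc_unquoted$optimal_cutpoint
```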
cutpointr_(
data,
x,
class,
subgroup = NULL,
method = maximize_metric,
metric = sum_sens_spec,
pos_class = NULL,
neg_class = NULL,
direction = NULL,
boot_runs = 0,
boot_stratify = FALSE,
use_midpoints = FALSE,
break_ties = median,
na.rm = FALSE,
allowParallel = FALSE,
silent = FALSE,
tol_metric = 1e-06,
...
)
data: A data.frame with the data needed for x, class and optionally subgroup.
x: (character) The variable name to be used for classification, e.g. predictions or test values.
class: (character) The variable name indicating class membership.
subgroup: (character) The variable name of an additional covariate that identifies subgroups. Separate optimal cutpoints will be determined per group.
method: (function) A function for determining cutpoints. Can be user-supplied or one of the built-in methods. See details.
metric: (function) The function for computing a metric when using maximize_metric or minimize_metric as method, and for the out-of-bag values during bootstrapping. This provides a way of internally validating the performance. User-defined functions can be supplied, see details.
pos_class: (optional) The value of class that indicates the positive class.
neg_class: (optional) The value of class that indicates the negative class.
direction: (character, optional) Use ">=" or "<=" to indicate whether x is supposed to be larger or smaller for the positive class.
boot_runs: (numerical) If positive, this number of bootstrap samples will be used to assess the variability and the out-of-sample performance.
boot_stratify: (logical) If the bootstrap is stratified, bootstrap samples are drawn separately in both classes and then combined, keeping the proportion of positives and negatives constant in every resample.
use_midpoints: (logical) If TRUE (default FALSE) the returned optimal cutpoint will be the mean of the optimal cutpoint and the next highest observation (for direction = ">=") or the next lowest observation (for direction = "<="), which avoids biasing the optimal cutpoint.
break_ties: If multiple cutpoints are found, they can be summarized using this function, e.g. mean or median. To return all cutpoints use c as the function.
na.rm: (logical) Set to TRUE (default FALSE) to keep only complete cases of x, class and subgroup (if specified). Missing values with na.rm = FALSE will raise an error.
allowParallel: (logical) If TRUE, the bootstrapping will be parallelized using foreach. A local cluster, for example, should be started manually beforehand.
silent: (logical) If TRUE suppresses all messages.
tol_metric: All cutpoints will be returned that lead to a metric value in the interval [m_max - tol_metric, m_max + tol_metric], where m_max is the maximum achievable metric value. This can be used to return multiple decent cutpoints and to avoid floating-point problems. Not supported by all method functions, see details.
...: Further optional arguments that will be passed to method. minimize_metric and maximize_metric pass ... to metric.
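The interaction between the metric tolerance and the tie-breaking function can be illustrated in base R. This is only a sketch of the idea described above, not the package's internal implementation. The candidate cutpoints and metric values are made-up numbers chosen so that two cutpoints differ by less than the tolerance.

```r
# Made-up candidate cutpoints and metric values (illustration only)
cutpoints <- c(1, 2, 3, 4, 5)
metric_values <- c(1.20, 1.50, 1.50 - 1e-9, 1.10, 0.90)

tol_metric <- 1e-06
m_max <- max(metric_values)

# Keep every cutpoint whose metric value lies within tol_metric of the maximum
near_optimal <- cutpoints[abs(metric_values - m_max) <= tol_metric]

# Summarize the tied cutpoints with a function such as median ...
median(near_optimal)

# ... or keep all of them by using c
c(near_optimal)
```

With a strict comparison against the maximum, the third cutpoint would be dropped even though its metric value differs only by floating-point noise. The tolerance keeps both candidates so that the tie-breaking function decides the result.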
# NOT RUN {
library(cutpointr)
## Optimal cutpoint for dsi
data(suicide)
opt_cut <- cutpointr_(suicide, "dsi", "suicide")
opt_cut
summary(opt_cut)
plot(opt_cut)
predict(opt_cut, newdata = data.frame(dsi = 0:5))
# }
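The subgroup and bootstrap arguments can be combined in the same quoted style. The sketch below assumes that the suicide data contains a gender column with two levels and that "yes" marks the positive class. The number of bootstrap runs is kept small purely to make the example quick, and the seed value is arbitrary.

```r
library(cutpointr)
data(suicide)

# Fix the random seed so the bootstrap resamples are reproducible
set.seed(100)

# Separate cutpoints per gender, validated with a stratified bootstrap
oc_sub <- cutpointr_(
  suicide, "dsi", "suicide",
  subgroup = "gender",
  pos_class = "yes",
  direction = ">=",
  boot_runs = 50,
  boot_stratify = TRUE
)

# One row per subgroup, each with its own optimal cutpoint
oc_sub$subgroup
oc_sub$optimal_cutpoint
```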