DecisionCurve (version 1.4)

decision_curve: Calculate decision curves

Description

This function calculates decision curves, which are estimates of the standardized net benefit as a function of the probability threshold used to categorize observations as 'high risk.' Curves can be estimated using data from an observational cohort (default), or from case-control studies when an estimate of the population outcome prevalence is available. Bootstrap confidence intervals are calculated as well. Once this function is called, use plot_decision_curve or summary to plot or view the curves, respectively.
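
As a rough illustration of the quantity being estimated, the sketch below computes net benefit and standardized net benefit at a few thresholds from a vector of predicted risks. It assumes the usual opt-in definitions, NB = TPR*rho - (t/(1-t))*FPR*(1-rho) and sNB = NB/rho, and it classifies an observation as high risk when its risk is strictly greater than the threshold. The helper name net_benefit_sketch and the toy data are hypothetical and are not part of the package; the package's internal calculation may differ in detail.

```r
# Hypothetical helper (not part of the package): opt-in net benefit
# and standardized net benefit at each threshold.
net_benefit_sketch <- function(outcome, risk, thresholds) {
  rho <- mean(outcome == 1)                  # outcome prevalence in the sample
  rows <- lapply(thresholds, function(thr) {
    high <- risk > thr                       # assumption: strict inequality
    tpr <- mean(high[outcome == 1])          # true positive rate
    fpr <- mean(high[outcome == 0])          # false positive rate
    nb  <- tpr * rho - (thr / (1 - thr)) * fpr * (1 - rho)
    data.frame(threshold = thr, TPR = tpr, FPR = fpr, NB = nb, sNB = nb / rho)
  })
  do.call(rbind, rows)
}

# Toy data: 3 cases and 7 controls with made-up risks
outcome <- c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0)
risk    <- c(0.8, 0.6, 0.2, 0.7, 0.3, 0.2, 0.1, 0.1, 0.05, 0.05)
res <- net_benefit_sketch(outcome, risk, thresholds = c(0.1, 0.25, 0.5))
print(res)
```

Dividing by rho rescales net benefit so that its maximum possible value is 1, which is why the curves are reported on the standardized scale.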

Usage

decision_curve(formula, data, family = binomial(link = "logit"),
  policy = c("opt-in", "opt-out"), fitted.risk = FALSE,
  thresholds = seq(0, 1, by = 0.01), confidence.intervals = 0.95,
  bootstraps = 500, study.design = c("cohort", "case-control"),
  population.prevalence)

Arguments

formula

an object of class 'formula' of the form outcome ~ predictors, giving the prediction model to be fitted using glm. The outcome must be a binary variable that equals '1' for cases and '0' for controls.

data

data.frame containing outcome and predictors. Missing data on any of the predictors will cause the entire observation to be removed.

family

a description of the error distribution and link function to pass to 'glm' used for model fitting. Defaults to binomial(link = 'logit') for logistic regression.

policy

Either 'opt-in' (default) or 'opt-out', describing the type of policy for which to report the net benefit. A policy is 'opt-in' when the standard-of-care for a population is to assign a particular 'treatment' to no one. Clinicians then use a risk model to categorize patients as 'high-risk', with the recommendation to treat high-risk patients with some intervention. Alternatively, an 'opt-out' policy is applicable to contexts where the standard-of-care is to recommend a treatment to an entire patient population. The potential use of a risk model in this setting is to identify patients who are 'low-risk' and recommend that those patients 'opt-out' of treatment.
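
The difference between the two policies can be sketched numerically. The block below assumes one common formalization: for 'opt-in', NB = TPR*rho - (t/(1-t))*FPR*(1-rho), standardized by rho; for 'opt-out', NB = TNR*(1-rho) - ((1-t)/t)*FNR*rho, standardized by 1-rho. These formulas, the helper name nb_by_policy, and the toy data are assumptions for illustration and are not taken from the package source.

```r
# Hypothetical helper (not part of the package): net benefit at one
# threshold under an assumed opt-in or opt-out formulation.
nb_by_policy <- function(outcome, risk, thr, policy = c("opt-in", "opt-out")) {
  policy <- match.arg(policy)
  rho  <- mean(outcome == 1)
  high <- risk > thr
  tpr  <- mean(high[outcome == 1])
  fpr  <- mean(high[outcome == 0])
  if (policy == "opt-in") {
    # benefit: cases correctly flagged; cost: controls treated unnecessarily
    nb <- tpr * rho - (thr / (1 - thr)) * fpr * (1 - rho)
    c(NB = nb, sNB = nb / rho)
  } else {
    # benefit: controls correctly spared; cost: cases who opt out of treatment
    nb <- (1 - fpr) * (1 - rho) - ((1 - thr) / thr) * (1 - tpr) * rho
    c(NB = nb, sNB = nb / (1 - rho))
  }
}

outcome <- c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0)
risk    <- c(0.8, 0.6, 0.2, 0.7, 0.3, 0.2, 0.1, 0.1, 0.05, 0.05)
opt_in  <- nb_by_policy(outcome, risk, thr = 0.5, policy = "opt-in")
opt_out <- nb_by_policy(outcome, risk, thr = 0.5, policy = "opt-out")
print(rbind(opt_in, opt_out))
```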

fitted.risk

logical (default FALSE) indicating whether the predictor provided is a vector of estimated risks from an already established model. If set to TRUE, no model fitting will be done and all estimates will be conditional on the risks provided. Risks must fall between 0 and 1.
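
A minimal sketch of how a risk column might be prepared and passed with fitted.risk = TRUE follows. The simulated data frame, its column names, and the coefficients are hypothetical. The call to decision_curve is guarded so the block still runs when the package is not installed; the package name DecisionCurve is taken from the header of this page. In practice the risks would come from an established model rather than one fitted to the same data, which gives optimistic estimates.

```r
# Simulated data for illustration only; not one of the package datasets.
set.seed(1)
n <- 200
dat <- data.frame(Age = rnorm(n, 60, 10), Smokes = rbinom(n, 1, 0.3))
dat$Cancer <- rbinom(n, 1, plogis(-6 + 0.07 * dat$Age + 0.8 * dat$Smokes))

# Stand-in for an "already established" model: predicted risks on [0, 1]
fit <- glm(Cancer ~ Age + Smokes, data = dat, family = binomial(link = "logit"))
dat$risk <- predict(fit, type = "response")

# Pass the risk column as the single predictor; no model is refit.
if (requireNamespace("DecisionCurve", quietly = TRUE)) {
  dc <- DecisionCurve::decision_curve(Cancer ~ risk,
                                      data = dat,
                                      fitted.risk = TRUE,
                                      thresholds = seq(0, 0.4, by = 0.01),
                                      bootstraps = 25)
}
```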

thresholds

Numeric vector of high risk thresholds to use when plotting and calculating net benefit values.

confidence.intervals

Numeric (default 0.95 for 95% confidence bands) level of bootstrap confidence intervals to plot. Set as NA or 'none' to remove confidence intervals. See details for more information.

bootstraps

Number of bootstrap replicates to use to calculate confidence intervals (default 500).

study.design

Either 'cohort' (default) or 'case-control' describing the study design used to obtain data. See details for more information.

population.prevalence

Outcome prevalence rate in the population used to calculate decision curves when study.design = 'case-control'.
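
To illustrate why an external prevalence is needed for case-control data, the sketch below estimates TPR among cases and FPR among controls, then combines them with a supplied population prevalence instead of the sample case fraction. The helper name nb_case_control and the toy data are hypothetical, the opt-in net benefit formula NB = TPR*rho - (t/(1-t))*FPR*(1-rho) is assumed, and the risks are assumed to be already calibrated to the population; the package may handle these steps differently.

```r
# Hypothetical helper (not part of the package): opt-in net benefit for
# case-control data using an externally supplied prevalence.
nb_case_control <- function(outcome, risk, thr, population.prevalence) {
  high <- risk > thr
  tpr  <- mean(high[outcome == 1])   # estimable from cases alone
  fpr  <- mean(high[outcome == 0])   # estimable from controls alone
  rho  <- population.prevalence      # NOT mean(outcome): sampling fixes that ratio
  nb   <- tpr * rho - (thr / (1 - thr)) * fpr * (1 - rho)
  c(NB = nb, sNB = nb / rho)
}

# Toy sample with 5 cases and 5 controls (sample case fraction 0.5)
outcome <- c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0)
risk    <- c(0.9, 0.7, 0.6, 0.3, 0.2, 0.6, 0.3, 0.2, 0.1, 0.1)
cc <- nb_case_control(outcome, risk, thr = 0.1, population.prevalence = 0.11)
print(cc)
```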

Value

List with components

  • derived.data: A data frame in long form showing the following for each predictor and each 'threshold': 'FPR': false positive rate, 'TPR': true positive rate, 'NB': net benefit, 'sNB': standardized net benefit, 'rho': outcome prevalence, 'prob.high.risk': percent of the population considered high risk, 'DP': detection probability = TPR*rho, 'model': name of prediction model or 'all' or 'none', 'cost.benefit.ratio', and 'xx_lower', 'xx_upper': the lower and upper confidence bands for all measures (if calculated).

  • confidence.intervals: Level of confidence intervals returned.

  • call: matched function call.

Details

Confidence intervals for (standardized) net benefit are calculated pointwise at each risk threshold. When data come from an observational cohort, bootstrap sampling is done without stratifying on outcome, so disease prevalence varies across bootstrap samples. For case-control data, bootstrap sampling is stratified on outcome, so the numbers of cases and controls are the same in every bootstrap sample.
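
The two resampling schemes can be sketched as follows for a single threshold. The helper name boot_snb_ci, the use of percentile intervals, and the simulated data are assumptions made for illustration; the package's own bootstrap may differ, and for real case-control data the prevalence would come from population.prevalence rather than from the sample.

```r
# Hypothetical helper (not part of the package): pointwise bootstrap
# interval for standardized net benefit at one threshold.
boot_snb_ci <- function(outcome, risk, thr, B = 200, level = 0.95,
                        stratified = FALSE) {
  snb <- function(y, r) {
    rho  <- mean(y == 1)
    high <- r > thr
    nb   <- mean(high[y == 1]) * rho -
      (thr / (1 - thr)) * mean(high[y == 0]) * (1 - rho)
    nb / rho
  }
  # Resample a vector of indices with replacement (safe for length-1 input)
  resample <- function(ix) ix[sample.int(length(ix), replace = TRUE)]
  cases    <- which(outcome == 1)
  controls <- which(outcome == 0)
  stats <- replicate(B, {
    idx <- if (stratified) {
      c(resample(cases), resample(controls))   # case/control counts fixed
    } else {
      resample(seq_along(outcome))             # prevalence varies per sample
    }
    snb(outcome[idx], risk[idx])
  })
  alpha <- 1 - level
  # Assumption: percentile interval from the bootstrap distribution
  quantile(stats, probs = c(alpha / 2, 1 - alpha / 2), na.rm = TRUE)
}

set.seed(42)
n <- 300
risk    <- runif(n, 0, 0.6)
outcome <- rbinom(n, 1, risk)
ci_cohort <- boot_snb_ci(outcome, risk, thr = 0.2, B = 200)
ci_strat  <- boot_snb_ci(outcome, risk, thr = 0.2, B = 200, stratified = TRUE)
print(rbind(ci_cohort, ci_strat))
```

Repeating this at every value in thresholds gives pointwise bands, which do not guarantee simultaneous coverage of the whole curve.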

See Also

summary.decision_curve, cv_decision_curve, Add_CostBenefit_Axis

Examples

#helper function
expit <- function(xx) exp(xx)/ (1+exp(xx))

#load simulated cohort data
data(dcaData)
baseline.model <- decision_curve(Cancer~Age + Female + Smokes,
                                data = dcaData,
                                thresholds = seq(0, .4, by = .01),
                                study.design = 'cohort',
                                bootstraps = 10) #number of bootstraps should be higher

full.model <- decision_curve(Cancer~Age + Female + Smokes + Marker1 + Marker2,
                            data = dcaData,
                            thresholds = seq(0, .4, by = .01),
                            bootstraps = 10)

#simulated case-control data with same variables as above
data(dcaData_cc)

table(dcaData_cc$Cancer)

#estimated from the population where the
#case-control sample comes from.
population.rho = 0.11

full.model_cc <- decision_curve(Cancer~Age + Female + Smokes + Marker1 + Marker2,
                               data = dcaData_cc, #use the case-control sample loaded above
                               thresholds = seq(0, .4, by = .01),
                               bootstraps = 10,
                               study.design = 'case-control',
                               population.prevalence = population.rho)

#estimate the net benefit for an 'opt-out' policy.
nb.opt.out  <- decision_curve(Cancer~Age + Female + Smokes + Marker1 + Marker2,
                            data = dcaData,
                            policy = 'opt-out',
                            thresholds = seq(0, .4, by = .01),
                            bootstraps = 10)

