precmed

Overview

precmed was developed to help researchers implement precision medicine in R. A key objective of precision medicine is to determine the optimal treatment for each patient individually instead of applying a common treatment to all patients. Personalizing treatment decisions becomes particularly relevant when treatment response differs across patients, or when patients have different preferences about benefits and harms. This package offers statistical methods to develop and validate prediction models for estimating individualized treatment effects. These treatment effects are also known as conditional average treatment effects (CATEs) and describe how different subgroups of patients respond to the same treatment. Presently, precmed focuses on the personalization of two competing treatments using randomized data from a clinical trial (Zhao et al. 2013) or real-world data (RWD) from a non-randomized study (Yadlowsky et al. 2020).

Installation

The precmed package can be installed from CRAN as follows:

install.packages("precmed")

The latest version can be installed from GitHub as follows:

install.packages("devtools")
devtools::install_github(repo = "smartdata-analysis-and-statistics/precmed")

Package capabilities

The main functions in the precmed package are:

Function     Description
catefit()    Estimation of the conditional average treatment effect (CATE)
atefit()     Doubly robust estimator for the average treatment effect (ATE)
catecv()     Development and cross-validation of the CATE
abc()        Compute the area between the average treatment difference curve of competing models for the CATE (Zhao et al. 2013)
plot()       Two side-by-side line plots of validation curves from the precmed object
boxplot()    Plot the proportion of subjects with an estimated treatment effect no less than $c$ over a range of values for $c$ (Zhao et al. 2013)

For more info: https://smartdata-analysis-and-statistics.github.io/precmed/
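
To make the idea behind a doubly robust ATE estimator with bootstrap variability more concrete, here is a minimal base-R sketch. It is not the implementation used by atefit(): it uses the generic augmented inverse probability weighting (AIPW) estimator for a mean difference on simulated data, and all names (dat, aipw_ate, x, y, trt) are hypothetical.

```r
# Simulated observational data; the true average treatment effect is 2
set.seed(1)
n   <- 2000
x   <- rnorm(n)                              # a single confounder
trt <- rbinom(n, 1, plogis(0.5 * x))         # treatment depends on x
y   <- 1 + 2 * trt + x + rnorm(n)            # continuous outcome
dat <- data.frame(y = y, trt = trt, x = x)

# Generic AIPW estimator of the mean difference (illustration only)
aipw_ate <- function(d) {
  # Propensity score model: probability of treatment given x
  ps <- fitted(glm(trt ~ x, family = binomial, data = d))
  # Outcome models fitted separately in each treatment arm
  fit1 <- lm(y ~ x, data = d, subset = trt == 1)
  fit0 <- lm(y ~ x, data = d, subset = trt == 0)
  m1 <- predict(fit1, newdata = d)
  m0 <- predict(fit0, newdata = d)
  # Outcome-model contrast plus inverse-probability-weighted residuals
  mean(m1 - m0 +
         d$trt * (d$y - m1) / ps -
         (1 - d$trt) * (d$y - m0) / (1 - ps))
}

ate_hat <- aipw_ate(dat)

# Nonparametric bootstrap: resample patients and re-estimate
boot_est <- replicate(200, aipw_ate(dat[sample(nrow(dat), replace = TRUE), ]))
ci <- quantile(boot_est, c(0.025, 0.975))

print(ate_hat)
print(ci)
```

The estimator remains consistent if either the propensity score model or the outcome models are correctly specified, which is what "doubly robust" refers to.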

Recommended workflow

We recommend the following workflow to develop a model for estimating the CATE in order to identify treatment effect heterogeneity:

  1. Compare up to five modelling approaches (e.g., Poisson regression, boosting) for estimating the CATE using cross-validation through catecv().
  2. Select the best modelling approach using three metrics:
    • Compare the steepness of the validation curves in the validation samples across methods using plot(). Two side-by-side plots are generated, visualizing the estimated average treatment effects in a series of nested subgroups. The left plot shows the curves for the training set, and the right plot shows the curves for the validation set. Each line in the plots represents one scoring method (e.g., boosting, randomForest) specified under the argument score.method.
    • Compute the area between curves (ABC) using abc(), which quantifies a model's ability to capture treatment effect heterogeneity. Higher ABC values are preferable because they indicate that the scoring method captures more treatment effect heterogeneity.
    • Compare the distribution of the estimated ATE across different levels of the CATE score percentiles using boxplot().
  3. Apply the best modelling approach in the original data or in a new external dataset using catefit().
  4. Optionally, use atefit() to estimate the ATE between the two treatment groups with a doubly robust estimator, and estimate the variability of the ATE with a bootstrap approach.
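
The steps above can be sketched as follows for a count outcome. This is a hedged outline rather than a verified recipe: it assumes the bundled countExample data contain columns named y, trt, years, age, female and previous_treatment, and the arguments higher.y, cv.n, n.boot, seed and verbose are recalled from the package documentation rather than from this README, so check them against ?catecv, ?catefit and ?atefit before use.

```r
# Assumed column names of countExample (verify with str(countExample))
cate_model <- y ~ age + female + previous_treatment + offset(log(years))
ps_model   <- trt ~ age + previous_treatment
methods    <- c("poisson", "negBin")   # scoring methods to compare

# Run the package calls only when precmed is installed
if (requireNamespace("precmed", quietly = TRUE)) {
  library(precmed)

  # Step 1: cross-validate the candidate scoring methods
  cv_out <- catecv(response = "count", data = countExample,
                   score.method = methods,
                   cate.model = cate_model, ps.model = ps_model,
                   higher.y = FALSE,   # assumption: lower counts are better
                   cv.n = 5, seed = 999, verbose = 0)

  # Step 2: compare methods (wrap the plot calls in print() when
  # running non-interactively)
  plot(cv_out)        # validation curves, training vs. validation
  print(abc(cv_out))  # area between curves per method
  boxplot(cv_out)     # ATEs across CATE score percentiles

  # Step 3: fit the selected approach on the full data
  fit_out <- catefit(response = "count", data = countExample,
                     score.method = "poisson",
                     cate.model = cate_model, ps.model = ps_model,
                     higher.y = FALSE, seed = 999)

  # Step 4 (optional): doubly robust ATE with bootstrap variability
  ate_out <- atefit(response = "count", data = countExample,
                    cate.model = cate_model, ps.model = ps_model,
                    n.boot = 50, seed = 999, verbose = 0)
  print(ate_out)
}
```

The choice of "poisson" in step 3 is only a placeholder; in practice it would be whichever method performs best in step 2.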

In the vignettes, we adopt a different workflow that introduces the methods gradually, from simple to more complex.

User input

When applying catefit() or catecv(), the user must supply at least the following inputs:

  • response: type of outcome/response (count, survival, or continuous)
  • data: a data frame with individual patient data
  • score.method: methods to estimate the CATE (e.g., boosting, poisson, twoReg, contrastReg)
  • cate.model: a formula describing the outcome model (e.g., outcome ~ age + gender + previous_treatment)
  • ps.model: a formula describing the propensity score model to adjust for confounding (e.g., treatment ~ age + previous_treatment)
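
As a minimal sketch of how these inputs fit together, the base-R snippet below builds a toy data frame and the two formulas quoted above, then checks that every variable in the formulas is a column of the data. The data frame patients and its values are hypothetical, and coding the treatment as 0/1 is an assumption.

```r
# Hypothetical toy data; column names mirror the example formulas above
set.seed(42)
n <- 100
patients <- data.frame(
  outcome            = rpois(n, lambda = 2),     # count outcome
  treatment          = rbinom(n, 1, 0.5),        # assumed 0/1 coding
  age                = round(rnorm(n, mean = 45, sd = 10)),
  gender             = factor(sample(c("female", "male"), n, replace = TRUE)),
  previous_treatment = factor(sample(c("drugA", "drugB"), n, replace = TRUE))
)

# Outcome model and propensity score model
cate_model <- outcome ~ age + gender + previous_treatment
ps_model   <- treatment ~ age + previous_treatment

# Every variable used in either formula should be a column of the data
needed       <- union(all.vars(cate_model), all.vars(ps_model))
missing_vars <- setdiff(needed, names(patients))
stopifnot(length(missing_vars) == 0)

# The left-hand sides identify the outcome and the treatment
outcome_var   <- all.vars(cate_model)[1]
treatment_var <- all.vars(ps_model)[1]

# Collect the minimal inputs in one named list
inputs <- list(
  response     = "count",
  data         = patients,
  score.method = c("poisson", "contrastReg"),
  cate.model   = cate_model,
  ps.model     = ps_model
)
str(inputs, max.level = 1)
```

Checking the formulas against the data up front makes misspelled column names fail early, before any model fitting starts.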

Vignettes

  1. Examples with count outcome of the entire workflow
  2. Examples with survival outcome of the entire workflow
  3. Additional examples for the precmed package
  4. Theoretical details

References

Yadlowsky, Steve, Fabio Pellegrini, Federica Lionetto, Stefan Braune, and Lu Tian. 2020. “Estimation and Validation of Ratio-Based Conditional Average Treatment Effects Using Observational Data.” Journal of the American Statistical Association, 1–18.

Zhao, Lihui, Lu Tian, Tianxi Cai, Brian Claggett, and Lee-Jen Wei. 2013. “Effectively Selecting a Target Population for a Future Comparative Study.” Journal of the American Statistical Association 108 (502): 527–39. https://doi.org/10.1080/01621459.2013.770705.

Monthly Downloads: 206

Version: 1.1.0

License: Apache License (== 2.0)

Maintainer: Thomas Debray

Last Published: October 5th, 2024

Functions in precmed (1.1.0)

catecv
Cross-validation of the conditional average treatment effect (CATE) score for count, survival or continuous outcomes

catecvcount
Cross-validation of the conditional average treatment effect (CATE) score for count outcomes

boxplot.precmed
A set of box plots of estimated ATEs from the "precmed" object

catefit
Estimation of the conditional average treatment effect (CATE) score for count, survival and continuous data

catecvsurv
Cross-validation of the conditional average treatment effect (CATE) score for survival outcomes

balancemean.split
Split the given dataset into balanced training and validation sets (within a pre-specified tolerance). Balanced means that (1) the ratio of treated and controls is maintained in the training and validation sets and (2) the covariate distributions are balanced between the training and validation sets.

catefitcount
Estimation of the conditional average treatment effect (CATE) score for count data

catecvmean
Cross-validation of the conditional average treatment effect (CATE) score for continuous outcomes

catefitmean
Estimation of the conditional average treatment effect (CATE) score for continuous data

balancesurv.split
Split the given time-to-event dataset into balanced training and validation sets (within a pre-specified tolerance). Balanced means that (1) the ratio of treated and controls is maintained in the training and validation sets and (2) the covariate distributions are balanced between the training and validation sets.

data.preproc
Data preprocessing. Applied at the beginning of pmcount() and cvcount(), after arg.checks().

cox.rmst
Estimate restricted mean survival time (RMST) based on Cox regression model

drsurv
Doubly robust estimator of the average treatment effect with Cox model for survival data

data.preproc.surv
Data preprocessing. Applied at the beginning of catefitcount(), catecvcount(), catefitsurv(), and catecvsurv(), after arg.checks().

drmean
Doubly robust estimator of the average treatment effect for continuous data

countExample
Simulated data with count outcome

data.preproc.mean
Data preprocessing. Applied at the beginning of catefitmean() and catecvmean(), after arg.checks().

drcount
Doubly robust estimator of the average treatment effect for count data

estcount.bilevel.subgroups
Estimate the ATE of the log risk ratio in multiple bi-level subgroups defined by the proportions

catefitsurv
Estimation of the conditional average treatment effect (CATE) score for survival data

estcount.multilevel.subgroup
Estimate the ATE of the log risk ratio in one multilevel subgroup defined by the proportions

estmean.multilevel.subgroup
Estimate the ATE of the mean difference in one multilevel subgroup defined by the proportions

intxcount
Estimate the CATE model using specified scoring methods

estsurv.bilevel.subgroups
Estimate the ATE of the RMTL ratio and unadjusted hazard ratio in multiple bi-level subgroups defined by the proportions

estsurv.multilevel.subgroups
Estimate the ATE of the RMTL ratio and unadjusted hazard ratio in one multilevel subgroup defined by the proportions

intxmean
Estimate the CATE model using specified scoring methods

estmean.bilevel.subgroups
Estimate the ATE of the mean difference in multiple bi-level subgroups defined by the proportions

glm.ps
Propensity score estimation with LASSO

generate_kfold_indices
Generate K-fold indices for cross-validation

glm.simplereg.ps
Propensity score estimation with a linear model

onearmglmcount.dr
Doubly robust estimators of the coefficients in the two regressions

meanCatch
Catch errors and warnings when estimating the ATEs in the nested subgroup for continuous data

intxsurv
Estimate the CATE model using specified scoring methods for survival outcomes

meanExample
Simulated data with a continuous outcome

print.atefit
Print function for atefit

plot.atefit
Histogram of bootstrap estimates

onearmglmmean.dr
Doubly robust estimators of the coefficients in the two regressions

plot.precmed
Two side-by-side line plots of validation curves from the "precmed" object

ipcw.surv
Probability of being censored

onearmsurv.dr
Doubly robust estimators of the coefficients in the two regressions

twoarmsurv.dr
Doubly robust estimators of the coefficients in the contrast regression as well as their covariance matrix and convergence information

twoarmglmcount.dr
Doubly robust estimators of the coefficients in the contrast regression as well as their covariance matrix and convergence information

scoremean
Calculate the CATE score given the baseline covariates for the specified scoring methods

print.catefit
Print function for catefit

scorecount
Calculate the log CATE score given the baseline covariates and follow-up time for the specified scoring methods

twoarmglmmean.dr
Doubly robust estimators of the coefficients in the contrast regression as well as their covariance matrix

scoresurv
Calculate the log CATE score given the baseline covariates and follow-up time for the specified scoring methods for survival outcomes

survivalExample
Simulated data with survival outcome

survCatch
Catch errors and warnings when estimating the ATEs in the nested subgroup

arg.checks
Check arguments, catering to all types of outcome. Applied at the beginning of pmcount(), cvcount(), drcount.inference(), catefitsurv(), catecvsurv(), and drsurv.inference().

balance.split
Split the given dataset into balanced training and validation sets (within a pre-specified tolerance). Balanced means that (1) the ratio of treated and controls is maintained in the training and validation sets and (2) the covariate distributions are balanced between the training and validation sets.

atefitmean
Doubly robust estimator of and inference for the average treatment effect for continuous data

auc
Compute the area under the curve using linear or natural spline interpolation

atefitcount
Doubly robust estimator of and inference for the average treatment effect for count data

atefitsurv
Doubly robust estimator of and inference for the average treatment effect for survival data

abc
Compute the area between curves from the "precmed" object

atefit
Doubly robust estimator of and inference for the average treatment effect for count, survival and continuous data

arg.checks.common
Check arguments that are common to all types of outcome. Used inside arg.checks().

abc.precmed
Compute the area between curves from the "precmed" object