PUMP package

Last updated: January 2025.

Authors:

  • Zarni Htet
  • Kristen Hunter
  • Luke Miratrix
  • Kristin Porter

Documentation

https://mdrcny.github.io/PUMP/

Using pkgdown.

Description

For randomized controlled trials (RCTs) with a single intervention being measured on multiple outcomes, researchers often apply a multiple testing procedure (such as Bonferroni or Benjamini-Hochberg) to adjust $p$-values. Such an adjustment reduces the likelihood of spurious findings, but also changes the statistical power, sometimes substantially, which reduces the probability of detecting effects when they do exist. However, this consideration is frequently ignored in typical power analyses, as existing tools do not easily accommodate the use of multiple testing procedures.
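To make the adjustment concrete, here is a minimal sketch using base R's p.adjust on five made-up raw p-values. The values are purely illustrative and do not come from any study. It compares how Bonferroni, Holm, and Benjamini-Hochberg change the number of outcomes declared significant at the 0.05 level.

```r
# Hypothetical raw p-values for five outcomes (illustrative only)
raw_p <- c(0.004, 0.012, 0.025, 0.045, 0.200)

# Adjust with three common procedures available in base R (stats::p.adjust)
adj <- data.frame(
  raw        = raw_p,
  bonferroni = p.adjust(raw_p, method = "bonferroni"),
  holm       = p.adjust(raw_p, method = "holm"),
  BH         = p.adjust(raw_p, method = "BH")
)
print(adj)

# Number of outcomes declared significant at alpha = 0.05 under each procedure
n_sig <- colSums(adj < 0.05)
print(n_sig)
```

With these particular values, four outcomes are significant before adjustment, one under Bonferroni, two under Holm, and three under Benjamini-Hochberg. The more stringent the procedure, the fewer effects are detected.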

We introduce the PUMP R package as a tool for analysts to estimate statistical power, minimum detectable effect size, and sample size requirements for multi-level RCTs with multiple outcomes. Multiple outcomes are accounted for in two ways. First, power estimates from PUMP properly account for the adjustment in $p$-values from applying a multiple testing procedure. Second, as researchers change their focus from one outcome to multiple outcomes, different definitions of statistical power emerge.

PUMP allows researchers to consider a variety of definitions of power, as some may be more appropriate for the goals of their study. The package estimates power for frequentist multi-level mixed effects models, and supports a variety of commonly-used RCT designs and models and multiple testing procedures. In addition to the main functionality of estimating power, minimum detectable effect size, and sample size requirements, the package allows the user to easily explore sensitivity of these quantities to changes in underlying assumptions.
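The idea that several definitions of power coexist can be illustrated with a toy Monte Carlo simulation in base R. This is only a sketch under simplifying assumptions (normally distributed test statistics with a common mean and an exchangeable correlation); it is not how PUMP estimates power, and PUMP's own definitions may differ in detail. The labels indiv, min1, and complete are illustrative names chosen for this example.

```r
set.seed(1)

n_sim <- 5000   # number of simulated studies (illustrative)
M     <- 3      # number of outcomes
delta <- 2.5    # assumed mean of each z-statistic under the alternative
rho   <- 0.4    # assumed correlation between test statistics
alpha <- 0.05   # significance level

# Correlated z-statistics built from a shared factor, so cor(z_j, z_k) = rho
shared <- rnorm(n_sim)
noise  <- matrix(rnorm(n_sim * M), nrow = n_sim, ncol = M)
z <- delta + sqrt(rho) * shared + sqrt(1 - rho) * noise

# Two-sided raw p-values, then Holm adjustment within each simulated study
raw_p  <- 2 * pnorm(-abs(z))
adj_p  <- t(apply(raw_p, 1, p.adjust, method = "holm"))
reject <- adj_p < alpha

power <- c(
  indiv    = colMeans(reject),            # power for each outcome separately
  min1     = mean(rowSums(reject) >= 1),  # at least one outcome rejected
  complete = mean(rowSums(reject) == M)   # all outcomes rejected
)
print(round(power, 3))
```

By construction, the probability of rejecting at least one outcome is never below any individual power, and the probability of rejecting all outcomes is never above it, so the choice of definition can change the apparent adequacy of a design.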

Please see the vignettes for examples of how to use this package.

Reference and support materials

The following give several tools and resources for using this package most effectively:

The hot-off-the-press version

Our package is on CRAN, but you can install the latest version on GitHub via:

devtools::install_github("https://github.com/MDRCNY/PUMP")

The latest version has some bug fixes and extra features, and we strongly recommend using it over the CRAN version.

A small illustration

We provide below one example of using PUMP to calculate a minimum detectable effect size (MDES). The user specifies the RCT design and model (d_m), the multiple testing procedure (MTP, in this case Holm), the target power (0.8), and the type of power desired (individual power for outcome 1). The user also specifies a variety of design and model parameters, such as the number of outcomes, sample sizes at different levels, variation explained by covariates, etc.

m <- pump_mdes(
  d_m = "d3.2_m3fc2rc",         # choice of design and analysis strategy
  MTP = "HO",                   # multiple testing procedure
  target.power = 0.80,          # desired power
  power.definition = "D1indiv", # power type
  M = 5,                        # number of outcomes
  J = 3,                        # number of schools per block
  K = 21,                       # number of districts
  nbar = 258,                   # average number of students per school
  Tbar = 0.50,                  # proportion treated
  alpha = 0.05,                 # significance level
  numCovar.1 = 5,               # number of covariates at level 1
  numCovar.2 = 3,               # number of covariates at level 2
  R2.1 = 0.1, R2.2 = 0.7,       # explanatory power of covariates for each level
  ICC.2 = 0.05, ICC.3 = 0.4,    # intraclass correlation coefficients
  rho = 0.4 )                   # how correlated outcomes are
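As a complement, the same design can be passed to pump_power, which is listed among the package's core functions, to estimate power at an assumed effect size. The sketch below assumes that pump_power accepts the same design arguments as pump_mdes plus an MDES vector with one entry per outcome. The effect size of 0.10 is an arbitrary value chosen for illustration, and the call is guarded so that it runs only if PUMP is installed. Consult ?pump_power for the authoritative argument list.

```r
# Design and model parameters copied from the pump_mdes example above
design_args <- list(
  d_m = "d3.2_m3fc2rc", MTP = "HO",
  M = 5, J = 3, K = 21, nbar = 258, Tbar = 0.50, alpha = 0.05,
  numCovar.1 = 5, numCovar.2 = 3,
  R2.1 = 0.1, R2.2 = 0.7, ICC.2 = 0.05, ICC.3 = 0.4, rho = 0.4
)

# Assumed effect size of 0.10 on every outcome (hypothetical value)
assumed_mdes <- rep(0.10, design_args$M)

# Run only when the package is available; the MDES argument is an assumption
if (requireNamespace("PUMP", quietly = TRUE)) {
  p <- do.call(PUMP::pump_power, c(design_args, list(MDES = assumed_mdes)))
  print(p)
}
```

Keeping the shared parameters in one list makes it easy to reuse the same design across calls when checking how power and MDES relate.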

Install

install.packages('PUMP')

Monthly downloads: 458

Version: 1.0.4

License: GPL-3

Maintainer: Luke Miratrix

Last published: March 12th, 2025

Functions in PUMP (1.0.4)

  • gen_base_sim_data: Generate base simulated multi-level data (simulation function)
  • gen_Yobs: Generate observed outcomes (simulation function)
  • get_power_results: Calculates different definitions of power (support function)
  • get_pval_tstat: Extracts p-value and t statistic from a given fitted model
  • gen_corr_matrix: Generate correlation matrix (simulation function)
  • gen_sim_data: Generate simulated multi-level data (simulation function)
  • plot.pumpgridresult: Plot a pumpgridresult object (result function)
  • gen_cluster_ids: Generates school and district assignments (simulation function)
  • makelist_samp: Convert multi-outcome data structure to dataframe for each outcome
  • gen_cov_matrix: Generate covariance matrix between two variables
  • get_adjp_minp: Helper function for Westfall Young
  • get_rawpt: Function: get_rawpt
  • plot.pumpgridresult.mdes: Plot a grid pump mdes object
  • plot.pumpgridresult.power: Plot a pump grid power object
  • interacted_linear_estimators: Interacted linear regression models
  • make_model: Function: make.model
  • print_search: Print the search history of a pump result object (result function)
  • parse_power_definition: Parse the power definition
  • pump_power: Estimate power across definitions (core function)
  • pump_info: Provides details about supported package features (core function)
  • pump_power_exact: Calculate power theoretically for M=1 situations
  • parse_d_m: Return characteristics of a given context/d_m code (support function)
  • plot_power_curve: Examine a power curve (result function)
  • plot_power_search: Examine search path of a power search (result function)
  • optimize_power: Optimizes power to help in search for MDES or SS
  • pump_power_grid: Run pump_power on varying values of parameters (grid function)
  • power_curve: Obtain a power curve for a range of sample size or MDES values
  • plot.pumpgridresult.sample: Plot a grid pump sample object
  • pump_sample: Estimate the required sample size (core function)
  • pump_sample_grid: Run pump_sample on varying values of parameters (grid function)
  • print_context: Print context (design, model, parameter values) of pumpresult or pumpgridresult (result function)
  • pump_sample_raw: Calculating Needed Sample Size for Raw (Unadjusted) Power
  • pumpresult: pumpresult object for results of power calculations
  • plot.pumpresult: Plot a pumpresult object (result function)
  • pumpgridresult: Result object for results of grid power calculations
  • pump_mdes: Estimate the minimum detectable effect size (MDES) (core function)
  • pump_mdes_grid: Run pump_mdes on varying values of parameters (grid function)
  • run_grid: Run grid across any of the core pump functions
  • validate_d_m: Validate d_m string
  • update_grid: Update a single pump call to a grid call (grid function)
  • validate_inputs: Validates user inputs
  • transpose_power_table: Convert power table from wide to long (result function)
  • update.pumpgridresult: Update a pump grid call, tweaking some parameters (core function)
  • update.pumpresult: Update a pump call, tweaking some parameters (core function)
  • strip_SEs: Remove SE and df columns from (wide) power table
  • setup_default_parallel_plan: Set up parallel processing
  • PUMP-package: PUMP: Power Under Multiplicity Project
  • calc_df: Calculate degrees of freedom (support function)
  • calc_K: Calculates K, the number of districts
  • calc_J: This function calculates needed J to achieve a given (unadjusted) power
  • calc_nbar: This function calculates needed nbar to achieve a given power
  • calc_pval: Calculates p-values from t-values
  • adjp_wyss: Westfall-Young Single Step Adjustment Function
  • calc_SE: Computes Q_m, the standard error of the effect size estimate
  • calc_MT: Calculate multiplier for MDE calculation
  • adjp_wysd: Westfall Young Step Down Function
  • check_cor: Check correlation of test statistics (simulation function)
  • convert_params: Converts model params into DGP params (simulation function)
  • comp_rawp_ss: Helper function for Westfall Young Single Step
  • find_best: Determine next point to check for correct power level
  • fit_bounded_logistic: Fit a bounded logistic curve
  • gen_RE_cov_matrix: Generate a parameterized covariance matrix from the provided 3 blocks
  • comp_rawp_sd: Helper function for Westfall Young Step Down
  • estimate_power_curve: Calculate a power curve for sample size or MDES
  • gen_T.x: Generate treatment assignment vector (simulation function)