rsample

Overview

The rsample package provides functions to create different types of resamples and corresponding classes for their analysis. The goal is to have a modular set of methods that can be used for:

  • resampling for estimating the sampling distribution of a statistic
  • estimating model performance using a holdout set

The scope of rsample is to provide the basic building blocks for creating and analyzing resamples of a data set, but this package does not include code for modeling or calculating statistics. The Working with Resample Sets vignette gives a demonstration of how rsample tools can be used when building models.
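
As a rough illustration of the two uses listed above, the sketch below bootstraps the mean of one column and then makes a single train/test split. The built-in mtcars data, the mean of mpg, and the lm() model are stand-ins chosen for this example, not anything prescribed by rsample. The statistic and the model come from base R, since rsample only supplies the resamples.

```r
library(rsample)

set.seed(123)

# Use 1: approximate the sampling distribution of a statistic.
# mtcars and mean(mpg) are illustrative choices, not part of rsample.
boots <- bootstraps(mtcars, times = 200)

# analysis() returns the resampled rows of one split as a data frame.
boot_means <- vapply(
  boots$splits,
  function(split) mean(analysis(split)$mpg),
  numeric(1)
)

# A simple percentile summary of the bootstrap distribution.
quantile(boot_means, probs = c(0.025, 0.975))

# Use 2: estimate model performance on a holdout set.
split <- initial_split(mtcars, prop = 0.75)
train <- training(split)
test  <- testing(split)

# The model is fit with base R; rsample only provides the data split.
fit  <- lm(mpg ~ wt, data = train)
rmse <- sqrt(mean((test$mpg - predict(fit, newdata = test))^2))
rmse
```

The same pattern should carry over to other resampling functions such as vfold_cv(), where analysis() and assessment() give the two parts of each split.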

Note that resampled data sets created by rsample are directly accessible in a resampling object but add little memory overhead. Because the original data is never modified, R does not automatically make a copy of it for each resample.

For example, creating 50 bootstraps of a data set does not create an object that is 50-fold larger in memory:

library(rsample)
library(mlbench)

data(LetterRecognition)
lobstr::obj_size(LetterRecognition)
#> 2,644,640 B

set.seed(35222)
boots <- bootstraps(LetterRecognition, times = 50)
lobstr::obj_size(boots)
#> 6,686,776 B

# Object size per resample
lobstr::obj_size(boots)/nrow(boots)
#> 133,735.5 B

# Fold increase is <<< 50
as.numeric(lobstr::obj_size(boots)/lobstr::obj_size(LetterRecognition))
#> [1] 2.528426

Created on 2022-02-28 by the reprex package (v2.0.1)

The 50 bootstrap samples together use less than 3 times the memory of the original data set.
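
One way to get a feel for this is to look at a single split. The sketch below uses mtcars as a stand-in data set (an assumption for illustration, not the LetterRecognition data from the example above) and shows that a resampled data frame is only built when analysis() or assessment() is called on a split.

```r
library(rsample)

set.seed(35222)
boots <- bootstraps(mtcars, times = 5)
first_split <- boots$splits[[1]]

# Printing a split shows its <Analysis/Assess/Total> row counts.
first_split

# Data frames are built on demand from the stored split.
in_bag  <- analysis(first_split)    # rows sampled with replacement
out_bag <- assessment(first_split)  # rows not selected by this resample

nrow(in_bag)   # same number of rows as mtcars
nrow(out_bag)  # varies from resample to resample
```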

Installation

To install the released version from CRAN, use:

install.packages("rsample")

To install the development version from GitHub, use:

# install.packages("pak")
pak::pak("tidymodels/rsample")

Contributing

This project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Monthly Downloads: 55,286

Version: 1.3.0

License: MIT + file LICENSE

Maintainer: Hannah Frick

Last Published: April 2nd, 2025

Functions in rsample (1.3.0)

  • new_rset: Constructor for new rset objects
  • inner_split: Inner split of the analysis set for fitting a post-processor
  • initial_validation_split: Create an Initial Train/Validation/Test Split
  • labels.rset: Find Labels from rset Object
  • int_pctl: Bootstrap confidence intervals
  • manual_rset: Manual resampling
  • loo_cv: Leave-One-Out Cross-Validation
  • make_strata: Create or Modify Stratification Variables
  • labels.rsplit: Find Labels from rsplit Object
  • reverse_splits: Reverse the analysis and assessment sets
  • nested_cv: Nested or Double Resampling
  • mc_cv: Monte Carlo Cross-Validation
  • rolling_origin: Rolling Origin Forecast Resampling
  • group_vfold_cv: Group V-Fold Cross-Validation
  • populate: Add Assessment Indices
  • initial_split: Simple Training/Test Set Splitting
  • permutations: Permutation sampling
  • reexports: Objects exported from other packages
  • slide-resampling: Time-based Resampling
  • group_bootstraps: Group Bootstraps
  • group_mc_cv: Group Monte Carlo Cross-Validation
  • tidy.rsplit: Tidy Resampling Object
  • rsample-dplyr: Compatibility with dplyr
  • rsample-package: rsample: General Resampling Infrastructure
  • vfold_cv: V-Fold Cross-Validation
  • reshuffle_rset: "Reshuffle" an rset to re-generate a new rset with the same parameters
  • reg_intervals: A convenience function for confidence intervals with linear-ish parametric models
  • rsample2caret: Convert Resampling Objects to Other Formats
  • rset_reconstruct: Extending rsample with new rset subclasses
  • make_splits: Constructors for split objects
  • validation_set: Create a Validation Split for Tuning
  • make_groups: Make groupings for grouped rsplits
  • validation_split: Create a Validation Set
  • apparent: Sampling for the Apparent Error Rate
  • .get_fingerprint: Obtain an identifier for the resamples
  • as.data.frame.rsplit: Convert an rsplit object to a data frame
  • complement: Determine the Assessment Samples
  • form_pred: Extract Predictor Names from Formula or Terms
  • bootstraps: Bootstrap Sampling
  • add_resample_id: Augment a data set with resampling identifiers
  • clustering_cv: Cluster Cross-Validation
  • get_rsplit: Retrieve individual rsplit objects from an rset
  • .get_split_args: Get the split arguments from an rset