mlr3 (version 0.23.0)

mlr_resamplings_custom_cv: Custom Cross-Validation

Description

Splits data into training and test sets in a cross-validation fashion based on a user-provided categorical vector. This vector can be passed during instantiation either via an arbitrary factor f with the same length as task$nrow, or via a single string col referring to a column in the task.

An alternative but equivalent approach using leave-one-out resampling is showcased in the examples of mlr_resamplings_loo.
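The col variant described above can be sketched as follows. This is a minimal illustration, not taken from the package's own examples: it assumes the island column of the penguins task is a suitable grouping variable, so that each island becomes one fold.

```r
library(mlr3)

# Use the full penguins task; "island" is one of its columns
task = tsk("penguins")

# Instantiate with a column name instead of an external factor
custom_cv = rsmp("custom_cv")
custom_cv$instantiate(task, col = "island")

# One fold per distinct island
custom_cv$iters

# All rows in the test set of fold 1 should share a single island
test_islands = task$data(rows = custom_cv$test_set(1), cols = "island")$island
unique(test_islands)
```

Grouping by a column this way keeps all observations from one group in the same test set, which is the usual reason to prefer custom folds over random ones.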

Dictionary

This Resampling can be instantiated via the dictionary mlr_resamplings or with the associated sugar function rsmp():

mlr_resamplings$get("custom_cv")
rsmp("custom_cv")

Super class

mlr3::Resampling -> ResamplingCustomCV

Active bindings

iters

(integer(1))
Returns the number of resampling iterations, depending on the values stored in the param_set.

Methods

Method new()

Creates a new instance of this R6 class.

Usage

ResamplingCustomCV$new()


Method instantiate()

Instantiate this Resampling as cross-validation with custom splits.

Usage

ResamplingCustomCV$instantiate(task, f = NULL, col = NULL)

Arguments

task

Task
Used to extract row ids.

f

(factor() | character())
Vector of type factor or character with the same length as task$nrow. Row ids are split on this vector; each distinct value results in one fold. Empty factor levels are dropped, and row ids corresponding to missing values are removed, cf. split().

col

(character(1))
Name of the task column to use for splitting. This is an alternative to passing the grouping vector via parameter f; the two parameters are mutually exclusive.
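The handling of empty levels and missing values described for f can be reproduced with base R's split(). The sketch below uses hypothetical row ids and a hypothetical grouping vector; it illustrates the documented behavior and is not a copy of the package internals.

```r
# Hypothetical row ids and grouping vector, for illustration only.
# Level "d" is declared but never used; the last row has no group (NA).
row_ids = 1:10
f = factor(c(rep(letters[1:3], each = 3), NA), levels = letters[1:4])

# split() omits row ids whose group is NA; drop = TRUE discards the empty level "d"
folds = split(row_ids, f, drop = TRUE)
names(folds)  # folds a, b, c; row 10 appears in none of them

# Each group serves as the test set of one fold;
# the remaining grouped rows form the matching training set
test_sets = folds
train_sets = lapply(seq_along(folds), function(i) {
  unlist(folds[-i], use.names = FALSE)
})
```

Note that a row with a missing group value ends up in neither the training nor the test set of any fold.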


Method clone()

The objects of this class are cloneable with this method.

Usage

ResamplingCustomCV$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

See Also

Other Resampling: Resampling, mlr_resamplings, mlr_resamplings_bootstrap, mlr_resamplings_custom, mlr_resamplings_cv, mlr_resamplings_holdout, mlr_resamplings_insample, mlr_resamplings_loo, mlr_resamplings_repeated_cv, mlr_resamplings_subsampling

Examples

# Create a task with 10 observations
task = tsk("penguins")
task$filter(1:10)

# Instantiate Resampling:
custom_cv = rsmp("custom_cv")
f = factor(c(rep(letters[1:3], each = 3), NA))
custom_cv$instantiate(task, f = f)
custom_cv$iters # 3 folds

# Individual sets:
custom_cv$train_set(1)
custom_cv$test_set(1)

# Disjoint sets (the intersection is empty):
intersect(custom_cv$train_set(1), custom_cv$test_set(1))