
utiml (version 0.1.4)

create_kfold_partition: Create the k-folds partition based on the specified algorithm

Description

This method creates a kFoldPartition object, from which the dataset partitions for training, testing and, optionally, validation can be created.

Usage

create_kfold_partition(mdata, k = 10, method = c("random", "iterative",
  "stratified"))

Arguments

mdata

An mldr dataset.

k

The desired number of folds. (Default: 10)

method

The method to split the data. The default methods are:

random

Split the folds randomly.

iterative

Split the folds considering the proportion of each label individually. Some specific labels may not occur in all folds.

stratified

Split the folds considering the labelset proportions.

You can also create your own partition method. See the note and example sections for more details. (Default: "random")
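To make the idea of labelset proportions concrete, here is a minimal base R sketch of a labelset-stratified split. It is not utiml's implementation: `stratified_folds` is a hypothetical helper, and the labelsets are assumed to be given as one string per instance. Instances sharing a labelset are dealt to the folds in round-robin order, so each fold receives a similar share of every labelset.

```r
# Hypothetical helper, not part of utiml: spread each labelset evenly over k folds.
stratified_folds <- function(labelsets, k) {
  fold_of <- integer(length(labelsets))
  offset <- 0
  for (idx in split(seq_along(labelsets), labelsets)) {
    # Deal the instances of this labelset to the folds in round-robin order,
    # continuing from the fold where the previous labelset stopped.
    fold_of[idx] <- (offset + seq_along(idx) - 1) %% k + 1
    offset <- offset + length(idx)
  }
  # Return one vector of instance indexes per fold.
  split(seq_along(labelsets), factor(fold_of, levels = seq_len(k)))
}

# Toy data (assumption): 12 instances with three distinct labelsets.
labelsets <- rep(c("10", "01", "11"), times = c(6, 3, 3))
folds <- stratified_folds(labelsets, k = 3)

# Count how often each labelset appears in each fold.
sapply(folds, function(i) table(factor(labelsets[i], levels = c("01", "10", "11"))))
```

In this toy case every fold ends up with four instances and the same labelset mix. Real datasets with rare labelsets will not divide this evenly.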

Value

An object of type kFoldPartition.

References

Sechidis, K., Tsoumakas, G., & Vlahavas, I. (2011). On the stratification of multi-label data. In Proceedings of the Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD (pp. 145-158).

See Also

How to create the datasets from folds

Other sampling: create_holdout_partition, create_random_subset, create_subset

Examples

# NOT RUN {
k10 <- create_kfold_partition(toyml, 10)
k5 <- create_kfold_partition(toyml, 5, "stratified")

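# Custom method: split the instances in their original order.
# r holds the proportion of instances for each fold.
sequencial_split <- function (mdata, r) {
 amount <- trunc(r * mdata$measures$num.instances)
 indexes <- c(0, cumsum(amount))
 # The last fold takes the instances left over by truncation
 indexes[length(r)+1] <- mdata$measures$num.instances

 lapply(seq_along(r), function (i) {
   seq(indexes[i]+1, indexes[i+1])
 })
}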
k3 <- create_kfold_partition(toyml, 3, "sequencial_split")
# }
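
The custom method in the example takes `mdata` and a vector `r`, and returns a list with one vector of instance indexes per fold. The sketch below follows the same contract but shuffles the instances first. It runs without utiml: `shuffled_split` and the `mock` object are hypothetical, and it assumes that `r` holds one proportion per fold (for example `rep(1/k, k)`) and that every fold receives at least one instance.

```r
# Hypothetical custom method: shuffle the instance indexes, then cut them
# into length(r) folds whose sizes follow the proportions in r.
shuffled_split <- function(mdata, r) {
  n <- mdata$measures$num.instances
  # Fold boundaries; the last fold absorbs any rounding remainder.
  bounds <- c(0, trunc(cumsum(r) * n))
  bounds[length(bounds)] <- n
  shuffled <- sample(n)
  lapply(seq_along(r), function(i) shuffled[(bounds[i] + 1):bounds[i + 1]])
}

# Stand-in for an mldr object (assumption: only num.instances is needed here).
mock <- list(measures = list(num.instances = 10))
set.seed(1)
folds <- shuffled_split(mock, rep(1 / 3, 3))

# Fold sizes, and a check that every instance is used exactly once.
lengths(folds)
all(sort(unlist(folds)) == 1:10)
```

With utiml loaded, the function name could presumably be passed as the `method` argument in the same way as `"sequencial_split"` in the example.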
