Partitions the data set into folds. Stratification, if requested, is done by
the best algorithm for each instance, i.e. the one with the best performance on
that instance, so that the distribution of best algorithms is approximately the
same in every fold. The folds are
assembled into training and test sets by combining $n-1$ folds for training and
using the remaining fold for testing, where $n$ is the number of folds. The
sets of indices are added to the original data set, which is then returned.
If the data set has train and test partitions already, they are overwritten.
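
The procedure described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual implementation: the function name `cv_folds`, its parameters, and the dict-based data set are all hypothetical, and the round-robin dealing is only one of several ways to keep the distribution of best algorithms similar across folds.

```python
import random
from collections import defaultdict

def cv_folds(best, nfolds=10, stratify=False, seed=0):
    """Return (train, test), each a list of nfolds lists of instance indices.

    best[i] is the label of the best-performing algorithm on instance i
    (hypothetical input; it only influences the result if stratify is True).
    """
    rng = random.Random(seed)
    # Group instance indices by best algorithm, or into one single group.
    groups = defaultdict(list)
    for i, label in enumerate(best):
        groups[label if stratify else None].append(i)

    folds = [[] for _ in range(nfolds)]
    pos = 0
    for label in sorted(groups, key=str):
        members = groups[label]
        rng.shuffle(members)
        # Deal indices round-robin, continuing where the previous group
        # stopped, so that overall fold sizes stay balanced.
        for i in members:
            folds[pos % nfolds].append(i)
            pos += 1

    # Fold k is the test set; the other nfolds - 1 folds form the training set.
    test = [sorted(f) for f in folds]
    train = [sorted(i for j, f in enumerate(folds) if j != k for i in f)
             for k in range(nfolds)]
    return train, test

# Hypothetical data set: any existing "train"/"test" entries are overwritten.
data = {"best": ["A"] * 6 + ["B"] * 3, "train": "old", "test": "old"}
data["train"], data["test"] = cv_folds(data["best"], nfolds=3, stratify=True)
```

With stratification enabled, each group of instances sharing a best algorithm is spread over the folds so that its per-fold counts differ by at most one.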