
MachineShop (version 2.8.0)

MLControl: Resampling Controls

Description

Structures to define and control sampling methods for estimating predictive performance of models in the MachineShop package.

Usage

BootControl(samples = 25, ...)

BootOptimismControl(samples = 25, ...)

CVControl(folds = 10, repeats = 1, ...)

CVOptimismControl(folds = 10, repeats = 1, ...)

OOBControl(samples = 25, ...)

SplitControl(prop = 2/3, ...)

TrainControl(...)

MLControl(times = NULL, dist = NULL, method = NULL, seed = sample(.Machine$integer.max, 1), ...)

Arguments

samples

number of bootstrap samples.

...

arguments passed to MLControl.

folds

number of cross-validation folds (K).

repeats

number of repeats of the K-fold partitioning.

prop

proportion of cases to include in the training set (0 < prop < 1).

times, dist, method

arguments passed to predict.

seed

integer to set the seed at the start of resampling.

Value

MLControl class object.

Details

BootControl constructs an MLControl object for simple bootstrap resampling in which models are fit with bootstrap resampled training sets and used to predict the full data set (Efron and Tibshirani 1993).
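
The simple bootstrap described above can be sketched in base R. This is an illustration only, not MachineShop's internal code: the mtcars data, the lm model, the RMSE metric, and the name boot_rmse are assumptions chosen for the example.

```r
# Illustrative sketch in base R; not MachineShop's implementation.
set.seed(123)                 # arbitrary seed so the sketch is reproducible
data <- mtcars                # built-in data set, chosen for illustration
n <- nrow(data)
samples <- 25                 # mirrors the BootControl() default

boot_rmse <- vapply(seq_len(samples), function(i) {
  idx <- sample.int(n, n, replace = TRUE)        # bootstrap resample of rows
  fit <- lm(mpg ~ wt + hp, data = data[idx, ])   # fit on the resampled set
  pred <- predict(fit, newdata = data)           # predict the full data set
  sqrt(mean((data$mpg - pred)^2))                # RMSE on the full data set
}, numeric(1))

mean(boot_rmse)               # average performance over the bootstrap samples
```

Each bootstrap model is scored on the full data set, which includes the cases it was trained on, so the estimate tends to be optimistic.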

BootOptimismControl constructs an MLControl object for optimism-corrected bootstrap resampling (Efron and Gong 1983, Harrell et al. 1996).
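
One common formulation of the optimism correction, in the spirit of Harrell et al. (1996), can be sketched in base R as follows. This is an assumption-laden illustration: the exact calculation used by MachineShop may differ, and the data, model, error metric, and variable names are hypothetical.

```r
# Illustrative sketch in base R; not MachineShop's implementation.
set.seed(123)
data <- mtcars
n <- nrow(data)
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))

# Apparent error: fit and evaluate on the same full data set
fit_full <- lm(mpg ~ wt + hp, data = data)
apparent <- rmse(data$mpg, predict(fit_full, newdata = data))

# Optimism per bootstrap sample: error on the original data minus
# error on the bootstrap sample the model was fit to
optimism <- vapply(seq_len(100), function(i) {
  boot <- data[sample.int(n, n, replace = TRUE), ]
  fit <- lm(mpg ~ wt + hp, data = boot)
  err_boot <- rmse(boot$mpg, predict(fit, newdata = boot))
  err_orig <- rmse(data$mpg, predict(fit, newdata = data))
  err_orig - err_boot
}, numeric(1))

# For an error metric, the average optimism is added to the apparent error
corrected <- apparent + mean(optimism)
c(apparent = apparent, corrected = corrected)
```

For a metric where larger values are better, such as accuracy, the sign convention flips and the average optimism is subtracted from the apparent performance.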

CVControl constructs an MLControl object for repeated K-fold cross-validation (Kohavi 1995). In this procedure, the full data set is repeatedly partitioned into K folds. Within a partitioning, prediction is performed on each of the K folds with models fit on all remaining folds.
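
The partitioning can be sketched in base R as shown below. The sketch is illustrative rather than a reproduction of MachineShop's code; the data, model, and names such as fold_id and cv_rmse are assumptions.

```r
# Illustrative sketch in base R; not MachineShop's implementation.
set.seed(123)
n <- nrow(mtcars)
folds <- 10
repeats <- 5

cv_rmse <- unlist(lapply(seq_len(repeats), function(r) {
  # New random assignment of cases to folds for each repeat
  fold_id <- sample(rep_len(seq_len(folds), n))
  vapply(seq_len(folds), function(k) {
    test <- fold_id == k                               # held-out fold
    fit <- lm(mpg ~ wt + hp, data = mtcars[!test, ])   # fit on remaining folds
    pred <- predict(fit, newdata = mtcars[test, ])
    sqrt(mean((mtcars$mpg[test] - pred)^2))
  }, numeric(1))
}))

length(cv_rmse)   # one estimate per fold per repeat: folds * repeats
```

Every case is predicted exactly once per repeat, and always by a model that did not see it during fitting.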

CVOptimismControl constructs an MLControl object for optimism-corrected cross-validation resampling (Davison and Hinkley 1997, eq. 6.48).

OOBControl constructs an MLControl object for out-of-bootstrap resampling in which models are fit with bootstrap resampled training sets and used to predict the unsampled cases.
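
A base R sketch of the out-of-bootstrap idea follows. It is an illustration under assumed data, model, and metric choices, not the package's implementation; the name oob_rmse is hypothetical.

```r
# Illustrative sketch in base R; not MachineShop's implementation.
set.seed(123)
n <- nrow(mtcars)
samples <- 25                 # mirrors the OOBControl() default

oob_rmse <- vapply(seq_len(samples), function(i) {
  idx <- sample.int(n, n, replace = TRUE)   # bootstrap training indices
  oob <- setdiff(seq_len(n), idx)           # cases never drawn: the test set
  fit <- lm(mpg ~ wt + hp, data = mtcars[idx, ])
  pred <- predict(fit, newdata = mtcars[oob, ])
  sqrt(mean((mtcars$mpg[oob] - pred)^2))
}, numeric(1))

mean(oob_rmse)
```

On average roughly a third of the cases are left out of each bootstrap sample, so each model is evaluated on data it was not fit to.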

SplitControl constructs an MLControl object for splitting data into separate training and test sets (Hastie et al. 2009).
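
A split-sample evaluation can be sketched in base R as follows. The rounding of the training set size is an assumption, since this page does not state how MachineShop resolves fractional sizes; the data, model, and variable names are also illustrative.

```r
# Illustrative sketch in base R; not MachineShop's implementation.
set.seed(123)
n <- nrow(mtcars)
prop <- 2/3                                  # mirrors the SplitControl() default

# Assumption: training set size is prop * n rounded to the nearest integer
train_idx <- sample.int(n, size = round(prop * n))
train <- mtcars[train_idx, ]
test <- mtcars[-train_idx, ]

fit <- lm(mpg ~ wt + hp, data = train)       # fit on the training set only
pred <- predict(fit, newdata = test)         # evaluate on the held-out test set
split_rmse <- sqrt(mean((test$mpg - pred)^2))
split_rmse
```

Unlike the bootstrap and cross-validation schemes, this yields a single performance estimate from one partition of the data.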

TrainControl constructs an MLControl object for training and performance evaluation to be performed on the same training set (Efron 1986).

The base MLControl constructor initializes a set of control parameters that are common to all resampling methods.
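
The role of the seed parameter can be illustrated with a short base R sketch. It assumes the seed is applied in the manner of set.seed before the resampling indices are drawn; draw_indices is a hypothetical helper and not part of MachineShop.

```r
# Illustrative sketch in base R; draw_indices() is a hypothetical helper.
draw_indices <- function(seed, n = 10) {
  set.seed(seed)                      # fix the random number generator state
  sample.int(n, n, replace = TRUE)    # draw bootstrap-style resampling indices
}

# The same seed reproduces the same resampling indices
identical(draw_indices(42), draw_indices(42))
```

Because the default seed is itself drawn at random with sample(.Machine$integer.max, 1), supplying an explicit integer is what makes results repeatable across separately constructed control objects.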

References

Efron B and Tibshirani RJ (1993). An Introduction to the Bootstrap. Monographs on Statistics and Applied Probability 57. Boca Raton, Florida, USA: Chapman & Hall/CRC.

Efron B and Gong G (1983). A leisurely look at the bootstrap, the jackknife, and cross-validation. The American Statistician, 37 (1): 36-48.

Harrell FE, Lee KL, and Mark DB (1996). Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Statistics in Medicine, 15 (4): 361-387.

Kohavi R (1995). A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection. In Proceedings of the 14th International Joint Conference on Artificial Intelligence - Volume 2, 1137-1143. IJCAI'95. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.

Davison AC and Hinkley DV (1997). Bootstrap Methods and Their Application. New York, NY, USA: Cambridge University Press.

Hastie T, Tibshirani R, and Friedman J (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition. Springer Series in Statistics. New York, NY, USA: Springer.

Efron B (1986). How biased is the apparent error rate of a prediction rule? Journal of the American Statistical Association, 81 (394): 461-470.

See Also

resample, SelectedInput, SelectedModel, TunedInput, TunedModel

Examples

# NOT RUN {
## Bootstrapping with 100 samples
BootControl(samples = 100)

## Optimism-corrected bootstrapping with 100 samples
BootOptimismControl(samples = 100)

## Cross-validation with 5 repeats of 10 folds
CVControl(folds = 10, repeats = 5)

## Optimism-corrected cross-validation with 5 repeats of 10 folds
CVOptimismControl(folds = 10, repeats = 5)

## Out-of-bootstrap validation with 100 samples
OOBControl(samples = 100)

## Split sample validation with 2/3 training and 1/3 testing
SplitControl(prop = 2/3)

## Training set evaluation
TrainControl()

# }
