tune: Parameter tuning of functions using grid search

Description

This generic function tunes hyperparameters of statistical methods using a grid search over supplied parameter ranges.

Usage

tune(method, train.x, train.y = NULL, data = list(), validation.x =
     NULL, validation.y = NULL, ranges, random = FALSE, nrepeat = 1,
     repeat.aggregate = min, sampling = c("cross", "fix", "bootstrap"),
     sampling.aggregate = mean, cross = 10, fix = 2/3, nboot = 10,
     boot.size = 9/10, predict.func = predict, best.model = TRUE,
     performances = TRUE, ...)

Arguments

method

function to be tuned.

train.x

either a formula or a matrix of predictors.

train.y

the response variable if train.x is a predictor matrix. Ignored if train.x is a formula.

data

data, if a formula interface is used. Ignored if the predictor matrix and response are supplied directly.

validation.x

an optional validation set. Depending on whether a formula interface is used or not, the response can be included in validation.x or separately specified using validation.y.

validation.y

if no formula interface is used, the response of the (optional) validation set.

ranges

a named list of parameter vectors spanning the sampling space. The vectors will usually be created by seq.

random

if an integer value is specified, random parameter vectors are drawn from the parameter space.

nrepeat

specifies how often training shall be repeated.

repeat.aggregate

function for aggregating the repeated training results.

sampling

sampling scheme. If sampling = "cross", a cross-times cross validation is performed. If sampling = "boot", nboot training sets of size boot.size (part) are sampled from the supplied data.

sampling.aggregate

function for aggregating the training results on the generated training samples.

cross

number of partitions for cross-validation.

fix

part of the data used for training in fixed sampling.

nboot

number of bootstrap replications.

boot.size

size of the bootstrap samples.

predict.func

optional predict function, if the standard predict behaviour is inadequate.

best.model

if TRUE, the best model is trained and returned (the best parameter set is used for training on the complete training set).

performances

if TRUE, the performance results for all parameter combinations are returned.

...

Further parameters passed to the training functions.
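The following base R sketch illustrates how the ranges and sampling-related arguments above can be read. It does not call tune and is not taken from the package source: the use of expand.grid, the rounding of sample sizes, drawing bootstrap samples with replacement, and the names n, grid, random_grid, train_fix, train_cross and train_boot are all assumptions made for illustration.

```r
# Parameter grid: one row per combination of the vectors in `ranges`
ranges <- list(gamma = 2^(-1:1), cost = 2^(2:4))
grid <- expand.grid(ranges)
nrow(grid)  # 3 * 3 = 9 combinations

set.seed(42)  # reproducible sampling
n <- 150      # number of observations, e.g. nrow(iris)

# random = 4 (assumed reading): keep only 4 randomly chosen combinations
random_grid <- grid[sample(nrow(grid), 4), ]

# sampling = "fix": a single split, with fix = 2/3 of the data used for training
fix <- 2/3
train_fix <- sample(n, size = round(n * fix))

# sampling = "cross": cross = 10 partitions, each one held out once
cross <- 10
fold <- sample(rep(seq_len(cross), length.out = n))
train_cross <- lapply(seq_len(cross), function(i) which(fold != i))

# sampling = "boot": nboot training sets, each of size boot.size * n
# (sampling with replacement is an assumption here)
nboot <- 10
boot.size <- 9/10
train_boot <- replicate(nboot,
                        sample(n, size = round(n * boot.size), replace = TRUE),
                        simplify = FALSE)
```

Each element of train_cross and train_boot is a vector of row indices for one training set; the performance values obtained on these sets would then be combined by sampling.aggregate.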

Value

An object of class tune, including the components:

best.parameters

a 1 x k data frame, where k is the number of parameters.

best.performance

the best achieved performance.

performances

if requested, a data frame of all parameter combinations along with the corresponding performance results.

best.model

if requested, the model trained on the complete training data using the best parameter combination.
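To make the shape of the returned components concrete, here is a hand-rolled grid search in base R that builds a plain list with the same component names. It is a sketch only, not the package's implementation: the tuned model (lm with a polynomial term), the hypothetical parameter degree, the use of 4 folds, and mean squared error as the performance measure are all assumptions chosen to keep the example self-contained.

```r
# Hand-rolled grid search; illustrative only, not e1071's implementation
set.seed(1)                               # reproducible fold assignment
ranges <- list(degree = 1:3)              # hypothetical parameter: polynomial degree
grid <- expand.grid(ranges)               # one row per parameter combination

cross <- 4
folds <- sample(rep(seq_len(cross), length.out = nrow(mtcars)))

# Mean squared error of one parameter value on one held-out fold
fold_error <- function(degree, fold) {
  train <- mtcars[folds != fold, ]
  test <- mtcars[folds == fold, ]
  fit <- lm(mpg ~ poly(hp, degree), data = train)
  mean((test$mpg - predict(fit, newdata = test))^2)
}

# Aggregate the per-fold errors with mean (the role of sampling.aggregate)
grid$error <- sapply(grid$degree, function(d)
  mean(sapply(seq_len(cross), function(f) fold_error(d, f))))

best <- which.min(grid$error)
result <- list(
  best.parameters = grid[best, "degree", drop = FALSE],  # 1 x k data frame, k = 1
  best.performance = grid$error[best],
  performances = grid,
  # refit on the complete data using the best parameter value
  best.model = lm(mpg ~ poly(hp, grid$degree[best]), data = mtcars)
)
str(result[1:3])
```

With an actual tune object, the corresponding components would be accessed in the same way, for example obj$best.parameters or obj$performances.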

Examples

library(e1071)
data(iris)

## tune `svm' for classification with RBF-kernel (default in svm),
## using one split for training/validation set
obj <- tune(svm, Species~., data = iris, sampling = "fix",
            ranges = list(gamma = 2^(-1:1), cost = 2^(2:4)))

## alternatively:
## obj <- tune.svm(Species~., data = iris, gamma = 2^(-1:1), cost = 2^(2:4))

summary(obj)
plot(obj)

## tune `knn' using a convenience function; this time with the
## conventional interface and bootstrap sampling:
x <- iris[,-5]
y <- iris[,5]
obj2 <- tune.knn(x, y, k = 1:5, sampling = "boot")
summary(obj2)
plot(obj2)

## tune `rpart' for regression, using 10-fold cross validation (default)
data(mtcars)
obj3 <- tune.rpart(mpg~., data = mtcars, minsplit = c(5,10,15))
summary(obj3)
plot(obj3)
