tune_bayes: Bayesian optimization of model parameters.

Description

tune_bayes() uses models to generate new candidate tuning parameter combinations based on previous results.

Usage

tune_bayes(object, ...)
# S3 method for model_spec
tune_bayes(
  object,
  preprocessor,
  resamples,
  ...,
  iter = 10,
  param_info = NULL,
  metrics = NULL,
  objective = exp_improve(),
  initial = 5,
  control = control_bayes()
)
# S3 method for workflow
tune_bayes(
  object,
  resamples,
  ...,
  iter = 10,
  param_info = NULL,
  metrics = NULL,
  objective = exp_improve(),
  initial = 5,
  control = control_bayes()
)

Value

A tibble of results that mirror those generated by tune_grid(). However, these results contain an .iter column and replicate the rset

object multiple times over iterations (at limited additional memory costs).

Arguments

object: A parsnip model specification or a workflows::workflow().
...: Options to pass to GPfit::GP_fit() (mostly for the corr argument).
preprocessor: A traditional model formula or a recipe created using recipes::recipe().
resamples: An rset() object.
iter: The maximum number of search iterations.
param_info: A dials::parameters() object or NULL. If none is given, a parameters set is derived from other arguments. Passing this argument can be useful when parameter ranges need to be customized.
metrics: A yardstick::metric_set() object containing information on how models will be evaluated for performance. The first metric in metrics is the one that will be optimized.
objective: A character string for what metric should be optimized or an acquisition function object.
initial: An initial set of results in a tidy format (as would result from tune_grid()) or a positive integer. It is suggested that the number of initial results be greater than the number of parameters being optimized.
control: A control object created by control_bayes()

Parallel Processing

The foreach package is used here. To execute the resampling iterations in parallel, register a parallel backend function. See the documentation for foreach::foreach() for examples.

For the most part, warnings generated during training are shown as they occur and are associated with a specific resample when control_bayes(verbose = TRUE). They are (usually) not aggregated until the end of processing.

For Bayesian optimization, parallel processing is used to estimate the resampled performance values once a new candidate set of values are estimated.

Initial Values

The results of tune_grid(), or a previous run of tune_bayes() can be used in the initial argument. initial can also be a positive integer. In this case, a space-filling design will be used to populate a preliminary set of results. For good results, the number of initial values should be more than the number of parameters being optimized.

Parameter Ranges and Values

In some cases, the tuning parameter values depend on the dimensions of the data (they are said to contain unknown values). For example, mtry in random forest models depends on the number of predictors. In such cases, the unknowns in the tuning parameter object must be determined beforehand and passed to the function via the param_info argument. dials::finalize() can be used to derive the data-dependent parameters. Otherwise, a parameter set can be created via dials::parameters(), and the dials update() function can be used to specify the ranges or values.

Performance Metrics

To use your own performance metrics, the yardstick::metric_set() function can be used to pick what should be measured for each model. If multiple metrics are desired, they can be bundled. For example, to estimate the area under the ROC curve as well as the sensitivity and specificity (under the typical probability cutoff of 0.50), the metrics argument could be given:


  metrics = metric_set(roc_auc, sens, spec)

Each metric is calculated for each candidate model.

If no metric set is provided, one is created:

For regression models, the root mean squared error and coefficient of determination are computed.
For classification, the area under the ROC curve and overall accuracy are computed.

Note that the metrics also determine what type of predictions are estimated during tuning. For example, in a classification problem, if metrics are used that are all associated with hard class predictions, the classification probabilities are not created.

The out-of-sample estimates of these metrics are contained in a list column called .metrics. This tibble contains a row for each metric and columns for the value, the estimator type, and so on.

collect_metrics() can be used for these objects to collapse the results over the resampled (to obtain the final resampling estimates per tuning parameter combination).

Obtaining Predictions

When control_bayes(save_pred = TRUE), the output tibble contains a list column called .predictions that has the out-of-sample predictions for each parameter combination in the grid and each fold (which can be very large).

The elements of the tibble are tibbles with columns for the tuning parameters, the row number from the original data object (.row), the outcome data (with the same name(s) of the original data), and any columns created by the predictions. For example, for simple regression problems, this function generates a column called .pred and so on. As noted above, the prediction columns that are returned are determined by the type of metric(s) requested.

This list column can be unnested using tidyr::unnest() or using the convenience function collect_predictions().

Extracting Information

The extract control option will result in an additional function to be returned called .extracts. This is a list column that has tibbles containing the results of the user's function for each tuning parameter combination. This can enable returning each model and/or recipe object that is created during resampling. Note that this could result in a large return object, depending on what is returned.

The control function contains an option (extract) that can be used to retain any model or recipe that was created within the resamples. This argument should be a function with a single argument. The value of the argument that is given to the function in each resample is a workflow object (see workflows::workflow() for more information). There are two helper functions that can be used to easily pull out the recipe (if any) and/or the model: extract_recipe() and extract_model().

As an example, if there is interest in getting each model back, one could use:


  extract = function (x) extract_fit_parsnip(x)

Note that the function given to the extract argument is evaluated on every model that is fit (as opposed to every model that is evaluated). As noted above, in some cases, model predictions can be derived for sub-models so that, in these cases, not every row in the tuning parameter grid has a separate R object associated with it.

Details

The optimization starts with a set of initial results, such as those generated by tune_grid(). If none exist, the function will create several combinations and obtain their performance estimates.

Using one of the performance estimates as the model outcome, a Gaussian process (GP) model is created where the previous tuning parameter combinations are used as the predictors.

A large grid of potential hyperparameter combinations is predicted using the model and scored using an acquisition function. These functions usually combine the predicted mean and variance of the GP to decide the best parameter combination to try next. For more information, see the documentation for exp_improve() and the corresponding package vignette.

The best combination is evaluated using resampling and the process continues.