brms (version 2.1.0)

kfold.brmsfit: K-Fold Cross-Validation

Description

Perform exact \(K\)-fold cross-validation by refitting the model \(K\) times, each time leaving out one \(K\)th of the original data.

Usage

# S3 method for brmsfit
kfold(x, ..., compare = TRUE, K = 10, Ksub = NULL,
  exact_loo = FALSE, group = NULL, newdata = NULL, save_fits = FALSE,
  update_args = list())

kfold(x, ...)

Arguments

x

A fitted model object typically of class brmsfit.

...

Optionally more fitted model objects.

compare

A flag indicating if the information criteria of the models should be compared to each other via compare_ic.

K

The number of subsets of equal (if possible) size into which the data will be randomly partitioned for performing \(K\)-fold cross-validation. The model is refit K times, each time leaving out one of the K subsets. If K is equal to the total number of observations in the data, then \(K\)-fold cross-validation is equivalent to exact leave-one-out cross-validation.

Ksub

Optional number of subsets (of those subsets defined by K) to be evaluated. If NULL (the default), \(K\)-fold cross-validation will be performed on all subsets. If Ksub is a single integer, Ksub subsets (out of all K subsets) will be randomly chosen. If Ksub consists of multiple integers, the corresponding subsets will be used. This argument is primarily useful if evaluating all subsets is infeasible for some reason.

exact_loo

Logical; if TRUE, exact leave-one-out cross-validation will be performed and K will be ignored. This argument alters the way argument group is handled as described below. Defaults to FALSE.

group

Optional name of a grouping variable or factor in the model. How this variable is handled depends on argument exact_loo. If exact_loo is FALSE, the data are split into subsets by factor level, and each refit omits all observations of one level; argument K is ignored. If exact_loo is TRUE, all observations corresponding to the factor level of the currently predicted single value are omitted. Thus, in this case, the predicted values are only a subset of the omitted ones.

newdata

An optional data.frame for which to evaluate predictions. If NULL (default), the original data of the model is used.

save_fits

If TRUE, a component fits is added to the returned object to store the cross-validated brmsfit objects and the indices of the omitted observations for each fold. Defaults to FALSE.

update_args

A list of further arguments passed to update.brmsfit such as iter, chains, or cores.
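
The way group, exact_loo and Ksub select observations, as described above, can be illustrated with a small base R sketch. It does not call brms and is not the package's internal code; the vector patient and every other name in it are made up for illustration.

```r
# Toy grouping factor: 12 observations from 4 patients (hypothetical data).
patient <- factor(rep(c("p1", "p2", "p3", "p4"), times = c(2, 4, 3, 3)))

# exact_loo = FALSE with a grouping variable: one fold per factor level.
# Each list element holds the row indices omitted in one refit.
folds_by_group <- split(seq_along(patient), patient)

# Ksub as a single integer: evaluate that many randomly chosen folds.
set.seed(1)
Ksub <- 2
chosen <- sort(sample(length(folds_by_group), Ksub))

# Ksub as several integers: evaluate exactly those folds.
chosen_fixed <- c(1, 3)

# exact_loo = TRUE with a grouping variable: to predict observation i,
# all rows sharing its factor level are omitted, but only row i is predicted.
i <- 5
omitted <- which(patient == patient[i])
predicted <- i
```

In the last step, omitted contains four rows while predicted is a single row, which is the sense in which the predicted values are only a subset of the omitted ones.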

Value

kfold returns an object with a structure similar to that of the objects returned by the loo and waic methods.

Methods (by class)

  • brmsfit: kfold method for brmsfit objects

Details

The kfold function performs exact \(K\)-fold cross-validation. First, the data are randomly partitioned into \(K\) subsets of equal (or as close to equal as possible) size. Then the model is refit \(K\) times, each time leaving out one of the \(K\) subsets. If \(K\) is equal to the total number of observations in the data, then \(K\)-fold cross-validation is equivalent to exact leave-one-out cross-validation (to which loo is an efficient approximation). The compare_ic function is also compatible with the objects returned by kfold.
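
The partitioning step can be sketched in base R as follows. This is only an illustration under assumed names, not the code brms runs internally: make_folds is a hypothetical helper, and N = 23 is an arbitrary sample size chosen so that the folds cannot all be the same size.

```r
# Hypothetical helper (not part of brms): assign each of N observations
# to one of K folds at random, with fold sizes differing by at most one.
make_folds <- function(N, K) {
  stopifnot(K >= 2, K <= N)
  sample(rep_len(seq_len(K), N))
}

set.seed(2018)
N <- 23
K <- 10
fold_id <- make_folds(N, K)
table(fold_id)  # number of observations in each fold

for (k in seq_len(K)) {
  held_out <- which(fold_id == k)  # rows omitted from the k-th refit
  training <- which(fold_id != k)  # rows the k-th refit would use
  # a real run would refit the model on `training` and score `held_out` here
}

# With K equal to N, every fold holds exactly one observation,
# which corresponds to exact leave-one-out cross-validation.
loo_folds <- make_folds(N, N)
```

The loop body is left empty on purpose: each iteration stands for one full model refit, which is why exact \(K\)-fold cross-validation is much more expensive than the approximation computed by loo.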

See Also

loo, reloo

Examples

# NOT RUN {
library(brms)

fit1 <- brm(count ~ log_Age_c + log_Base4_c * Trt +
              (1|patient) + (1|obs),
            data = epilepsy, family = poisson())
# throws a warning about some Pareto k estimates being too high
(loo1 <- loo(fit1))
# perform 10-fold cross-validation; sampler settings such as chains
# and cores are passed on to update.brmsfit via update_args
(kfold1 <- kfold(fit1, update_args = list(chains = 2, cores = 2)))
# }