Perform exact \(K\)-fold cross-validation by refitting the model \(K\) times, each time leaving out one \(K\)th of the original data.
# S3 method for brmsfit
kfold(x, ..., compare = TRUE, K = 10, Ksub = NULL,
  exact_loo = FALSE, group = NULL, resp = NULL, model_names = NULL,
  save_fits = FALSE)

kfold(x, ...)
x: A fitted model object.
...: More fitted model objects or further arguments passed to the underlying post-processing functions.
compare: A flag indicating whether the information criteria of the models should be compared to each other via compare_ic.
K: The number of subsets of equal (if possible) size into which the data will be randomly partitioned for performing \(K\)-fold cross-validation. The model is refit K times, each time leaving out one of the K subsets. If K is equal to the total number of observations in the data, then \(K\)-fold cross-validation is equivalent to exact leave-one-out cross-validation.
Ksub: Optional number of subsets (of those subsets defined by K) to be evaluated. If NULL (the default), \(K\)-fold cross-validation will be performed on all subsets. If Ksub is a single integer, Ksub subsets (out of all K subsets) will be randomly chosen. If Ksub consists of multiple integers or a one-dimensional array (created via as.array), potentially of length one, the corresponding subsets will be used. This argument is primarily useful if evaluating all subsets is infeasible.
exact_loo: Logical; if TRUE, exact leave-one-out cross-validation will be performed and K will be ignored. This argument alters the way argument group is handled, as described below. Defaults to FALSE.
group: Optional name of a grouping variable or factor in the model. How this variable is handled depends on argument exact_loo. If exact_loo is FALSE, the data are split into subsets, each time omitting all observations of one of the factor levels; argument K is ignored. If exact_loo is TRUE, all observations corresponding to the factor level of the currently predicted single value are omitted. Thus, in this case, the predicted values are only a subset of the omitted ones.
resp: Optional names of response variables. If specified, fitted values of these response variables are returned.
model_names: If NULL (the default), model names are derived by deparsing the call. Otherwise, the passed values are used as model names.
save_fits: If TRUE, a component fits is added to the returned object to store the cross-validated brmsfit objects and the indices of the omitted observations for each fold. Defaults to FALSE.
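The way K, Ksub, and group select subsets can be illustrated with a small base R sketch. This is only an illustration under assumed inputs, not the code brms runs internally: the sizes, the variable names (fold_id, folds_random, folds_fixed, omitted), and the toy patient factor are all made up for the example.

```r
# Hypothetical illustration in base R; NOT the internal brms implementation.
set.seed(1)
N <- 23   # number of observations (made-up value)
K <- 5    # number of subsets

# Random partition into K subsets of (nearly) equal size:
# recycle the labels 1..K to length N, then shuffle them.
fold_id <- sample(rep(seq_len(K), length.out = N))
table(fold_id)   # subset sizes differ by at most one

# Ksub as a single integer: a random choice of Ksub of the K subsets.
Ksub <- 2
folds_random <- sample(seq_len(K), Ksub)

# Ksub as a one-dimensional array of length one: exactly subset 3.
folds_fixed <- as.array(3)

# group with exact_loo = FALSE: one subset per factor level (K is ignored).
patient <- factor(rep(c("a", "b", "c"), times = c(4, 3, 5)))
omitted <- split(seq_along(patient), patient)  # row indices left out per fold
```

Note the difference between `Ksub = 3` (three randomly chosen subsets) and `Ksub = as.array(3)` (only the third subset), which is why the one-dimensional array form exists.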
kfold returns an object with a structure similar to that of the objects returned by the loo and waic methods.
brmsfit: kfold method for brmsfit objects
The kfold function performs exact \(K\)-fold cross-validation. First the data are randomly partitioned into \(K\) subsets of equal (or as close to equal as possible) size. Then the model is refit \(K\) times, each time leaving out one of the \(K\) subsets. If \(K\) is equal to the total number of observations in the data, then \(K\)-fold cross-validation is equivalent to exact leave-one-out cross-validation (to which loo is an efficient approximation). The compare_ic function is also compatible with the objects returned by kfold.
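The partition-and-refit procedure described above can be sketched in a few lines of base R. This is a rough stand-in under stated assumptions, not what kfold does internally: it uses lm() on the built-in cars data in place of a brmsfit, and it scores held-out rows with a plug-in normal log density, which is a choice made only for this illustration.

```r
# Illustrative stand-in: lm() on the built-in cars data, not a brmsfit.
set.seed(42)
dat <- cars
N <- nrow(dat)
K <- 10

# Randomly partition the rows into K subsets of near-equal size.
fold_id <- sample(rep(seq_len(K), length.out = N))

# Refit the model K times, each time leaving out one subset and scoring it.
held_out_lpd <- numeric(N)
for (k in seq_len(K)) {
  test <- fold_id == k
  fit_k <- lm(dist ~ speed, data = dat[!test, ])
  mu <- predict(fit_k, newdata = dat[test, ])
  # Plug-in log predictive density of the held-out rows (assumed scoring rule).
  held_out_lpd[test] <- dnorm(dat$dist[test], mean = mu,
                              sd = summary(fit_k)$sigma, log = TRUE)
}
sum(held_out_lpd)  # total held-out log predictive density

# With K equal to N, every subset holds exactly one observation,
# which corresponds to exact leave-one-out cross-validation.
loo_folds <- sample(rep(seq_len(N), length.out = N))
```

Each observation is predicted exactly once, by a model that never saw it, which is what distinguishes this from evaluating the fit on its own training data.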
# NOT RUN {
library(brms)

fit1 <- brm(count ~ log_Age_c + log_Base4_c * Trt +
              (1|patient) + (1|obs),
            data = epilepsy, family = poisson())
# throws warning about some pareto k estimates being too high
(loo1 <- loo(fit1))
# perform 10-fold cross-validation
(kfold1 <- kfold(fit1, chains = 2, cores = 2))
# }