Run the same brms model on multiple datasets and then combine the results into one fitted model object. This is particularly useful for multiple imputation of missing values, where the same model is fitted to each of several imputed datasets. Models can be run in parallel using the future package.
brm_multiple(formula, data, family = gaussian(), prior = NULL,
  autocor = NULL, cov_ranef = NULL,
  sample_prior = c("no", "yes", "only"), sparse = NULL, knots = NULL,
  stanvars = NULL, stan_funs = NULL, combine = TRUE, fit = NA,
  seed = NA, file = NULL, ...)
formula: An object of class formula, brmsformula, or mvbrmsformula (or one that can be coerced to one of those classes). A symbolic description of the model to be fitted. The details of model specification are explained in brmsformula.
data: A list of data.frames, each of which will be used to fit a separate model. Alternatively, a mids object from the mice package.
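As a rough illustration of the list form of data, the sketch below builds several completed copies of a toy data frame using base R only. The toy data, the helper impute_once, and the naive resampling scheme are assumptions made for this example; in practice a proper imputation method (for instance via mice) would be used.

```r
# Toy data with missing values (hypothetical; for illustration only)
set.seed(1)
dat <- data.frame(x = rnorm(20), y = rnorm(20))
dat$y[c(3, 7, 12)] <- NA

# Hypothetical helper: fill missing y values by resampling observed ones.
# This is a deliberately naive stand-in for a real imputation method.
impute_once <- function(d) {
  miss <- is.na(d$y)
  d$y[miss] <- sample(d$y[!miss], sum(miss), replace = TRUE)
  d
}

# A list of completed data.frames, one per imputation
imp_list <- lapply(1:5, function(i) impute_once(dat))

# Not run here (requires brms and a working Stan toolchain):
# fit <- brms::brm_multiple(y ~ x, data = imp_list, chains = 1)
```

Each element of imp_list would then be used to fit one model.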
family: A description of the response distribution and link function to be used in the model. This can be a family function, a call to a family function, or a character string naming the family. Every family function has a link argument that specifies the link function to be applied to the response variable. If not specified, default links are used. For details of the supported families, see brmsfamily. By default, a linear gaussian model is applied. In multivariate models, family may also be a list of families.
autocor: An optional cor_brms object describing the correlation structure within the response variable (i.e., the 'autocorrelation'). See the documentation of cor_brms for a description of the available correlation structures. Defaults to NULL, corresponding to no correlations. In multivariate models, autocor may also be a list of autocorrelation structures.
cov_ranef: A list of matrices that are proportional to the (within) covariance structure of the group-level effects. The names of the matrices should correspond to columns in data that are used as grouping factors. All levels of the grouping factor should appear as rownames of the corresponding matrix. This argument can be used, among other things, to model pedigrees and phylogenetic effects. See vignette("brms_phylogenetics") for more details.
sample_prior: Indicates whether samples from all specified proper priors should be drawn in addition to the posterior samples (defaults to "no"). Among other things, these samples can be used to calculate Bayes factors for point hypotheses via hypothesis. If set to "only", samples are drawn solely from the priors, ignoring the likelihood, which makes it possible, for example, to generate samples from the prior predictive distribution. In this case, all parameters must have proper priors.
sparse: (Deprecated) Logical; indicates whether the population-level design matrices should be treated as sparse (defaults to FALSE). For design matrices with many zeros, this can considerably reduce required memory. Sampling speed is currently not improved and may even be slightly decreased.
knots: Optional list containing user-specified knot values to be used for basis construction of smoothing terms. See gamm for more details.
stanvars: An optional stanvars object generated by the function stanvar to define additional variables for use in Stan's program blocks.
stan_funs: (Deprecated) An optional character string containing self-defined Stan functions, which will be included in the functions block of the generated Stan code. It is now recommended to use the stanvars argument for this purpose instead.
combine: Logical; indicates whether the fitted models should be combined into a single fitted model object via combine_models. Defaults to TRUE.
fit: An instance of S3 class brmsfit_multiple derived from a previous fit; defaults to NA. If fit is of class brmsfit_multiple, the compiled model associated with the fitted result is re-used and all arguments modifying the model code or data are ignored. Rather than using this argument directly, it is recommended to call the update method instead.
seed: The seed for random number generation to make results reproducible. If NA (the default), Stan will set the seed randomly.
file: Either NULL or a character string. In the latter case, the fitted model object is saved via saveRDS in a file named after the string supplied in file. The .rds extension is added automatically. If the file already exists, brm will load and return the saved model object instead of refitting the model. Because existing files are not overwritten, you have to remove the file manually in order to refit and save the model under an existing file name. The file name is stored in the brmsfit object for later use.
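The load-or-fit behaviour described for file can be pictured with a small base R sketch. This is only an illustration of the caching pattern, not the brms implementation: the helper fit_or_load, the counter n_fits, and the use of lm() as a cheap stand-in for a model fit are all assumptions made for the example.

```r
# Hypothetical helper (not part of brms): fit once, then reuse the saved result
fit_or_load <- function(file, fit_fun) {
  path <- paste0(file, ".rds")   # the extension is appended to the given name
  if (file.exists(path)) {
    return(readRDS(path))        # existing file: load it, do not refit
  }
  fit <- fit_fun()
  saveRDS(fit, path)
  fit
}

# Stand-in "model": a cheap lm() on a built-in dataset instead of a brms fit
base <- file.path(tempdir(), "demo_fit")
unlink(paste0(base, ".rds"))     # start from a clean state
n_fits <- 0
fit_fun <- function() {
  n_fits <<- n_fits + 1          # count how often the model is actually fitted
  lm(dist ~ speed, data = cars)
}

first  <- fit_or_load(base, fit_fun)  # fits and saves
second <- fit_or_load(base, fit_fun)  # loads from disk; fit_fun is not called

# To refit under the same name, the existing file must be removed first
unlink(paste0(base, ".rds"))
third <- fit_or_load(base, fit_fun)   # fits and saves again
```

The second call returns the stored object without refitting, which is why a stale file has to be deleted by hand when the model or data change.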
...: Further arguments passed to brm.
If combine = TRUE, a brmsfit_multiple object, which inherits from class brmsfit and behaves essentially the same. If combine = FALSE, a list of brmsfit objects.
The combined model may issue false positive convergence warnings, as the MCMC chains corresponding to different datasets may not necessarily overlap, even if each of the original models did converge. To find out whether each of the original models converged, investigate fit$rhats, where fit denotes the output of brm_multiple.
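A minimal sketch of such a per-dataset check follows. It assumes that fit$rhats can be treated as a data frame with one row per dataset and one column per parameter. That layout, the mock values, and the 1.1 cutoff are assumptions for illustration and are not taken from the text above.

```r
# Mock stand-in for fit$rhats (made-up values, not real model output)
rhats <- data.frame(
  b_Intercept = c(1.00, 1.01, 1.00),
  b_age       = c(1.02, 1.00, 1.15),
  sigma       = c(1.00, 1.00, 1.01)
)

threshold <- 1.1  # assumed rule-of-thumb cutoff, adjust as needed

# Largest Rhat value within each dataset (row)
max_rhat <- apply(rhats, 1, max)

# Indices of datasets whose own model shows signs of non-convergence
flagged <- which(max_rhat > threshold)
```

With a real fit, the mock data frame would be replaced by fit$rhats. If no dataset is flagged, convergence warnings from the combined model are more likely to reflect differences between datasets than a sampling problem.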
# NOT RUN {
library(mice)
imp <- mice(nhanes2)
# fit the model using mice and lm
fit_imp1 <- with(lm(bmi ~ age + hyp + chl), data = imp)
summary(pool(fit_imp1))
# fit the model using brms
fit_imp2 <- brm_multiple(bmi ~ age + hyp + chl, data = imp, chains = 1)
summary(fit_imp2)
plot(fit_imp2, pars = "^b_")
# investigate convergence of the original models
fit_imp2$rhats
# use the future package for parallelization
# (multisession replaces the multiprocess plan, which is deprecated in future)
library(future)
plan(multisession)
fit_imp3 <- brm_multiple(bmi ~ age + hyp + chl, data = imp, chains = 1)
summary(fit_imp3)
# }