Usage
brm(formula, data, family = gaussian(), prior = NULL, autocor = NULL, nonlinear = NULL, threshold = c("flexible", "equidistant"), cov_ranef = NULL, save_ranef = TRUE, save_mevars = FALSE, sparse = FALSE, sample_prior = FALSE, knots = NULL, stan_funs = NULL, fit = NA, inits = "random", chains = 4, iter = 2000, warmup = floor(iter/2), thin = 1, cores = getOption("mc.cores", 1L), control = NULL, algorithm = c("sampling", "meanfield", "fullrank"), silent = TRUE, seed = 12345, save_model = NULL, save_dso = TRUE, ...)
Arguments
formula
An object of class brmsformula
(or one that can be coerced to that class):
a symbolic description of the model to be fitted.
The details of model specification are explained in brmsformula.
data
An object of class data.frame
(or one that can be coerced to that class)
containing data of all variables used in the model.
family
A description of the response distribution and link function
to be used in the model. This can be a family function,
a call to a family function, or a character string naming the family.
Every family function has a link argument that specifies
the link function to be applied on the response variable.
If not specified, default links are used.
For details of supported families see brmsfamily.
prior
One or more brmsprior objects created by function set_prior
and combined using the c method.
A single brmsprior object may be passed without c() surrounding it.
See also get_prior for more help.
autocor
An optional cor_brms object describing the correlation structure
within the response variable (i.e. the 'autocorrelation').
See the documentation of cor_brms for a description
of the available correlation structures. Defaults to NULL,
corresponding to no correlations.
nonlinear
An optional list of formulas, specifying
linear models for non-linear parameters. If NULL (the default),
formula is treated as an ordinary formula.
If not NULL, formula is treated as a non-linear model
and nonlinear should contain a formula for each non-linear
parameter, which has the parameter on the left hand side and its
linear predictor on the right hand side.
Alternatively, it can be a single formula with all non-linear
parameters on the left hand side (separated by a +) and a
common linear predictor on the right hand side.
More information is given under 'Details'.
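As a hedged illustration of the two forms described above, the following sketch builds a nonlinear specification for a hypothetical exponential-growth model. The parameter names b1 and b2, the variables y and x, and the data frame dat are assumptions made for this example only. The brm() call is left commented out because it needs brms and a Stan toolchain, and a real model may additionally need priors on the non-linear parameters.

```r
# Hypothetical model: y grows exponentially in x with parameters b1 and b2.
model_formula <- y ~ b1 * exp(b2 * x)

# Form 1: a list with one formula per non-linear parameter.
nl_list <- list(b1 ~ 1, b2 ~ 1)

# Form 2: a single formula with all parameters on the left hand side
# and a common linear predictor on the right hand side.
nl_single <- b1 + b2 ~ 1

# Extract the parameter names from the left hand sides.
params_list <- vapply(nl_list, function(f) all.vars(f[[2]]), character(1))
params_single <- all.vars(nl_single[[2]])

# Not run (assumes a data frame 'dat' with columns y and x):
# fit <- brm(model_formula, data = dat, nonlinear = nl_list)
```

Both forms name the same parameters here; the list form is the one to use when the parameters need different linear predictors.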
threshold
A character string indicating the type of thresholds
(i.e. intercepts) used in an ordinal model.
"flexible" provides the standard unstructured thresholds and
"equidistant" restricts the distance between
consecutive thresholds to the same value.
cov_ranef
A list of matrices that are proportional to the
(within) covariance structure of the group-level effects.
The names of the matrices should correspond to columns
in data that are used as grouping factors.
All levels of the grouping factor should appear as rownames
of the corresponding matrix. This argument can be used,
among other things, to model pedigrees and phylogenetic effects.
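The naming requirements above can be sketched as follows. The grouping factor species, the response y, and the matrix values are all made up for illustration; a real application would take the matrix from a pedigree or phylogeny. The brm() call is commented out because it requires brms and a Stan toolchain.

```r
# Hypothetical data with a grouping factor 'species' that has three levels.
dat <- data.frame(
  y = c(1.2, 0.8, 1.9, 2.1, 0.5, 0.7),
  species = factor(c("a", "a", "b", "b", "c", "c"))
)

# Illustrative covariance matrix (made-up values). Rows and columns
# are named after the levels of the grouping factor.
A <- matrix(c(1.0, 0.5, 0.2,
              0.5, 1.0, 0.3,
              0.2, 0.3, 1.0),
            nrow = 3,
            dimnames = list(levels(dat$species), levels(dat$species)))

# The list name must match the grouping column in 'data'.
cov_list <- list(species = A)

# Check that every level of the grouping factor appears as a row name.
levels_covered <- all(levels(dat$species) %in% rownames(cov_list$species))

# Not run:
# fit <- brm(y ~ 1 + (1 | species), data = dat, cov_ranef = cov_list)
```

Running a check like levels_covered before fitting catches mismatched level names early, before any model compilation starts.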
save_ranef
A flag to indicate if group-level effects
for each level of the grouping factor(s)
should be saved (default is TRUE).
Set to FALSE to save memory.
The argument has no impact on the model fitting itself.
A deprecated alias is ranef.
save_mevars
A flag to indicate if samples
of noise-free variables obtained by using me terms
should be saved (default is FALSE).
Saving these samples allows methods such as predict
to be used with the noise-free variables, but
leads to very large R objects even for models
of moderate size and complexity.
sparse
Logical; indicates whether the population-level
design matrix should be treated as sparse (defaults to FALSE).
For design matrices with many zeros, this can considerably
reduce required memory. For univariate sparse models, it may be
sensible to prevent the design matrix from being centered
(see 'Details' for more information), as centering may
reduce sparsity.
For all models using multivariate syntax
(i.e. multivariate linear models, zero-inflated and hurdle models
as well as categorical models), setting sparse = TRUE
is generally worth a try to decrease memory requirements.
However, sampling speed is currently not improved and may even be
slightly decreased.
sample_prior
A flag to indicate if samples from all specified
proper priors should be drawn in addition to the posterior samples
(defaults to FALSE). Among other things, these samples can be used
to calculate Bayes factors for point hypotheses.
Alternatively, sample_prior can be set to "only" to
sample solely from the priors. In this case, all parameters must
have proper priors.
knots
Optional list containing user specified knot values to be
used for basis construction of smoothing terms. For details see gamm.
stan_funs
An optional character string containing self-defined
Stan functions, which will be included in the functions block
of the generated Stan code.
fit
An instance of S3 class brmsfit derived from a previous fit;
defaults to NA.
If fit is of class brmsfit, the compiled model associated
with the fitted result is re-used and all arguments
modifying the model code or data are ignored.
inits
Either "random" or "0".
If inits is "random" (the default),
Stan will randomly generate initial values for parameters.
If it is "0", all parameters are initialized to zero.
This option is recommended for exponential and weibull models,
because the default ("random") inits can cause samples
to be essentially constant.
Generally, setting inits = "0" is worth a try
if chains do not behave well.
Alternatively, inits can be a list of lists containing
the initial values, or a function (or function name) generating initial values.
The latter options are mainly implemented for internal testing.
chains
Number of Markov chains (defaults to 4).
iter
Number of total iterations per chain (including warmup; defaults to 2000).
warmup
A positive integer specifying the number of warmup (aka burnin) iterations.
This also specifies the number of iterations used for stepsize adaptation,
so warmup samples should not be used for inference. The number of warmup
iterations should not be larger than iter, and the default is floor(iter/2).
thin
Thinning rate. Must be a positive integer.
Set thin > 1 to save memory and computation time if iter is large.
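To make the interplay of chains, iter, warmup, and thin concrete, here is a small arithmetic sketch of how many post-warmup draws are retained. It assumes that thin divides iter - warmup evenly; the variable names are illustrative and nothing here calls brms.

```r
# Default settings taken from the Usage section.
chains <- 4
iter <- 2000
warmup <- floor(iter / 2)
thin <- 1

# Post-warmup draws kept per chain and in total
# (assuming thin divides iter - warmup evenly).
draws_per_chain <- (iter - warmup) / thin
total_draws <- chains * draws_per_chain

# The same calculation with a thinning rate of 5.
total_draws_thin5 <- chains * (iter - warmup) / 5
```

With the defaults this gives 4000 retained draws in total, and 800 when thin = 5.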
cores
Number of cores to use when executing the chains in parallel.
The default is 1, but we recommend setting the mc.cores option
to as many processors as the hardware and RAM allow (up to the number of chains).
For non-Windows OS in non-interactive R sessions, forking is used
instead of PSOCK clusters. A deprecated alias is cluster.
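One possible way to follow the recommendation above is sketched below, using detectCores() from the parallel package that ships with R. Capping at the number of chains and falling back to 1 when the core count is unknown are choices made for this example, not something brms prescribes.

```r
# Number of chains that will be run (the default from the Usage section).
chains <- 4

# detectCores() can return NA on some platforms, so fall back to 1.
available <- parallel::detectCores()
if (is.na(available)) available <- 1L

# Use at most one core per chain.
n_cores <- min(available, chains)

# brm() reads its default for 'cores' from this option.
options(mc.cores = n_cores)
```

Setting the option once per session means later brm() calls pick it up without passing cores explicitly.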
control
A named list of parameters to control the sampler's behavior.
It defaults to NULL so all the default values are used.
The most important control parameters are discussed in the 'Details'
section below. For a comprehensive overview see stan.
algorithm
Character string indicating the estimation approach to use.
Can be "sampling" for MCMC (the default), "meanfield" for
variational inference with independent normal distributions, or
"fullrank" for variational inference with a multivariate normal
distribution.
silent
Logical; if TRUE, informational messages of
the compiler and sampler are suppressed.
seed
Used by set.seed to make results reproducible.
save_model
Either NULL or a character string.
In the latter case, the model code is
saved in a file named after the string supplied in save_model,
which may also contain the full path where the file should be saved.
If only a name is given, the file is saved in the current working directory.
save_dso
Logical, defaulting to TRUE, indicating whether
the dynamic shared object (DSO) compiled from the C++ code for the model
will be saved or not. If TRUE, we can draw samples from the same
model in another R session using the saved DSO
(i.e., without compiling the C++ code again).
...
Further arguments to be passed to Stan.