Learn R Programming

brms (version 2.22.0)

custom_family: Custom Families in brms Models

Description

Define custom families (i.e. response distribution) for use in brms models. It allows users to benefit from the modeling flexibility of brms, while applying their self-defined likelihood functions. All of the post-processing methods for brmsfit objects can be made compatible with custom families. See vignette("brms_customfamilies") for more details. For a list of built-in families see brmsfamily.

Usage

custom_family(
  name,
  dpars = "mu",
  links = "identity",
  type = c("real", "int"),
  lb = NA,
  ub = NA,
  vars = NULL,
  loop = TRUE,
  specials = NULL,
  threshold = "flexible",
  log_lik = NULL,
  posterior_predict = NULL,
  posterior_epred = NULL,
  predict = NULL,
  fitted = NULL,
  env = parent.frame()
)

Value

An object of class customfamily inheriting from class brmsfamily.

Arguments

name

Name of the custom family.

dpars

Names of the distributional parameters of the family. One parameter must be named "mu" and the main formula of the model will correspond to that parameter.

links

Names of the link functions of the distributional parameters.

type

Indicates if the response distribution is continuous ("real") or discrete ("int"). This controls if the corresponding density function will be named with <name>_lpdf or <name>_lpmf.

lb

Vector of lower bounds of the distributional parameters. Defaults to NA that is no lower bound.

ub

Vector of upper bounds of the distributional parameters. Defaults to NA that is no upper bound.

vars

Names of variables that are part of the likelihood function without being distributional parameters. That is, vars can be used to pass data to the likelihood. Such arguments will be added to the list of function arguments at the end, after the distributional parameters. See stanvar for details about adding self-defined data to the generated Stan model. Addition arguments vreal and vint may be used for this purpose as well (see Examples below). See also brmsformula and addition-terms for more details.

loop

Logical; Should the likelihood be evaluated via a loop (TRUE; the default) over observations in Stan? If FALSE, the Stan code will be written in a vectorized manner over observations if possible.

specials

A character vector of special options to enable for this custom family. Currently for internal use only.

threshold

Optional threshold type for custom ordinal families. Ignored for non-ordinal families.

log_lik

Optional function to compute log-likelihood values of the model in R. This is only relevant if one wants to ensure compatibility with method log_lik.

posterior_predict

Optional function to compute posterior prediction of the model in R. This is only relevant if one wants to ensure compatibility with method posterior_predict.

posterior_epred

Optional function to compute expected values of the posterior predictive distribution of the model in R. This is only relevant if one wants to ensure compatibility with method posterior_epred.

predict

Deprecated alias of `posterior_predict`.

fitted

Deprecated alias of `posterior_epred`.

env

An environment in which certain post-processing functions related to the custom family can be found, if there were not directly passed to custom_family. This is only relevant if one wants to ensure compatibility with the methods log_lik, posterior_predict, or posterior_epred. By default, env is the environment from which custom_family is called.

Details

The corresponding probability density or mass Stan functions need to have the same name as the custom family. That is if a family is called myfamily, then the Stan functions should be called myfamily_lpdf or myfamily_lpmf depending on whether it defines a continuous or discrete distribution.

See Also

brmsfamily, brmsformula, stanvar

Examples

Run this code
if (FALSE) {
## demonstrate how to fit a beta-binomial model
## generate some fake data
phi <- 0.7
n <- 300
z <- rnorm(n, sd = 0.2)
ntrials <- sample(1:10, n, replace = TRUE)
eta <- 1 + z
mu <- exp(eta) / (1 + exp(eta))
a <- mu * phi
b <- (1 - mu) * phi
p <- rbeta(n, a, b)
y <- rbinom(n, ntrials, p)
dat <- data.frame(y, z, ntrials)

# define a custom family
beta_binomial2 <- custom_family(
  "beta_binomial2", dpars = c("mu", "phi"),
  links = c("logit", "log"), lb = c(NA, 0),
  type = "int", vars = "vint1[n]"
)

# define the corresponding Stan density function
stan_density <- "
  real beta_binomial2_lpmf(int y, real mu, real phi, int N) {
    return beta_binomial_lpmf(y | N, mu * phi, (1 - mu) * phi);
  }
"
stanvars <- stanvar(scode = stan_density, block = "functions")

# fit the model
fit <- brm(y | vint(ntrials) ~ z, data = dat,
           family = beta_binomial2, stanvars = stanvars)
summary(fit)


# define a *vectorized* custom family (no loop over observations)
# notice also that 'vint' no longer has an observation index
beta_binomial2_vec <- custom_family(
  "beta_binomial2", dpars = c("mu", "phi"),
  links = c("logit", "log"), lb = c(NA, 0),
  type = "int", vars = "vint1", loop = FALSE
)

# define the corresponding Stan density function
stan_density_vec <- "
  real beta_binomial2_lpmf(array[] int y, vector mu, real phi, array[] int N) {
    return beta_binomial_lpmf(y | N, mu * phi, (1 - mu) * phi);
  }
"
stanvars_vec <- stanvar(scode = stan_density_vec, block = "functions")

# fit the model
fit_vec <- brm(y | vint(ntrials) ~ z, data = dat,
           family = beta_binomial2_vec,
           stanvars = stanvars_vec)
summary(fit_vec)
}

Run the code above in your browser using DataLab