Function for conducting a Bayesian A/B test (i.e., test between two proportions).
ab_test(
  data = NULL,
  prior_par = list(mu_psi = 0, sigma_psi = 1, mu_beta = 0, sigma_beta = 1),
  prior_prob = NULL,
  nsamples = 10000,
  is_df = 5,
  posterior = FALSE,
  y = NULL,
  n = NULL
)
data: list or data frame with the data. This list (data frame) needs to
contain the following elements: y1 (number of "successes" in the
control condition), n1 (number of trials in the control condition),
y2 (number of "successes" in the experimental condition), n2
(number of trials in the experimental condition). Each of these elements
needs to be an integer. Alternatively, the user can provide for each of the
elements a vector with a cumulative sequence of "successes"/trials. This
allows the user to produce a sequential plot of the posterior probabilities
for each hypothesis by passing the result object of class "ab" to
the plot_sequential function. Sequential data can also be
provided in form of a data frame or matrix that has the columns "outcome"
(containing only 0 and 1 to indicate the binary outcome) and "group"
(containing only 1 and 2 to indicate the group membership). Note that the
data can also be provided by specifying the arguments y and
n instead (not possible for sequential data).
prior_par: list with prior parameters. This list needs to contain the
following elements: mu_psi (prior mean for the normal prior on the
test-relevant log odds ratio), sigma_psi (prior standard deviation
for the normal prior on the test-relevant log odds ratio), mu_beta
(prior mean for the normal prior on the grand mean of the log odds),
sigma_beta (prior standard deviation for the normal prior on the
grand mean of the log odds). Each of the elements needs to be a real number
(the standard deviations need to be positive). The defaults are standard
normal priors for both the log odds ratio parameter and the grand mean of
the log odds parameter.
prior_prob: named vector with prior probabilities for the four
hypotheses "H1", "H+", "H-", and "H0".
"H1" states that the "success" probability differs between the
control and the experimental condition but does not specify which one is
higher. "H+" states that the "success" probability in the
experimental condition is higher than in the control condition, "H-"
states that the "success" probability in the experimental condition is
lower than in the control condition. "H0" states that the "success"
probability is identical (i.e., there is no effect). The one-sided
hypotheses "H+" and "H-" are obtained by truncating the
normal prior on the log odds ratio so that it assigns prior mass only to
the allowed log odds ratio values (e.g., for "H+" a normal prior
that is truncated from below at 0). If NULL (default) the prior
probabilities are set to c(0, 1/4, 1/4, 1/2). That is, the default
assigns prior probability .5 to the hypothesis that there is no effect
(i.e., "H0"). The remaining prior probability (i.e., also .5) is
split evenly across the hypothesis that there is a positive effect (i.e.,
"H+") and the hypothesis that there is a negative effect (i.e.,
"H-").
nsamples: determines the number of importance samples for obtaining the
log marginal likelihood for "H+" and "H-" and the number of
posterior samples in case posterior = TRUE. The default is
10000.
is_df: degrees of freedom of the multivariate t importance sampling
proposal density. The default is 5.
posterior: Boolean which indicates whether posterior samples should be
returned. The default is FALSE.
y: integer vector of length 2 containing the number of "successes" in the control and experimental conditions
n: integer vector of length 2 containing the number of trials in the control and experimental conditions
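The input forms described above can be sketched in base R as follows. The trial-by-trial record `raw` is made up for illustration, and the cumulative construction is an assumption about what "a cumulative sequence of successes/trials" means (one entry per observation, counting each group separately). It is not taken from the package source. The calls to `ab_test` are guarded so that the snippet also runs without the package installed.

```r
# Aggregate counts, as in the examples on this page
data_list <- list(y1 = 10, n1 = 28, y2 = 14, n2 = 26)

# The same counts split into the y and n arguments
# (control condition first, experimental condition second)
y <- c(data_list$y1, data_list$y2)
n <- c(data_list$n1, data_list$n2)

# Hypothetical trial-by-trial record: one row per observation
raw <- data.frame(
  outcome = c(1, 0, 0, 1, 1, 0, 1, 1),  # binary outcome
  group   = c(1, 2, 1, 2, 2, 1, 1, 2)   # 1 = control, 2 = experimental
)

# Assumed sequential format: running totals per group after each observation
seq_data <- list(
  y1 = cumsum(raw$outcome * (raw$group == 1)),
  n1 = cumsum(raw$group == 1),
  y2 = cumsum(raw$outcome * (raw$group == 2)),
  n2 = cumsum(raw$group == 2)
)

# Both calls describe the same aggregate data set
if (requireNamespace("abtest", quietly = TRUE)) {
  ab_a <- abtest::ab_test(data = data_list)
  ab_b <- abtest::ab_test(y = y, n = n)
}
```

The last entries of the cumulative vectors equal the aggregate counts for the made-up record, which is a quick way to check that a sequential data set was built consistently.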
returns an object of class "ab" with components:
input: a list with the input arguments.
post: a
  list with parameter posterior samples for the three hypotheses "H1",
  "H+" (in the output called "Hplus"), and "H-" (in the
  output called "Hminus"). Only contains samples if posterior =
  TRUE.
laplace: a list with the approximate parameter
  posterior mode and variance/covariance matrix for each hypothesis obtained
  via a Laplace approximation.
method: character that indicates
  the method that has been used to obtain the results. The default is
  "log-is" (importance sampling with multivariate t proposal based on
  a Laplace approximation to the log transformed posterior). If this method
  fails (for the one-sided hypotheses), method "is-sn" is used (i.e.,
  importance sampling is used to obtain unconstrained samples, then a
  skew-normal distribution is fitted to the samples to obtain the results for
  the one-sided hypotheses). If method = "is-sn", posterior samples
  can only be obtained for "H1".
logml: a list with the
  estimated log marginal likelihoods for the hypotheses "H0" (i.e.,
  "logml0"), "H1" (i.e., "logml1"), "H+" (i.e.,
  "logmlplus"), and "H-" (i.e., "logmlminus").
post_prob: a named vector with the posterior probabilities of the
  four hypotheses "H1", "H+", "H-", and "H0".
logbf: a list with the log Bayes factor in favor of
  "H1" over "H0", the log Bayes factor in favor of "H+"
  over "H0", and the log Bayes factor in favor of "H-" over
  "H0".
bf: a list with the Bayes factor in favor of
  "H1" over "H0" (i.e., "bf10"), the Bayes factor in
  favor of "H+" over "H0" (i.e., "bfplus0"), and the
  Bayes factor in favor of "H-" over "H0" (i.e.,
  "bfminus0").
The implemented Bayesian A/B test is based on the following model by
  Kass and Vaidyanathan (1992, section 3):
  $$\log(p_1/(1 - p_1)) = \beta - \psi/2$$
  $$\log(p_2/(1 - p_2)) = \beta + \psi/2$$
  $$y_1 \sim \text{Binomial}(n_1, p_1)$$
  $$y_2 \sim \text{Binomial}(n_2, p_2).$$
  "H0" states that \(\psi = 0\), "H1" states that \(\psi \neq 0\),
  "H+" states that \(\psi > 0\), and "H-" states that \(\psi < 0\).
  Normal priors are
  assigned to the two parameters \(\psi\) (i.e., the test-relevant log odds
  ratio) and \(\beta\) (i.e., the grand mean of the log odds which is a
  nuisance parameter). Log marginal likelihoods for "H0" and
  "H1" are obtained via Laplace approximations (see Kass &
  Vaidyanathan, 1992) which work well even for very small sample sizes. For
  the one-sided hypotheses "H+" and "H-" the log marginal
  likelihoods are obtained based on importance sampling which uses as a
  proposal a multivariate t distribution with location and scale matrix
  obtained via a Laplace approximation to the (log-transformed) posterior. If
  posterior = TRUE, posterior samples are obtained using importance
  sampling.
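The two estimation steps described above can be sketched in base R. This is an illustrative re-implementation under stated assumptions, not the package's internal code. The helper names (`log_lik`, `log_joint_h0`, `log_joint_h1`, `log_joint_hplus`, `laplace`) are hypothetical, the posterior mode is located numerically with `optim`, and the one-sided case uses the transformation \(\xi = \log \psi\) as one reading of "log-transformed posterior".

```r
# Data and default prior settings as used elsewhere on this page
dat <- list(y1 = 10, n1 = 28, y2 = 14, n2 = 26)
prior <- list(mu_psi = 0, sigma_psi = 1, mu_beta = 0, sigma_beta = 1)

# Binomial log likelihood for grand mean beta and log odds ratio psi
log_lik <- function(beta, psi, dat) {
  dbinom(dat$y1, dat$n1, plogis(beta - psi / 2), log = TRUE) +
    dbinom(dat$y2, dat$n2, plogis(beta + psi / 2), log = TRUE)
}

# Unnormalized log posteriors (log likelihood + log prior)
log_joint_h0 <- function(par, dat, prior) {     # par = beta, psi fixed at 0
  log_lik(par[1], 0, dat) +
    dnorm(par[1], prior$mu_beta, prior$sigma_beta, log = TRUE)
}
log_joint_h1 <- function(par, dat, prior) {     # par = c(beta, psi)
  log_lik(par[1], par[2], dat) +
    dnorm(par[1], prior$mu_beta, prior$sigma_beta, log = TRUE) +
    dnorm(par[2], prior$mu_psi, prior$sigma_psi, log = TRUE)
}
log_joint_hplus <- function(par, dat, prior) {  # par = c(beta, xi), psi = exp(xi)
  psi <- exp(par[2])
  log_lik(par[1], psi, dat) +
    dnorm(par[1], prior$mu_beta, prior$sigma_beta, log = TRUE) +
    dnorm(psi, prior$mu_psi, prior$sigma_psi, log = TRUE) -
    pnorm(0, prior$mu_psi, prior$sigma_psi,     # normalizer of the truncated prior
          lower.tail = FALSE, log.p = TRUE) +
    par[2]                                      # log Jacobian of psi = exp(xi)
}

# Laplace approximation: Gaussian fit at the numerically located mode
laplace <- function(log_joint, start, ...) {
  fit <- optim(start, function(p) -log_joint(p, ...),
               method = "BFGS", hessian = TRUE)
  cov <- solve(fit$hessian)
  logdet <- as.numeric(determinant(cov, logarithm = TRUE)$modulus)
  list(mode = fit$par, cov = cov,
       logml = -fit$value + length(start) / 2 * log(2 * pi) + 0.5 * logdet)
}
lap0 <- laplace(log_joint_h0, 0, dat = dat, prior = prior)
lap1 <- laplace(log_joint_h1, c(0, 0), dat = dat, prior = prior)

# Importance sampling for H+ with a multivariate t proposal
set.seed(1)
nsamples <- 10000; is_df <- 5; d <- 2
lap_plus <- laplace(log_joint_hplus, c(0, 0), dat = dat, prior = prior)
z <- matrix(rnorm(d * nsamples), nrow = d)
scale_mix <- sqrt(is_df / rchisq(nsamples, is_df))
draws <- lap_plus$mode +
  (t(chol(lap_plus$cov)) %*% z) * rep(scale_mix, each = d)

# Log density of the proposal at each draw (columns of draws)
dev <- draws - lap_plus$mode
maha <- colSums(dev * solve(lap_plus$cov, dev))
log_q <- lgamma((is_df + d) / 2) - lgamma(is_df / 2) -
  d / 2 * log(is_df * pi) -
  0.5 * as.numeric(determinant(lap_plus$cov, logarithm = TRUE)$modulus) -
  (is_df + d) / 2 * log1p(maha / is_df)

# Importance weights and the resulting log marginal likelihood estimate
log_p <- apply(draws, 2, log_joint_hplus, dat = dat, prior = prior)
log_w <- log_p - log_q
logml_plus <- max(log_w) + log(mean(exp(log_w - max(log_w))))

c(logml0 = lap0$logml, logml1 = lap1$logml, logmlplus = logml_plus)
```

The heavy tails of the t proposal (here with 5 degrees of freedom, matching the default of `is_df`) help keep the importance weights bounded when the posterior is not well approximated by a Gaussian.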
Kass, R. E., & Vaidyanathan, S. K. (1992). Approximate Bayes factors and orthogonal parameters, with application to testing equality of two binomial proportions. Journal of the Royal Statistical Society, Series B, 54, 129-144. doi:10.1111/j.2517-6161.1992.tb01868.x
Gronau, Q. F., Raj K. N., A., & Wagenmakers, E.-J. (2021). Informed Bayesian Inference for the A/B Test. Journal of Statistical Software, 100. doi:10.18637/jss.v100.i17
elicit_prior allows the user to elicit a prior by
  providing quantiles for either the log odds ratio, the odds ratio, the
  relative risk, or the absolute risk. The resulting prior is always
  translated to the corresponding normal prior on the log odds ratio. The
  plot_prior function allows the user to visualize the prior
  distribution. The simulate_priors function produces samples
  from the prior distribution. The prior and posterior probabilities of the
  hypotheses can be visualized using the prob_wheel function.
  Parameter posteriors can be visualized using the
  plot_posterior function. The plot_sequential
  function allows the user to sequentially plot the posterior probabilities
  of the hypotheses (only possible if the data object contains vectors
  with the cumulative "successes"/trials).
# NOT RUN {
# load the package that provides ab_test and plot_posterior
library(abtest)
# synthetic data
data <- list(y1 = 10, n1 = 28, y2 = 14, n2 = 26)
# Bayesian A/B test with default settings
ab <- ab_test(data = data)
print(ab)
# different prior parameter settings
prior_par <- list(mu_psi = 0.2, sigma_psi = 0.8,
                  mu_beta = 0, sigma_beta = 0.7)
ab2 <- ab_test(data = data, prior_par = prior_par)
print(ab2)
# different prior probabilities
prior_prob <- c(.1, .3, .2, .4)
names(prior_prob) <- c("H1", "H+", "H-", "H0")
ab3 <- ab_test(data = data, prior_prob = prior_prob)
print(ab3)
# also possible to obtain posterior samples
ab4 <- ab_test(data = data, posterior = TRUE)
# plot parameter posterior
plot_posterior(x = ab4, what = "logor")
# }