betabin: Beta-binomial and chance-corrected beta-binomial models for over-dispersed binomial data

Description

Fits the beta-binomial model and the chance-corrected beta-binomial model to (over-dispersed) binomial data.

Usage

betabin(data, start = c(.5,.5),
        method = c("duotrio", "tetrad", "threeAFC", "twoAFC",
          "triangle", "hexad", "twofive", "twofiveF"),
        vcov = TRUE, corrected = TRUE, gradTol = 1e-4, ...)
# S3 method for betabin
summary(object, level = 0.95, ...)

Arguments

data

matrix or data.frame with two columns; the first column contains the number of successes and the second the total number of cases. The number of rows should correspond to the number of observations.

start

starting values to be used in the optimization

vcov

logical, should the variance-covariance matrix of the parameters be computed?

method

the sensory discrimination protocol for which d-prime and its standard error should be computed

corrected

logical; should the chance-corrected (TRUE) or the standard (FALSE) beta-binomial model be estimated?

gradTol

a warning is issued if max|gradient| > gradTol, where 'gradient' is the gradient at the parameter values at which the optimizer terminates. gradTol is not used as a termination or convergence criterion during model fitting.

object

an object of class "betabin", i.e. the result of betabin().

level

the confidence level of the confidence intervals computed by the summary method

…

betabin: The only recognized (hidden) argument is doFit (logical), which is TRUE by default. When doFit = FALSE, betabin returns an environment that facilitates examination of the likelihood surface via the (hidden) functions sensR:::getParBB and sensR:::setParBB. Not used in summary.betabin.

Value

An object of class betabin with elements

coefficients

named vector of coefficients

vcov

variance-covariance matrix of the parameter estimates if vcov = TRUE

data

the data supplied to the function

call

the matched call

logLik

the value of the log-likelihood at the MLEs

method

the method used for the fit

convergence

0 indicates convergence. For other error messages, see optim.

message

possible error message - see optim for details

counts

the number of iterations used in the optimization - see optim for details

corrected

is the chance corrected model estimated?

logLikNull

log-likelihood of the binomial model with prop = pGuess

logLikMu

log-likelihood of a binomial model with prop = sum(x)/sum(n)

Details

The beta-binomial models are parameterized in terms of mu and gamma, where mu is a probability parameter and gamma measures over-dispersion. Both parameters are restricted to the interval (0, 1). The parameters of the standard (i.e. corrected = FALSE) beta-binomial model refer to the mean (i.e. probability) and dispersion on the scale of the observations, i.e. on the scale of the probability of a correct answer (Pc). The parameters of the chance-corrected (i.e. corrected = TRUE) beta-binomial model refer to the mean and dispersion on the scale of the "probability of discrimination" (Pd). The mean parameter (mu) is therefore restricted to the interval from zero to one in both models, but it has a different interpretation in each.

The summary method uses the estimate of mu to infer the parameters of the sensory experiment: Pc, Pd and d-prime. These are restricted to their allowed ranges, e.g. Pc is always at least as large as the guessing probability.
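The relation between the Pc and Pd scales can be sketched in base R as follows. This is a minimal illustration, not the code used by summary.betabin: the helper names pd_to_pc, pc_to_pd and pc_to_dprime_2afc are hypothetical, the guessing probability of 1/2 is the standard value for the duo-trio and 2-AFC protocols, and the d-prime conversion is shown only for the 2-AFC protocol, where Pc = pnorm(d-prime / sqrt(2)). Other protocols have their own psychometric functions, which are not reproduced here.

```r
# Guessing probability for the duo-trio and 2-AFC protocols.
p_guess <- 1/2

# Pd -> Pc: a correct answer comes from discrimination or from a lucky guess.
pd_to_pc <- function(pd, p_guess) p_guess + (1 - p_guess) * pd

# Pc -> Pd, truncated at zero so that Pd stays in its allowed range.
pc_to_pd <- function(pc, p_guess) pmax(0, (pc - p_guess) / (1 - p_guess))

# Pc -> d-prime for the 2-AFC protocol only (hypothetical helper);
# Pc is truncated at the guessing probability so that d-prime >= 0.
pc_to_dprime_2afc <- function(pc) sqrt(2) * qnorm(pmax(pc, 1/2))

pd_to_pc(0.2, p_guess)   # 0.6
pc_to_pd(0.6, p_guess)   # 0.2
pc_to_pd(0.4, p_guess)   # 0: a Pc below the guessing probability maps to Pd = 0
```

In the chance-corrected model mu plays the role of Pd, so Pc would follow from pd_to_pc; in the standard model mu plays the role of Pc, so Pd would follow from pc_to_pd.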

Confidence intervals are computed as Wald (normal-based) intervals on the mu-scale, and the confidence limits are subsequently transformed to the Pc, Pd and d-prime scales. Confidence limits are restricted to the allowed ranges of the parameters; for example, no confidence limit will be less than zero.
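As a rough sketch of this construction, assuming a chance-corrected fit (so that mu is on the Pd scale) and the duo-trio guessing probability of 1/2: the estimate and standard error below are made-up numbers for illustration, not output from betabin, and the variable names are hypothetical.

```r
# Hypothetical inputs for illustration only; not results from betabin().
mu_hat  <- 0.30   # estimate of mu, read as Pd (corrected = TRUE)
se_mu   <- 0.08   # standard error of mu_hat
level   <- 0.95   # confidence level
p_guess <- 1/2    # guessing probability for the duo-trio protocol

# Wald interval on the mu scale.
z     <- qnorm(1 - (1 - level) / 2)
ci_mu <- mu_hat + c(-1, 1) * z * se_mu

# Restrict the limits to the allowed range [0, 1] of Pd ...
ci_pd <- pmin(1, pmax(0, ci_mu))

# ... and transform the limits (not the standard error) to the Pc scale.
ci_pc <- p_guess + (1 - p_guess) * ci_pd

rbind(pd = ci_pd, pc = ci_pc)
```

Because the limits themselves are transformed, the interval on the Pc scale is generally not symmetric around the transformed estimate once truncation applies.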

Standard errors, and therefore also confidence intervals, are only available if the parameters are not at the boundary of their allowed range (parameter space). If parameters are close to the boundaries of their allowed range, standard errors, and also confidence intervals, may be misleading. The likelihood ratio tests are more accurate. More accurate confidence intervals such as profile likelihood intervals may be implemented in the future.

The summary method provides a likelihood ratio test of over-dispersion on one degree of freedom and a likelihood ratio test of association (i.e. where the null hypothesis is "no difference" and the alternative hypothesis is "any difference") on two degrees of freedom (chi-square tests). Since the gamma parameter is tested on the boundary of the parameter space, the correct number of degrees of freedom for the first test is probably 1/2 rather than one, or somewhere in between, and the latter test is probably also on fewer than two degrees of freedom. Research is needed to determine the appropriate number of degrees of freedom to use in each case. The choices used here are believed to be conservative, so the stated p-values are probably a little too large.
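The arithmetic of these tests can be sketched as below. The helper lr_test is hypothetical, and the log-likelihood values are invented for illustration. The pairing of log-likelihoods with tests (logLik against logLikMu for over-dispersion, logLik against logLikNull for association) is an assumption based on the descriptions in the Value section, not a statement about the internals of summary.betabin.

```r
# Likelihood ratio test: 2 * (difference in log-likelihood), compared
# with a chi-square distribution on 'df' degrees of freedom.
lr_test <- function(loglik_alt, loglik_null, df) {
  stat <- 2 * (loglik_alt - loglik_null)
  c(statistic = stat, df = df,
    p.value = pchisq(stat, df = df, lower.tail = FALSE))
}

# Invented log-likelihood values, for illustration only.
logLik_bb   <- -30   # beta-binomial model          (element logLik)
logLik_mu   <- -34   # binomial, prop = sum(x)/sum(n) (element logLikMu)
logLik_null <- -36   # binomial, prop = pGuess        (element logLikNull)

od    <- lr_test(logLik_bb, logLik_mu,   df = 1)  # over-dispersion
assoc <- lr_test(logLik_bb, logLik_null, df = 2)  # association

# A smaller df gives a smaller p-value for the same statistic, which is
# why using df = 1 rather than 1/2 is described as conservative.
od_half <- lr_test(logLik_bb, logLik_mu, df = 0.5)
```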

The log-likelihood of the standard beta-binomial model is $$\ell(\alpha, \beta; x, n) = \sum_{j=1}^N \left\{ \log {n_j \choose x_j} - \log Beta(\alpha, \beta) + \log Beta(\alpha + x_j, \beta - x_j + n_j) \right\} $$

and the log-likelihood of the chance corrected beta-binomial model is $$\ell(\alpha, \beta; x, n) = \sum_{j=1}^N \left\{ C + \log \left[ \sum_{i=0}^{x_j} {{x_j} \choose i} (1-p_g)^{n_j-x_j+i} p_g^{x_j-i} Beta(\alpha + i, n_j - x_j + \beta) \right] \right\} $$ where $$C = \log {n_j \choose x_j} - \log Beta(\alpha, \beta) $$

and where $\mu = \alpha/(\alpha + \beta)$, $\gamma = 1/(\alpha + \beta + 1)$, $Beta$ is the Beta function, cf. beta, $N$ is the number of independent binomial observations, i.e. the number of rows in data, and $p_g$ is the guessing probability, pGuess.
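The two log-likelihoods above can be transcribed directly into base R. The following is a sketch for checking the formulas, not the implementation used by betabin; the function names mugamma_to_ab, ll_bb and ll_ccbb and the argument names a, b (for $\alpha$, $\beta$) and p_guess (for $p_g$) are hypothetical.

```r
# (mu, gamma) -> (alpha, beta), inverting mu = alpha / (alpha + beta)
# and gamma = 1 / (alpha + beta + 1).
mugamma_to_ab <- function(mu, gamma) {
  s <- 1 / gamma - 1                     # s = alpha + beta
  c(alpha = mu * s, beta = (1 - mu) * s)
}

# Log-likelihood of the standard beta-binomial model.
ll_bb <- function(a, b, x, n) {
  sum(lchoose(n, x) - lbeta(a, b) + lbeta(a + x, b - x + n))
}

# Log-likelihood of the chance-corrected beta-binomial model.
# The inner sum is evaluated on the natural scale, which is adequate for
# small n but could underflow for very large counts.
ll_ccbb <- function(a, b, x, n, p_guess) {
  one_obs <- function(xj, nj) {
    i <- 0:xj
    inner <- sum(choose(xj, i) *
                 (1 - p_guess)^(nj - xj + i) * p_guess^(xj - i) *
                 beta(a + i, nj - xj + b))
    lchoose(nj, xj) - lbeta(a, b) + log(inner)   # C + log[ ... ]
  }
  sum(mapply(one_obs, x, n))
}

# Example: mu = 0.5 and gamma = 1/3 correspond to alpha = beta = 1.
mugamma_to_ab(0.5, 1/3)
```

Two sanity checks follow from the formulas: with p_guess = 0 only the term i = x_j survives, so ll_ccbb reduces to ll_bb; and for a single observation with fixed n, exponentiating either log-likelihood over x = 0, ..., n gives probabilities that sum to one.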

The variance-covariance matrix (and standard errors) is based on the inverted Hessian at the optimum. The Hessian is obtained with the hessian function from the numDeriv package.

The gradient at the optimum is evaluated with the grad function from the numDeriv package.

The bounded optimization is performed with the "L-BFGS-B" optimizer in optim.
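The fitting steps described above can be imitated with base R alone. This sketch fits the standard (uncorrected) beta-binomial model to the data from the Examples section and is not the code used by betabin. It rests on several assumptions: the optimization is taken to run directly over (mu, gamma), the bound offset eps is an arbitrary choice, stats::optimHess stands in for the numDeriv Hessian, and a simple central-difference helper (grad_fd, hypothetical) stands in for the numDeriv gradient so that the block has no package dependencies.

```r
# Data from the Examples section of this page.
x <- c(3,2,6,8,3,4,6,0,9,9,0,2,1,2,8,9,5,7)
n <- c(10,9,8,9,8,6,9,10,10,10,9,9,10,10,10,10,9,10)

# Negative log-likelihood of the standard beta-binomial model,
# written in terms of par = c(mu, gamma).
nll <- function(par) {
  s <- 1 / par[2] - 1          # alpha + beta
  a <- par[1] * s              # alpha
  b <- (1 - par[1]) * s        # beta
  -sum(lchoose(n, x) - lbeta(a, b) + lbeta(a + x, b - x + n))
}

# Bounded optimization with L-BFGS-B; eps keeps both parameters
# strictly inside (0, 1). The value of eps is an assumption.
eps <- 1e-6
fit <- optim(par = c(0.5, 0.5), fn = nll, method = "L-BFGS-B",
             lower = rep(eps, 2), upper = rep(1 - eps, 2))

# Variance-covariance matrix: inverse of the Hessian of the negative
# log-likelihood at the optimum (optimHess replaces numDeriv here).
H  <- optimHess(fit$par, nll)
vc <- solve(H)
se <- sqrt(diag(vc))

# Central-difference gradient (stand-in for the numDeriv gradient).
grad_fd <- function(f, p, h = 1e-6) {
  sapply(seq_along(p), function(k) {
    e <- replace(numeric(length(p)), k, h)
    (f(p + e) - f(p - e)) / (2 * h)
  })
}

# Gradient check in the spirit of gradTol: flag a large gradient.
max_grad <- max(abs(grad_fd(nll, fit$par)))
grad_tol <- 1e-4
if (max_grad > grad_tol) message("gradient is not close to zero at the optimum")
```

If an estimate sits on a bound, the Hessian-based standard errors from such a fit are not meaningful, which matches the caveat about boundary estimates given earlier in this section.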

The following additional methods are implemented for objects of class betabin: print, vcov and logLik.

References

Brockhoff, P.B. (2003). The statistical power of replications in difference tests. Food Quality and Preference, 14, pp. 405--417.

Examples


## Load the package that provides betabin():
library(sensR)

## Create data: x = number of successes, n = number of trials
x <- c(3,2,6,8,3,4,6,0,9,9,0,2,1,2,8,9,5,7)
n <- c(10,9,8,9,8,6,9,10,10,10,9,9,10,10,10,10,9,10)
dat <- data.frame(x, n)

## Chance-corrected beta-binomial model:
(bb0 <- betabin(dat, method = "duotrio"))
summary(bb0)
## Uncorrected (standard) beta-binomial model:
(bb <- betabin(dat, corrected = FALSE, method = "duotrio"))
summary(bb)
## Extractor methods:
vcov(bb)
logLik(bb)
AIC(bb)
coef(bb)
