dkbinom: Probability functions for the sum of k independent binomials

Description

The mass and distribution functions of the sum of k independent binomial random variables, with possibly different probabilities.

Usage

dkbinom(x, size, prob, log = FALSE, verbose = FALSE,
        method = c("butler", "fft"))
pkbinom(q, size, prob, log.p = FALSE, verbose = FALSE,
        method = c("butler", "naive", "fft"))

Arguments

x

Vector of values at which to evaluate the mass function of the sum of the k binomial variates

size

Vector of the number of trials

prob

Vector of the probabilities of success

log, log.p

logical; if TRUE, probabilities p are given as log(p) (see Note).

verbose

logical; if TRUE, prints output showing the iterations of the convolutions and the three arrays A, B, and C that are used to convolve and reconvolve the distributions. Array C holds the final result. See the source code in dkbinom.c for more details.

method

A character string that uniquely identifies the method; it may be abbreviated as long as it is unambiguous. butler is the preferred (and default) method, which uses the algorithm given by Butler and Stephens (1993). The naive method (pkbinom only) is an alternative approach that can be much slower and can handle a sum of no more than five binomials, but it is useful for validating the other methods. The naive method works only for a single value of q. The fft method uses the fast Fourier transform to compute the convolution of the k binomial variates, and is also useful for checking the other methods.

q

Vector of quantiles (values at which to evaluate the distribution function) of the sum of the k binomial variates
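
As a rough illustration of the idea behind the fft method, the following base R sketch multiplies the discrete Fourier transforms of the zero-padded binomial mass functions and inverts the product. The helper name fft_binom_pmf is hypothetical and is not part of the package; the package's own implementation may differ in its details.

```r
# Hypothetical helper (not part of the package): mass function of a sum of
# independent binomials, computed with the fast Fourier transform.
fft_binom_pmf <- function(size, prob) {
  stopifnot(length(size) == length(prob))
  n <- sum(size) + 1                 # support of the sum is 0, ..., sum(size)
  prod_fft <- rep(1 + 0i, n)
  for (i in seq_along(size)) {
    f <- numeric(n)                  # zero-pad each mass function to length n
    f[1:(size[i] + 1)] <- dbinom(0:size[i], size[i], prob[i])
    prod_fft <- prod_fft * fft(f)    # convolution becomes a product
  }
  # R's inverse fft is unnormalized, so divide by n
  pmf <- Re(fft(prod_fft, inverse = TRUE)) / n
  pmax(pmf, 0)                       # clip tiny negative rounding errors
}

pmf <- fft_binom_pmf(c(3, 4, 2), c(0.3, 0.5, 0.8))
pmf[c(0, 4, 7) + 1]                  # mass at 0, 4 and 7
cumsum(pmf)[4 + 1]                   # distribution function at 4
```

Padding every mass function to length sum(size) + 1 keeps the circular convolution computed by the FFT equal to the ordinary convolution, because the sum cannot exceed sum(size).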

Value

dkbinom gives the mass function, pkbinom gives the distribution function.

Details

size[1] and prob[1] are the size and probability of the first binomial variate, size[2] and prob[2] are the size and probability of the second binomial variate, etc.

If all elements of prob are equal, then pbinom or dbinom is used. Otherwise, repeated convolutions of the k binomials are used to calculate the mass or distribution function.
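
The repeated convolution described above can be sketched in base R as follows. This is a minimal illustration only, not the Butler and Stephens algorithm used by the package; the helper name conv_binom_pmf is hypothetical.

```r
# Hypothetical helper (not part of the package): mass function of a sum of
# independent binomials, built up one variate at a time by direct convolution.
conv_binom_pmf <- function(size, prob) {
  stopifnot(length(size) == length(prob))
  pmf <- 1                           # point mass at 0 (the empty sum)
  for (i in seq_along(size)) {
    f <- dbinom(0:size[i], size[i], prob[i])
    out <- numeric(length(pmf) + length(f) - 1)
    for (j in seq_along(f)) {
      idx <- j:(j + length(pmf) - 1) # shift the running pmf by j - 1
      out[idx] <- out[idx] + f[j] * pmf
    }
    pmf <- out
  }
  pmf                                # pmf[s + 1] is P(S = s), s = 0, ..., sum(size)
}

pmf <- conv_binom_pmf(c(3, 4, 2), c(0.3, 0.5, 0.8))
pmf[c(0, 4, 7) + 1]                  # mass function at 0, 4 and 7
cumsum(pmf)[c(0, 4, 7) + 1]          # distribution function at 0, 4 and 7

# With equal probabilities the sum is itself binomial with size sum(size)
all.equal(conv_binom_pmf(c(3, 4, 2), rep(0.5, 3)), dbinom(0:9, 9, 0.5))
```

The last line checks the shortcut mentioned above: when every element of prob is the same, the convolution agrees with a single call to dbinom on the total number of trials.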

References

The Butler method is based on the exact algorithm discussed by: Butler, Ken and Stephens, Michael. (1993) The Distribution of a Sum of Binomial Random Variables. Technical Report No. 467, Department of Statistics, Stanford University. http://www.dtic.mil/dtic/tr/fulltext/u2/a266969.pdf

Examples

# NOT RUN {
# A sum of 3 binomials...
dkbinom(c(0, 4, 7), c(3, 4, 2), c(0.3, 0.5, 0.8))
dkbinom(c(0, 4, 7), c(3, 4, 2), c(0.3, 0.5, 0.8), method = "b")
pkbinom(c(0, 4, 7), c(3, 4, 2), c(0.3, 0.5, 0.8))
pkbinom(c(0, 4, 7), c(3, 4, 2), c(0.3, 0.5, 0.8), method = "b")

# }
# NOT RUN {
# Compare the output of the 3 methods
pkbinom(4, c(3, 4, 2), c(0.3, 0.5, 0.8), method = "fft")
pkbinom(4, c(3, 4, 2), c(0.3, 0.5, 0.8), method = "butler")
pkbinom(4, c(3, 4, 2), c(0.3, 0.5, 0.8), method = "naive")

# Some inputs
n <- c(30000, 40000, 20000)
p <- c(0.02, 0.01, 0.005)

# Compare timings
x1 <- timeIt(pkbinom(1100, n, p, method = "butler"))
x2 <- timeIt(pkbinom(1100, n, p, method = "naive"))
x3 <- timeIt(pkbinom(1100, n, p, method = "fft"))
pvar(x1, x1 - x2, x2 - x3, x1 - x3, digits = 12)
# }