Learn R Programming

VGAM (version 0.8-2)

dirmul.old: Fitting a Dirichlet-Multinomial Distribution

Description

Fits a Dirichlet-multinomial distribution to a matrix of non-negative integers.

Usage

dirmul.old(link = "loge", earg = list(),
           init.alpha = 0.01, parallel = FALSE, zero = NULL)

Arguments

link
Link function applied to each of the $M$ (positive) shape parameters $\alpha_j$ for $j=1,\ldots,M$. See Links for more choices. Here, $M$ is the number of columns of the response matrix.
earg
List. Extra argument for link. See earg in Links for general information.
init.alpha
Numeric vector. Initial values for the alpha vector. Must be positive. Recycled to length $M$.
parallel
A logical, or formula specifying which terms have equal/unequal coefficients.
zero
An integer-valued vector specifying which linear/additive predictors are modelled as intercepts only. The values must be from the set {1,2,...,$M$}.

Value

Details

The Dirichlet-multinomial distribution, which is somewhat similar to a Dirichlet distribution, has probability function $$P(Y_1=y_1,\ldots,Y_M=y_M) = {2y_{*} \choose {y_1,\ldots,y_M}} \frac{\Gamma(\alpha_{+})}{\Gamma(2y_{*}+\alpha_{+})} \prod_{j=1}^M \frac{\Gamma(y_j+\alpha_{j})}{\Gamma(\alpha_{j})}$$ for $\alpha_j > 0$, $\alpha_+ = \alpha_1 + \cdots + \alpha_M$, and $2y_{*} = y_1 + \cdots + y_M$. Here, $a \choose b$ means ``$a$ choose $b$'' and refers to combinations (see choose). The (posterior) mean is $$E(Y_j) = (y_j + \alpha_j) / (2y_{*} + \alpha_{+})$$ for $j=1,\ldots,M$, and these are returned as the fitted values as a $M$-column matrix.

References

Lange, K. (2002) Mathematical and Statistical Methods for Genetic Analysis, 2nd ed. New York: Springer-Verlag.

Evans, M., Hastings, N. and Peacock, B. (2000) Statistical Distributions, New York: Wiley-Interscience, Third edition.

Paul, S. R., Balasooriya, U. and Banerjee, T. (2005) Fisher information matrix of the Dirichlet-multinomial distribution. Biometrical Journal, 47, 230--236.

Tvedebrink, T. (2010) Overdispersion in allelic counts and $\theta$-correction in forensic genetics. Theoretical Population Biology, 78, 200--210.

Documentation accompanying the VGAM package at http://www.stat.auckland.ac.nz/~yee contains further information and examples.

See Also

dirmultinomial, dirichlet, betabin.ab, multinomial.

Examples

Run this code
# Data from p.50 of Lange (2002)
alleleCounts <- c(2, 84, 59, 41, 53, 131, 2, 0,
       0, 50, 137, 78, 54, 51, 0, 0,
       0, 80, 128, 26, 55, 95, 0, 0,
       0, 16, 40, 8, 68, 14, 7, 1)
dim(alleleCounts) <- c(8, 4)
alleleCounts <- data.frame(t(alleleCounts))
dimnames(alleleCounts) <- list(c("White","Black","Chicano","Asian"),
                    paste("Allele", 5:12, sep = ""))

set.seed(123)  # @initialize uses random numbers
fit <- vglm(cbind(Allele5,Allele6,Allele7,Allele8,Allele9,
                  Allele10,Allele11,Allele12) ~ 1, dirmul.old,
             trace = TRUE, crit = "c", data = alleleCounts)

(sfit <- summary(fit))
vcov(sfit)
round(eta2theta(coef(fit), fit@misc$link), dig = 2)  # not preferred
round(Coef(fit), dig = 2) # preferred  # preferred
round(t(fitted(fit)), dig = 4) # 2nd row of Table 3.5 of Lange (2002)
coef(fit, matrix = TRUE)


pfit <- vglm(cbind(Allele5,Allele6,Allele7,Allele8,Allele9,
                   Allele10,Allele11,Allele12) ~ 1,
             dirmul.old(parallel = TRUE), trace = TRUE,
             data = alleleCounts)
round(eta2theta(coef(pfit), pfit@misc$link), dig = 2) # not preferred
round(Coef(pfit), dig = 2)   # preferred

Run the code above in your browser using DataLab