
VGAM (version 0.8-3)

dirichlet: Fitting a Dirichlet Distribution

Description

Fits a Dirichlet distribution to a matrix of compositions.

Usage

dirichlet(link = "loge", earg=list(), parallel = FALSE, zero=NULL)

Arguments

link
Link function applied to each of the $M$ (positive) shape parameters $\alpha_j$. See Links for more choices. The default gives $\eta_j=\log(\alpha_j)$.
earg
List. Extra argument for the link. See earg in Links for general information.
parallel, zero
See CommonVGAMffArguments for more information.

Value

  • An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm, rrvglm and vgam.

    When fitted, the fitted.values slot of the object contains the $M$-column matrix of means.

Details

In this help file the response is assumed to be an $M$-column matrix with positive values whose rows each sum to unity. Such data can be thought of as compositional data. There are $M$ linear/additive predictors $\eta_j$.

The Dirichlet distribution is commonly used to model compositional data, including applications in genetics. Suppose $(Y_1,\ldots,Y_{M})^T$ is the response. Then it has a Dirichlet distribution if $(Y_1,\ldots,Y_{M-1})^T$ has density $$\frac{\Gamma(\alpha_{+})} {\prod_{j=1}^{M} \Gamma(\alpha_{j})} \prod_{j=1}^{M} y_j^{\alpha_{j} -1}$$ where $\alpha_+=\alpha_1+\cdots+\alpha_M$, $\alpha_j > 0$, and the density is defined on the unit simplex $$\Delta_{M} = \left\{ (y_1,\ldots,y_{M})^T : y_1 > 0, \ldots, y_{M} > 0, \sum_{j=1}^{M} y_j = 1 \right\}.$$ One has $E(Y_j) = \alpha_j / \alpha_{+}$, which are returned as the fitted values. For this distribution Fisher scoring corresponds to Newton-Raphson.
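The density and the mean formula above can be evaluated directly in base R. The following is a minimal sketch, not VGAM code: the helper name `ddirichlet_log` is hypothetical, and the shape values are borrowed from the Examples section purely for illustration.

```r
# Log-density of a Dirichlet distribution at a single composition y.
# ddirichlet_log is a hypothetical helper name, not part of VGAM.
ddirichlet_log <- function(y, alpha) {
  stopifnot(length(y) == length(alpha), all(y > 0), all(alpha > 0),
            isTRUE(all.equal(sum(y), 1)))
  # log Gamma(alpha_+) - sum_j log Gamma(alpha_j) + sum_j (alpha_j - 1) log y_j
  lgamma(sum(alpha)) - sum(lgamma(alpha)) + sum((alpha - 1) * log(y))
}

alpha <- exp(c(-1, 1, 0))               # same shapes as in the Examples section
dirichlet_mean <- alpha / sum(alpha)    # E(Y_j) = alpha_j / alpha_+
round(dirichlet_mean, 4)

# With all shapes equal to 1 the density is constant on the simplex and
# equals Gamma(M), so for M = 3 the log-density is log(2).
ddirichlet_log(c(0.2, 0.3, 0.5), c(1, 1, 1))
```

Working on the log scale with `lgamma` avoids overflow in the gamma functions when the shape parameters are large.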

The Dirichlet distribution can be motivated by considering the random variables $(G_1,\ldots,G_{M})^T$, which are mutually independent, with $G_j$ following a gamma distribution with shape $\alpha_j$, unit scale, and density $f(g_j)=g_j^{\alpha_j - 1} e^{-g_j} / \Gamma(\alpha_j)$. (The $G_j$ are identically distributed only when all the $\alpha_j$ are equal.) Then the Dirichlet distribution arises when $Y_j=G_j / (G_1 + \cdots + G_M)$.
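The gamma construction can be checked by simulation using only base R. This is a rough sketch under stated assumptions: the shape values, sample size, and seed below are arbitrary illustrative choices, and the code does not reproduce how `rdiric` is implemented.

```r
set.seed(123)                      # arbitrary seed, for reproducibility only
alpha <- c(0.5, 2, 1)              # illustrative shape parameters (an assumption)
n <- 20000
M <- length(alpha)

# Column j holds n independent Gamma(shape = alpha_j, rate = 1) draws;
# the matrix is filled column by column, matching rep(alpha, each = n).
g <- matrix(rgamma(n * M, shape = rep(alpha, each = n), rate = 1),
            nrow = n, ncol = M)

# Normalise each row so that it sums to one: Y_j = G_j / (G_1 + ... + G_M).
y <- g / rowSums(g)

# Empirical column means should be close to alpha_j / alpha_+.
rbind(empirical = colMeans(y), theoretical = alpha / sum(alpha))
```

Each row of `y` is then one simulated composition, so a matrix like this could be passed as the response in the same way as the `rdiric` output in the Examples section.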

References

Lange, K. (2002) Mathematical and Statistical Methods for Genetic Analysis, 2nd ed. New York: Springer-Verlag.

Evans, M., Hastings, N. and Peacock, B. (2000) Statistical Distributions, 3rd ed. New York: Wiley-Interscience.

See Also

rdiric, dirmultinomial, multinomial, simplex.

Examples

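library(VGAM)
set.seed(1)  # arbitrary seed so the simulated data are reproducible
y = rdiric(n=1000, shape=exp(c(-1,1,0)))  # simulate 1000 compositions with M = 3
fit = vglm(y ~ 1, dirichlet, trace = TRUE, crit="c")  # intercept-only fit
Coef(fit)  # estimated shape parameters
coef(fit, matrix=TRUE)  # coefficients on the link (log) scale
head(fitted(fit))  # fitted means, alpha_j / alpha_+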
