In this help file the response is assumed to be an \(M\)-column
matrix with positive values whose rows each sum to unity.
Such data can be thought of as compositional data.
There are \(M\) linear/additive predictors \(\eta_j\).
The Dirichlet distribution is commonly used to model compositional
data, including applications in genetics.
Suppose \((Y_1,\ldots,Y_{M})^T\) is
the response. Then it has a Dirichlet distribution if
\((Y_1,\ldots,Y_{M-1})^T\) has density
$$\frac{\Gamma(\alpha_{+})}
{\prod_{j=1}^{M} \Gamma(\alpha_{j})}
\prod_{j=1}^{M} y_j^{\alpha_{j} -1}$$
where \(\alpha_+=\alpha_1+\cdots+\alpha_M\),
\(\alpha_j > 0\),
and the density is defined on the unit simplex
$$\Delta_{M} = \left\{
(y_1,\ldots,y_{M})^T :
y_1 > 0, \ldots, y_{M} > 0,
\sum_{j=1}^{M} y_j = 1 \right\}. $$
The mean is \(E(Y_j) = \alpha_j / \alpha_{+}\),
and these means are returned as the fitted values.
For this distribution Fisher scoring corresponds to Newton-Raphson.
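The density and mean given above can be checked numerically. The following is a minimal sketch in Python using only the standard library; the function names `dirichlet_logpdf` and `dirichlet_mean` are hypothetical and are not part of this package. It works on the log scale to avoid overflow in the gamma functions.

```python
import math

def dirichlet_logpdf(y, alpha):
    """Log-density of the Dirichlet distribution at a point y on the unit simplex.

    y     : sequence of M positive values summing to 1
    alpha : sequence of M positive shape parameters
    """
    if len(y) != len(alpha):
        raise ValueError("y and alpha must have the same length")
    if any(a <= 0 for a in alpha):
        raise ValueError("all alpha_j must be positive")
    if any(v <= 0 for v in y) or abs(sum(y) - 1.0) > 1e-9:
        raise ValueError("y must lie in the interior of the unit simplex")
    alpha_plus = sum(alpha)
    # log of Gamma(alpha_+) / prod_j Gamma(alpha_j)
    log_norm = math.lgamma(alpha_plus) - sum(math.lgamma(a) for a in alpha)
    # log of prod_j y_j^(alpha_j - 1)
    log_kernel = sum((a - 1.0) * math.log(v) for a, v in zip(alpha, y))
    return log_norm + log_kernel

def dirichlet_mean(alpha):
    """Means E(Y_j) = alpha_j / alpha_+, i.e. the fitted values described above."""
    alpha_plus = sum(alpha)
    return [a / alpha_plus for a in alpha]
```

With all \(\alpha_j = 1\) the density is constant on the simplex, which gives a simple sanity check on the normalizing constant.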
The Dirichlet distribution can be motivated by considering random variables
\((G_1,\ldots,G_{M})^T\) that are mutually independent,
where \(G_j\) has a gamma distribution with shape \(\alpha_j\), unit scale, and density
\(f(g_j)=g_j^{\alpha_j - 1} e^{-g_j} / \Gamma(\alpha_j)\).
Then the Dirichlet distribution arises when
\(Y_j=G_j / (G_1 + \cdots + G_M)\).
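The gamma construction above also suggests a simple way to simulate Dirichlet vectors. The sketch below is illustrative only: it uses Python's standard `random.gammavariate(shape, scale)`, and the helper name `rdirichlet`, the parameter values, the seed, and the sample size are all arbitrary assumptions. It compares the sample means with \(\alpha_j / \alpha_{+}\).

```python
import random

def rdirichlet(alpha, rng):
    """Draw one Dirichlet vector by normalizing independent gamma variates."""
    # G_j ~ Gamma(shape = alpha_j, scale = 1), independently for each j
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(g)
    # Y_j = G_j / (G_1 + ... + G_M)
    return [v / total for v in g]

rng = random.Random(1)      # fixed seed so the run is reproducible
alpha = [2.0, 3.0, 5.0]     # illustrative shape parameters
n = 20000                   # illustrative number of draws

draws = [rdirichlet(alpha, rng) for _ in range(n)]
sample_means = [sum(d[j] for d in draws) / n for j in range(len(alpha))]
theoretical = [a / sum(alpha) for a in alpha]

print("sample means:     ", [round(m, 3) for m in sample_means])
print("alpha_j / alpha_+:", theoretical)
```

Each simulated row is positive and sums to one, so the rows of `draws` have the same form as the response matrix assumed in this help file.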