The Stick function performs truncated stick-breaking on the vector
\(\theta\). Stick-breaking, also called a stick-breaking process, is
often used to construct a Dirichlet process (Sethuraman, 1994). It is
commonly associated with infinite-dimensional mixtures, but in
practice the 'infinite' number of components is truncated to a finite
number, since it is impossible to estimate an infinite number of
parameters (Ishwaran and James, 2001).
Stick(theta)
This required argument, \(\theta\), is a vector of length \(M-1\), where \(M\) is the number of mixture components.
The Stick function returns a probability vector wherein each
element relates to a mixture component.
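The mapping from \(\theta\) to a probability vector can be sketched in base R as follows. This is a minimal illustration of the standard truncated stick-breaking construction, not the package's own source: the helper name `stick_sketch` is hypothetical, and the behavior of Stick itself should be checked against the package.

```r
# Hypothetical helper illustrating truncated stick-breaking;
# it is an assumption-based sketch, not the Stick() implementation.
stick_sketch <- function(theta) {
  stopifnot(is.numeric(theta), length(theta) >= 1,
            all(theta > 0), all(theta < 1))
  # Stick length remaining before each break:
  # 1, (1 - theta_1), (1 - theta_1)(1 - theta_2), ...
  remaining <- cumprod(c(1, 1 - theta))
  # The first M-1 pieces are theta_m times the remaining length;
  # the last piece is whatever is left after the final break.
  c(theta * remaining[seq_along(theta)], remaining[length(remaining)])
}

theta <- c(0.5, 0.5, 0.5)   # M - 1 = 3 break proportions
p <- stick_sketch(theta)    # M = 4 component probabilities
print(p)                    # 0.5 0.25 0.125 0.125
print(sum(p))               # 1
```

A vector of length \(M-1\) goes in and a vector of length \(M\) comes out, with the final element absorbing the unbroken remainder so that the result sums to one.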
The Dirichlet process (DP) is a stochastic process used in Bayesian nonparametric modeling, most commonly in DP mixture models, otherwise known as infinite mixture models. A DP is a distribution over distributions. Each draw from a DP is itself a discrete distribution. A DP is an infinite-dimensional generalization of Dirichlet distributions. It is called a DP because it has Dirichlet-distributed, finite-dimensional, marginal distributions, just as the Gaussian process has Gaussian-distributed, finite-dimensional, marginal distributions. Distributions drawn from a DP cannot be described using a finite number of parameters, thus the classification as a nonparametric model. The truncated stick-breaking (TSB) process is associated with a truncated Dirichlet process (TDP).
An example application of a TSB process is cluster analysis, where the
number of clusters is unknown and the clusters are treated as mixture
components. In such a model, the TSB process calculates probability
vector \(\pi\) from \(\theta\), given a user-specified maximum number
of clusters to explore, \(C\), where \(C\) is the length of
\(\theta\) plus one. Vector \(\pi\) is assigned a TSB prior
distribution (for more information, see dStick).
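Written out, the usual truncated stick-breaking map from \(\theta\) to \(\pi\) takes the form below. This is the standard construction, stated here as an assumption about what the function computes, not as a transcription of its source:

\[
\pi_1 = \theta_1, \qquad
\pi_c = \theta_c \prod_{j=1}^{c-1} (1 - \theta_j) \quad (c = 2, \dots, C-1), \qquad
\pi_C = \prod_{j=1}^{C-1} (1 - \theta_j)
\]

The last element \(\pi_C\) collects the part of the stick that remains after \(C-1\) breaks, which is why \(\pi\) sums to one and has one more element than \(\theta\).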
Elsewhere in the model, each element of \(\theta\) is constrained to the interval (0,1). In the original TSB form, each element is beta-distributed with the \(\alpha\) parameter of the beta distribution fixed at 1 (Ishwaran and James, 2001). The \(\beta\) hyperparameter of the beta distribution is usually gamma-distributed.
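As a rough illustration of this prior structure, the base R sketch below draws \(\beta\) from a gamma distribution, draws each \(\theta\) from a beta distribution with \(\alpha = 1\), and breaks the stick. The value of `C` and the gamma shape and rate are arbitrary choices made for the example, not recommendations.

```r
set.seed(1)

C <- 5                                    # illustrative maximum number of clusters
beta <- rgamma(1, shape = 2, rate = 1)    # arbitrary gamma hyperprior for this sketch
theta <- rbeta(C - 1, shape1 = 1, shape2 = beta)  # alpha fixed at 1

# Standard truncated stick-breaking, written inline (an assumption-based
# sketch, not the package's Stick() source).
remaining <- cumprod(c(1, 1 - theta))
p <- c(theta * remaining[1:(C - 1)], remaining[C])

print(round(p, 3))
print(sum(p))
```

Smaller values of \(\beta\) tend to push each \(\theta\) toward 1, which concentrates probability on the first few components; larger values spread it across more of them.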
A larger value for a given \(\theta_m\) is associated with a higher probability of the associated mixture component. However, the resulting proportion also depends on the position of the element in the \(\theta\) vector, because each break applies only to the part of the stick left over by the preceding elements.
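The effect of position can be seen by moving the same value to a different place in the vector. The sketch below uses a hypothetical helper, `break_stick`, that follows the standard truncated construction and is not the package's implementation.

```r
# Hypothetical helper: standard truncated stick-breaking (a sketch only).
break_stick <- function(theta) {
  remaining <- cumprod(c(1, 1 - theta))
  c(theta * remaining[seq_along(theta)], remaining[length(remaining)])
}

# The value 0.8 appears second in theta_a and first in theta_b.
theta_a <- c(0.2, 0.8, 0.5)
theta_b <- c(0.8, 0.2, 0.5)

p_a <- break_stick(theta_a)   # approximately 0.20 0.64 0.08 0.08
p_b <- break_stick(theta_b)   # approximately 0.80 0.04 0.08 0.08

print(p_a)
print(p_b)
```

In the first position, 0.8 claims 0.8 of the whole stick. In the second position it claims 0.8 of the 0.8 that remains, or about 0.64.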
A variety of stick-breaking processes exist. For example, rather than each element of \(\theta\) being beta-distributed, other forms have been introduced, such as logistic and probit, among others.
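One common reading of the logistic and probit forms is that each break proportion is obtained by passing an unconstrained real value through the logistic or standard normal distribution function, respectively. The sketch below assumes that reading. The vector `eta` and the helper `break_stick` are hypothetical names introduced only for illustration.

```r
set.seed(42)

# Hypothetical helper: standard truncated stick-breaking (a sketch only).
break_stick <- function(theta) {
  remaining <- cumprod(c(1, 1 - theta))
  c(theta * remaining[seq_along(theta)], remaining[length(remaining)])
}

eta <- rnorm(4)                # hypothetical unconstrained parameters

theta_logistic <- plogis(eta)  # logistic link maps the reals to (0, 1)
theta_probit   <- pnorm(eta)   # probit link maps the reals to (0, 1)

p_logistic <- break_stick(theta_logistic)
p_probit   <- break_stick(theta_probit)

print(round(p_logistic, 3))
print(round(p_probit, 3))
```

Either link yields break proportions in (0,1), so the same stick-breaking step applies unchanged; only the distribution implied for each \(\theta\) differs.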
Ishwaran, H. and James, L. (2001). "Gibbs Sampling Methods for Stick-Breaking Priors". Journal of the American Statistical Association, 96(453), pp. 161-173.
Sethuraman, J. (1994). "A Constructive Definition of Dirichlet Priors". Statistica Sinica, 4, pp. 639-650.
ddirichlet, dmvpolya, and dStick.