Posterior sampling and Bayesian model selection to choose the number of components k in multivariate Normal mixtures.
bfnormmix computes posterior probabilities under non-local MOM-IW-Dir(q) priors, and also under local Normal-IW-Dir(q.niw) priors. It also computes posterior probabilities of cluster occupancy and returns posterior samples of the model parameters for each value of k considered.
bfnormmix(x, k=1:2, mu0=rep(0,ncol(x)), g, nu0, S0, q=3, q.niw=1,
B=10^4, burnin= round(B/10), logscale=TRUE, returndraws=TRUE, verbose=TRUE)
A list with elements
Number of components
Posterior probability of k components under a MOM-IW-Dir(q) prior
Posterior probability of k components under a Normal-IW-Dir(q.niw) prior
Posterior probability that any one cluster is empty under a Normal-IW-Dir(q.niw) prior
Bayes factor comparing 1 vs k components under a MOM-IW-Dir(q) prior
log of the posterior mean of the MOM-IW-Dir(q) penalty term
Bayes factor comparing 1 vs k components under a Normal-IW-Dir(q.niw) prior
x: n x p input data matrix
k: Number of components
mu0: Prior on mu[j] is N(mu0,g Sigma[j])
g: Prior on mu[j] is N(mu0,g Sigma[j]). This is a critical MOM-IW prior parameter that specifies the separation between components deemed practically relevant. It defaults to assigning 0.95 prior probability to any pair of mu's giving a bimodal mixture, see details
nu0: Prior on Sigma[j] is IW(Sigma[j]; nu0, S0)
S0: Prior on Sigma[j] is IW(Sigma[j]; nu0, S0)
q: Prior parameter in MOM-IW-Dir(q) prior
q.niw: Prior parameter in Normal-IW-Dir(q.niw) prior
B: Number of MCMC iterations
burnin: Number of burn-in iterations
logscale: If set to TRUE then log-Bayes factors are returned
returndraws: If set to TRUE the MCMC posterior draws under the Normal-IW-Dir prior are returned for all k
verbose: Set to TRUE to print iteration progress
David Rossell
The likelihood is
p(x[i,] | mu,Sigma,eta)= sum_j eta_j N(x[i,]; mu_j,Sigma_j)
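As an illustration of the likelihood above, the following base-R sketch evaluates the mixture log-likelihood for given parameters. The helper names dmvnorm_log and mixture_loglik are hypothetical and are not part of the package; the package's internal computation may differ.

```r
# Hypothetical helper: log-density of N(x; mu, Sigma) via the Cholesky factor
dmvnorm_log <- function(x, mu, Sigma) {
  p <- length(mu)
  R <- chol(Sigma)                               # Sigma = t(R) %*% R
  z <- backsolve(R, x - mu, transpose = TRUE)    # solves t(R) z = x - mu
  -0.5 * sum(z^2) - sum(log(diag(R))) - 0.5 * p * log(2 * pi)
}

# Hypothetical helper: sum_i log( sum_j eta_j N(x[i,]; mu_j, Sigma_j) )
# mu and Sigma are lists with one element per component
mixture_loglik <- function(x, eta, mu, Sigma) {
  k <- length(eta)
  ll <- 0
  for (i in seq_len(nrow(x))) {
    logterms <- vapply(seq_len(k), function(j) {
      log(eta[j]) + dmvnorm_log(x[i, ], mu[[j]], Sigma[[j]])
    }, numeric(1))
    m <- max(logterms)                           # log-sum-exp for stability
    ll <- ll + m + log(sum(exp(logterms - m)))
  }
  ll
}

set.seed(1)
x <- matrix(rnorm(100 * 2), ncol = 2)
# One standard Normal component
ll1 <- mixture_loglik(x, eta = 1, mu = list(c(0, 0)), Sigma = list(diag(2)))
# Two identical components with equal weights give the same likelihood
ll2 <- mixture_loglik(x, eta = c(0.5, 0.5),
                      mu = list(c(0, 0), c(0, 0)),
                      Sigma = list(diag(2), diag(2)))
```

The second call shows why k is hard to identify from the likelihood alone: a two-component mixture with coinciding components reproduces the one-component fit exactly.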
The Normal-IW-Dir prior is
Dir(eta; q.niw) prod_j N(mu_j; mu0, g Sigma_j) IW(Sigma_j; nu0, S0)
The MOM-IW-Dir prior is
$$d(\mu,A) Dir(\eta; q) \prod_j N(\mu_j; \mu_0, g \Sigma_j) IW(\Sigma_j; \nu_0, S_0)$$
where
$$d(\mu,A)= [\prod_{j<l} (\mu_j-\mu_l)' A (\mu_j-\mu_l)]$$
and A is the average of \(\Sigma_1^{-1},...,\Sigma_k^{-1}\). Note that one must have q>1 for the MOM-IW-Dir to define a non-local prior.
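The penalty term d(mu, A) can be computed directly from its definition. The sketch below is a minimal base-R version; the function name mom_penalty is hypothetical, and returning 1 for k=1 (an empty product) is an assumption rather than something the text states.

```r
# Hypothetical helper: d(mu, A) = prod_{j<l} (mu_j - mu_l)' A (mu_j - mu_l),
# where A is the average of the inverse covariances.
# mu and Sigma are lists with one element per component.
mom_penalty <- function(mu, Sigma) {
  k <- length(mu)
  if (k < 2) return(1)                        # assumption: empty product
  A <- Reduce(`+`, lapply(Sigma, solve)) / k  # average of Sigma_j^{-1}
  d <- 1
  for (j in 1:(k - 1)) {
    for (l in (j + 1):k) {
      delta <- mu[[j]] - mu[[l]]
      d <- d * drop(t(delta) %*% A %*% delta)
    }
  }
  d
}

I2 <- diag(2)
d_sep  <- mom_penalty(list(c(0, 0), c(2, 0)), list(I2, I2))   # separated means
d_same <- mom_penalty(list(c(0, 0), c(0, 0)), list(I2, I2))   # coinciding means
d_three <- mom_penalty(list(c(0, 0), c(2, 0), c(0, 1)), list(I2, I2, I2))
```

The penalty is 0 whenever two component means coincide, which is how the MOM-IW-Dir prior assigns vanishing density to mixtures with redundant components.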
By default the prior parameter g is set such that
P( (mu[j]-mu[l])' A (mu[j]-mu[l]) < 4)= 0.05.
The rationale is that when Sigma[j]=Sigma[l] and eta[j]=eta[l], (mu[j]-mu[l])' A (mu[j]-mu[l])>4 corresponds to a bimodal density. That is, the default g places 0.95 prior probability on a degree of separation between components that gives rise to a bimodal mixture density.
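The bimodality threshold can be checked numerically in the simplest setting. The sketch below assumes a univariate, equal-weight mixture of two Normals with common standard deviation sigma, so that A = 1/sigma^2 and the quadratic form reduces to (delta/sigma)^2, where delta is the distance between the means. The function count_modes is a hypothetical helper that counts local maxima of the density on a grid.

```r
# Hypothetical helper: number of local maxima of the density
# 0.5 N(0, sigma^2) + 0.5 N(delta, sigma^2), evaluated on a fine grid
count_modes <- function(delta, sigma = 1, ngrid = 20001) {
  grid <- seq(-5 * sigma, delta + 5 * sigma, length.out = ngrid)
  dens <- 0.5 * dnorm(grid, 0, sigma) + 0.5 * dnorm(grid, delta, sigma)
  s <- sign(diff(dens))
  s <- s[s != 0]            # ignore exact ties between neighbouring points
  sum(diff(s) < 0)          # each change from increasing to decreasing is a mode
}

modes_close <- count_modes(delta = 1)  # (delta/sigma)^2 = 1 < 4
modes_far   <- count_modes(delta = 3)  # (delta/sigma)^2 = 9 > 4
```

With these settings the density has a single mode at delta = 1 and two modes at delta = 3, in line with the threshold of 4 on the quadratic form.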
bfnormmix computes posterior model probabilities under the MOM-IW-Dir and Normal-IW-Dir priors using MCMC output. As described in Fuquene, Steel and Rossell (2018), the estimate is based on the posterior probability that one cluster is empty under each possible k.
Fuquene J., Steel M.F.J., Rossell D. (2018). On choosing mixture components via non-local priors. arXiv.
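library(mombf)
set.seed(1)  # make the simulated data reproducible
x <- matrix(rnorm(100*2),ncol=2)  # 100 observations, 2 variables, a single component
bfnormmix(x=x,k=1:3)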