allocationSamplerBinMix: The allocation sampler algorithm.

Description

This function implements the collapsed allocation sampler of Nobile and Fearnside (2007) at the context of mixtures of multivariate Bernoulli distributions.

Usage

allocationSamplerBinMix(Kmax, alpha, beta, gamma, m, burn, data, 
thinning, z.true, ClusterPrior, ejectionAlpha, Kstart, outputDir, 
metropolisMoves, reorderModels, heat, zStart, LS, rsX, originalX, printProgress)

Arguments

Kmax

Maximum number of clusters (integer, at least equal to two).

alpha

First shape parameter of the Beta prior distribution (strictly positive). Defaults to 1.

beta

Second shape parameter of the Beta prior distribution (strictly positive). Defaults to 1.

gamma

Kmax-dimensional vector (positive) corresponding to the parameters of the Dirichlet prior of the mixture weights. Default value: rep(1,Kmax).

Number of MCMC iterations.

burn

The number of initial MCMC iterations that will be discarded as burn-in period.

data

Binary data array (NAs not allowed here).

thinning

Integer that defines a thinning of the reported MCMC sample. Under the default setting, every 5th MCMC iteration is saved.

z.true

An optional vector of cluster assignments considered as the ground-truth clustering of the observations. Useful for simulations.

ClusterPrior

Character string specifying the prior distribution of the number of clusters on the set \(\{1,\ldots,K_{max}\}\). Available options: poisson or uniform. It defaults to the truncated Poisson distribution.

ejectionAlpha

Probability of ejecting an empty component. Defaults to 0.2.

Kstart

Initial value for the number of clusters. Defaults to 1.

outputDir

The name of the produced output folder.

metropolisMoves

A vector of character strings with possible values M1, M2, M3, M4. Each entry specifies Metropolis-Hastings type moves on the latent allocation variables.

reorderModels

Character string specifying whether to post-process the MCMC sample of each distinct generated value of K. The default setting is onlyMAP and in this case only the part of the MCMC sample that corresponds to the most probable number of clusters is reordered.

heat

The temperature of the simulated chain, that is, a scalar in the set \((0,1]\).

zStart

\(n\)-dimensional integer vector of latent allocations to initialize the sampler.

Boolean indicating whether to post-process the MCMC sample using the label switching algorithms.

rsX

Optional vector containing the row-sums of the observed data (NAs are allowed). It is required only in the case of missing values.

originalX

Optional array containing the observed data (containing NAs). It is required only in the case of missing values.

printProgress

Logical, indicating whether to print the progress of the sampler or not. Default: FALSE.

Details

The output is reordered according to the following label-switching solving algorithms: ECR, ECR-ITERATIVE-1 and STEPHENS. In most cases the results of these different algorithms are identical.

References

Nobile A and Fearnside A (2007): Bayesian finite mixtures with an unknown number of components: The allocation sampler. Statistics and Computing, 17(2): 147-162.

Papastamoulis P. and Iliopoulos G. (2010). An artificial allocations based solution to the label switching problem in Bayesian analysis of mixtures of distributions. Journal of Computational and Graphical Statistics, 19: 313-331.

Papastamoulis P. and Iliopoulos G. (2013). On the convergence rate of Random Permutation Sampler and ECR algorithm in missing data models. Methodology and Computing in Applied Probability, 15(2): 293-304.

Papastamoulis P. (2014). Handling the label switching problem in latent class models via the ECR algorithm. Communications in Statistics, Simulation and Computation, 43(4): 913-927.

Papastamoulis P (2016): label.switching: An R package for dealing with the label switching problem in MCMC outputs. Journal of Statistical Software, 69(1): 1-24.

Description

Usage

Arguments

Details

References

See Also