The PLMIX package for R provides functions to fit and analyze finite mixtures of Plackett-Luce models for partial top rankings/orderings within the Bayesian framework. It provides MAP point estimates via the EM algorithm and posterior MCMC simulations via Gibbs sampling. Maximum likelihood estimation (MLE) is available as a special case of the noninformative Bayesian analysis with vague priors.
In addition to inferential techniques, the package supports the other fundamental phases of a model-based analysis of partial rankings/orderings, with functions for data manipulation, simulation, descriptive summary, model selection and goodness-of-fit evaluation.
Specific S3 classes and methods are also supplied to enhance the usability and foster exchange with other packages. Finally, to address the issue of computationally demanding procedures typical in ranking data analysis, PLMIX takes advantage of a hybrid code linking the R environment with the C++ programming language.
The Plackett-Luce model is one of the most popular and frequently applied parametric distributions for analyzing partial top rankings/orderings of a finite set of items. The package accounts for unobserved sample heterogeneity in partially ranked data through a model-based analysis relying on Bayesian finite mixtures of Plackett-Luce models. It provides a suite of functions covering the fundamental phases of a model-based analysis:
Ranking data manipulation
binary_group_ind
Binary group membership matrix from the mixture component labels.
freq_to_unit
From the frequency distribution to the dataset of individual orderings/rankings.
make_complete
Random completion of partial orderings/rankings data.
make_partial
Censoring of complete orderings/rankings data.
rank_ord_switch
From rankings to orderings and vice-versa.
unit_to_freq
From the dataset of individual orderings/rankings to the frequency distribution.
Ranking data simulation
rPLMIX
Random sample from a finite mixture of Plackett-Luce models.
Ranking data description
paired_comparisons
Paired comparison frequencies.
rank_summaries
Summary statistics of partial ranking/ordering data.
Model estimation
gibbsPLMIX
Bayesian analysis with MCMC posterior simulation via Gibbs sampling.
label_switchPLMIX
Label switching adjustment of the Gibbs sampling simulations.
likPLMIX
Likelihood evaluation for a mixture of Plackett-Luce models.
loglikPLMIX
Log-likelihood evaluation for a mixture of Plackett-Luce models.
mapPLMIX
MAP estimation via EM algorithm.
mapPLMIX_multistart
MAP estimation via EM algorithm with multiple starting values.
Class coercion and membership
as.top_ordering
Coercion into top-ordering datasets.
gsPLMIX_to_mcmc
From the Gibbs sampling simulation to an MCMC class object.
is.top_ordering
Test for the consistency of input data with a top-ordering dataset.
S3 class methods
plot.gsPLMIX
Plot of the Gibbs sampling simulations.
plot.mpPLMIX
Plot of the MAP estimates.
print.gsPLMIX
Print of the Gibbs sampling simulations.
print.mpPLMIX
Print of the MAP estimation algorithm.
summary.gsPLMIX
Summary of the Gibbs sampling procedure.
summary.mpPLMIX
Summary of the MAP estimation.
Model selection
bicPLMIX
BIC value for the MLE of a mixture of Plackett-Luce models.
selectPLMIX
Bayesian model selection criteria.
Model assessment
ppcheckPLMIX
Posterior predictive diagnostics.
ppcheckPLMIX_cond
Posterior predictive diagnostics conditionally on the number of ranked items.
Datasets
d_apa
American Psychological Association Data (partial orderings).
d_carconf
Car Configurator Data (partial orderings).
d_dublinwest
Dublin West Data (partial orderings).
d_gaming
Gaming Platforms Data (complete orderings).
d_german
German Sample Data (complete orderings).
d_nascar
NASCAR Data (partial orderings).
d_occup
Occupation Data (complete orderings).
d_rice
Rice Voting Data (partial orderings).
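For orientation, the model that the functions listed above work with can be written down in a few lines of base R. Under the Plackett-Luce model a top ordering is built sequentially: at each stage the next item is chosen with probability proportional to its support parameter among the items not yet selected. The sketch below is not PLMIX code and does not call the package; the function names `pl_prob` and `plmix_loglik` and the toy parameter values are hypothetical. It assumes orderings are stored one per row, with the item ranked first in the first column and zeros padding the unranked positions.

```r
# Plackett-Luce probability of one (possibly partial) top ordering.
# ordering: vector of item labels, best first, zeros for unranked positions.
# p:        positive support parameters, one per item.
pl_prob <- function(ordering, p) {
  ranked <- ordering[ordering > 0]   # drop the zero padding
  chosen <- p[ranked]                # support of each selected item, in order
  # Support still available at each stage: total minus what was already picked.
  remaining <- sum(p) - c(0, cumsum(chosen)[-length(chosen)])
  prod(chosen / remaining)
}

# Log-likelihood of a G-component mixture of Plackett-Luce models.
# orderings: N x K matrix of top orderings.
# p:         G x K matrix of component-specific support parameters.
# weights:   mixture weights of length G, summing to one.
plmix_loglik <- function(orderings, p, weights) {
  lik_by_unit <- apply(orderings, 1, function(o) {
    sum(weights * apply(p, 1, function(pg) pl_prob(o, pg)))
  })
  sum(log(lik_by_unit))
}

# Toy values (assumptions for illustration only): K = 3 items, G = 2 components.
p <- rbind(c(0.5, 0.3, 0.2),
           c(0.1, 0.2, 0.7))
weights <- c(0.6, 0.4)
orderings <- rbind(c(1, 2, 3),   # complete ordering
                   c(3, 0, 0),   # top-1 ordering
                   c(2, 1, 0))   # top-2 ordering

# Complete ordering under component 1: 0.5/1 * 0.3/0.5 * 0.2/0.2 = 0.3
prob_first <- pl_prob(orderings[1, ], p[1, ])
loglik <- plmix_loglik(orderings, p, weights)
```

For actual analyses, likPLMIX and loglikPLMIX from the list above are the package's own likelihood evaluators; this sketch only illustrates the quantity they are described as computing.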
Data have to be supplied as an object of class matrix, where missing positions/items are denoted with zero entries and Rank = 1 indicates the most-liked alternative. For a more efficient implementation of the methods, partial sequences with a single missing entry should be filled in beforehand, as they correspond to complete rankings/orderings. In the present setting, ties are not allowed. Some quantities frequently recalled in the manual are the following:
N
Sample size.
K
Number of possible items.
G
Number of mixture components.
L
Size of the final posterior MCMC sample (after burn-in phase).
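The input conventions described above can be illustrated with a small base R sketch. It is only an illustration under stated assumptions: the toy matrix and the helper names `fill_single_missing` and `ranking_to_ordering` are hypothetical and not part of PLMIX. The data are assumed to be in ranking format, where entry [i, j] is the rank that unit i assigns to item j and 0 marks an item left unranked.

```r
# Hypothetical toy data in ranking format: 4 sample units, 4 items.
rankings <- rbind(
  c(1, 2, 3, 4),   # complete ranking
  c(2, 0, 1, 0),   # top-2: item 3 is ranked first, item 1 second
  c(0, 1, 0, 0),   # top-1: only item 2 is ranked
  c(3, 1, 2, 0)    # single missing entry: item 4 can only hold rank 4
)
N <- nrow(rankings)   # sample size
K <- ncol(rankings)   # number of possible items

# Rows with exactly one zero are complete rankings in disguise:
# the missing entry must be the one rank that has not been used yet.
fill_single_missing <- function(rankings) {
  all_ranks <- seq_len(ncol(rankings))
  for (i in seq_len(nrow(rankings))) {
    missing <- which(rankings[i, ] == 0)
    if (length(missing) == 1) {
      rankings[i, missing] <- setdiff(all_ranks, rankings[i, ])
    }
  }
  rankings
}

# Convert one ranking (rank of each item) into an ordering
# (item holding each position), keeping zeros for unfilled positions.
ranking_to_ordering <- function(ranking) {
  ordering <- integer(length(ranking))
  ranked <- which(ranking > 0)
  ordering[ranking[ranked]] <- ranked
  ordering
}

filled <- fill_single_missing(rankings)
orderings <- t(apply(filled, 1, ranking_to_ordering))
```

In practice the package's rank_ord_switch function is the one listed for switching between the two formats; the helpers here only make the zero-entry and Rank = 1 conventions concrete.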
Mollica, C. and Tardella, L. (2017). Bayesian Plackett-Luce mixture models for partially ranked data. Psychometrika, 82(2), pages 442--458, ISSN: 0033-3123, http://dx.doi.org/10.1007/s11336-016-9530-0.
Mollica, C. and Tardella, L. (2014). Epitope profiling via mixture modeling for ranked data. Statistics in Medicine, 33(21), pages 3738--3758, ISSN: 0277-6715, http://onlinelibrary.wiley.com/doi/10.1002/sim.6224/full.