MSmix (version 2.0.0)

bicMSmix: BIC and AIC for mixtures of Mallows models with Spearman distance

Description

bicMSmix and aicMSmix compute the Bayesian Information Criterion (BIC) and the Akaike Information Criterion (AIC), respectively, for a mixture of Mallows models with Spearman distance fitted to partial rankings.

Usage

bicMSmix(rho, theta, weights, rankings)

aicMSmix(rho, theta, weights, rankings)

Value

The BIC or AIC value.
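
For reference, the two criteria follow the standard definitions below, written for the log-likelihood \(\ell\) evaluated at the supplied rho, theta and weights on the \(N\) observed rankings. The symbol \(k\), the number of free parameters of the \(G\)-component mixture, is notation introduced for this note; how the package counts it is not stated on this page.

\[
\mathrm{BIC} = -2\,\ell + k \log N, \qquad \mathrm{AIC} = -2\,\ell + 2k .
\]

Under both criteria, lower values indicate a better trade-off between goodness of fit and model complexity.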

Arguments

rho

Integer \(G\)\(\times\)\(n\) matrix with the component-specific consensus rankings in each row.

theta

Numeric vector of \(G\) non-negative component-specific precision parameters.

weights

Numeric vector of \(G\) positive mixture weights (normalization is not necessary).

rankings

Integer \(N\)\(\times\)\(n\) matrix or data frame with partial rankings in each row. Missing positions must be coded as NA.

Details

The (log-)likelihood is evaluated by augmenting each partial ranking with the set of all full rankings compatible with it (see data_augmentation) and then computing the marginal likelihood.
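
To make the idea of "compatible full rankings" concrete, here is a minimal base-R sketch that enumerates them for a single partial ranking. It is a conceptual illustration only, not the package's data_augmentation implementation; the helper names all_perms and compatible_full_rankings are hypothetical.

```r
# All permutations of a vector (simple recursive helper; fine for short vectors).
all_perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in all_perms(v[-i])) {
      out[[length(out) + 1]] <- c(v[i], p)
    }
  }
  out
}

# Enumerate the full rankings compatible with one partial ranking,
# where missing positions are coded as NA.
compatible_full_rankings <- function(partial) {
  n <- length(partial)
  missing_pos <- which(is.na(partial))
  # Ranks not yet assigned to any item.
  missing_ranks <- setdiff(seq_len(n), partial[!is.na(partial)])
  fills <- all_perms(missing_ranks)
  # One row per compatible full ranking.
  t(vapply(fills, function(f) {
    full <- partial
    full[missing_pos] <- f
    as.integer(full)
  }, integer(n)))
}

# Top-2 ranking of 4 items: items 1 and 3 hold ranks 1 and 2.
compatible_full_rankings(c(1L, NA, 2L, NA))
```

With \(m\) missing positions there are \(m!\) compatible full rankings, so the marginal likelihood of a partial ranking sums the full-ranking likelihood over a set that grows quickly with the amount of censoring.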

When \(n\leq 20\), the (log-)likelihood is computed exactly; otherwise, it is approximated with the method introduced by Crispino et al. (2023). If \(n>170\), the approximation is also restricted to a fixed grid of values of the Spearman distance to limit the computational burden.

References

Crispino M, Mollica C and Modugno L (2025+). MSmix: An R Package for clustering partial rankings via mixtures of Mallows Models with Spearman distance. (submitted)

Crispino M, Mollica C, Astuti V and Tardella L (2023). Efficient and accurate inference for mixtures of Mallows models with Spearman distance. Statistics and Computing, 33(98), DOI: 10.1007/s11222-023-10266-8.

Schwarz G (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2), pages 461–464, DOI: 10.1214/aos/1176344136.

Sakamoto Y, Ishiguro M, and Kitagawa G (1986). Akaike Information Criterion Statistics. Dordrecht, The Netherlands: D. Reidel Publishing Company.

See Also

likMSmix, data_augmentation

Examples

## Example 1. Simulate rankings from a 2-component mixture of Mallows models
## with Spearman distance.
set.seed(12345)
rank_sim <- rMSmix(sample_size = 50, n_items = 12, n_clust = 2)
str(rank_sim)
rankings <- rank_sim$samples
# Fit the true model.
set.seed(12345)
fit <- fitMSmix(rankings = rankings, n_clust = 2, n_start = 10)
# Compare the BIC and AIC at the true parameter values and at the MLE.
bicMSmix(rho = rank_sim$rho, theta = rank_sim$theta, weights = rank_sim$weights,
       rankings = rank_sim$samples)
bicMSmix(rho = fit$mod$rho, theta = fit$mod$theta, weights = fit$mod$weights,
       rankings = rank_sim$samples)
aicMSmix(rho = rank_sim$rho, theta = rank_sim$theta, weights = rank_sim$weights,
       rankings = rank_sim$samples)
aicMSmix(rho = fit$mod$rho, theta = fit$mod$theta, weights = fit$mod$weights,
       rankings = rank_sim$samples)


## Example 2. Simulate rankings from a basic Mallows model with Spearman distance.
set.seed(54321)
rank_sim <- rMSmix(sample_size = 50, n_items = 8, n_clust = 1)
str(rank_sim)
# Let us censor the observations to be top-5 rankings.
rank_sim$samples[rank_sim$samples > 5] <- NA
rankings <- rank_sim$samples
# Fit the true model with the two EM algorithms.
set.seed(54321)
fit_em <- fitMSmix(rankings = rankings, n_clust = 1, n_start = 10)
set.seed(54321)
fit_mcem <- fitMSmix(rankings = rankings, n_clust = 1, n_start = 10, mc_em = TRUE)
# Compare the BIC and AIC at the true parameter values and at the MLEs.
bicMSmix(rho = rank_sim$rho, theta = rank_sim$theta, weights = rank_sim$weights,
       rankings = rank_sim$samples)
bicMSmix(rho = fit_em$mod$rho, theta = fit_em$mod$theta, weights = fit_em$mod$weights,
       rankings = rank_sim$samples)
bicMSmix(rho = fit_mcem$mod$rho, theta = fit_mcem$mod$theta, weights = fit_mcem$mod$weights,
       rankings = rank_sim$samples)
aicMSmix(rho = rank_sim$rho, theta = rank_sim$theta, weights = rank_sim$weights,
       rankings = rank_sim$samples)
aicMSmix(rho = fit_em$mod$rho, theta = fit_em$mod$theta, weights = fit_em$mod$weights,
       rankings = rank_sim$samples)
aicMSmix(rho = fit_mcem$mod$rho, theta = fit_mcem$mod$theta, weights = fit_mcem$mod$weights,
       rankings = rank_sim$samples)