fit_incremental_angmix: Stepwise fitting of angular mixture models with incremental component sizes and optimum model selection

Description

Stepwise fitting of angular mixture models with incremental component sizes and optimum model selection

Usage

fit_incremental_angmix(
  model,
  data,
  crit = "LOOIC",
  start_ncomp = 1,
  max_ncomp = 10,
  L = NULL,
  fn = mean,
  fix_label = NULL,
  form = 2,
  start_par = NULL,
  prev_par = TRUE,
  logml_maxiter = 10000,
  return_all = FALSE,
  save_fits = FALSE,
  save_file = NULL,
  save_dir = "",
  silent = FALSE,
  return_llik_contri = (crit %in% c("LOOIC", "WAIC")),
  use_best_chain = TRUE,
  alpha = 0.05,
  bonferroni_alpha = TRUE,
  bonferroni_adj_type = "decreasing",
  ...
)

Value

Returns a named list (with class = stepfit) with the following seven elements:

fit.all (if return_all = TRUE) - a list all angmcmc objects created at each component size;

fit.best - angmcmc object corresponding to the optimum component size;

ncomp.best - optimum component size (integer);

crit - which model comparison criterion used (one of "LOOIC", "WAIC", "AIC", "BIC", "DIC" or "LOGML");

crit.all - all crit values calculated (for all component sizes);

crit.best - crit value for the optimum component size; and

maxllik.all - maximum (obtained from MCMC iterations) log likelihood for all fitted models

maxllik.best - maximum log likelihodd for the optimal model; and

check_min - logical; is the optimum component size less than max_ncomp?

Arguments

model: angular model whose mixtures are to be fitted. Available choices are "vmsin", "vmcos" and "wnorm2" for bivariate data, and "vm" and "wnorm" for univariate data.
data: data matrix (if bivarate, in which case it must have two columns) or vector. If outside, the values are transformed into the scale \([0, 2\pi)\). *Note:* BAMBI cannot handle missing data. Missing values must either be removed or properly imputed.
crit: model selection criteria, one of "LOOIC", "WAIC", "AIC", "BIC", "DIC" or "LOGML". Default is "LOOIC".
start_ncomp: starting component size. A single component model is fitted if start_ncomp is equal to one.
max_ncomp: maximum number of components allowed in the mixture model.
L: HMC tuning parameter (trajectory length) passed to fit_angmix. Can be a numeric vetor (or scalar), in which case the same L is passed to all fit_angmix calls, or can be a list of length max_ncomp-start_ncomp+1, so that L_list[[i]] is passed as the argument L to fit_angmix call with ncomp = max_ncomp+i-1. See fit_angmix for more details on L including its default values. Ignored if method = "rwmh".
fn: function to evaluate on MCMC samples to estimate parameters. Defaults to mean, which computes the estimated posterior means. If fn = max, then MAP estimate is calculated from the MCMC run. Used only if crit = "DIC", and ignored otherwise.
fix_label: logical. Should the label switchings on the current fit (only the corresponding "best chain" if use_best_chain = TRUE) be fixed before computing parameter estimates and model selection criterion? Defaults to TRUE if perm_sampling is true in the fit_angmix call, or if crit = "DIC" and form = 1.
form: form of crit to be used. Available choices are 1 and 2. Used only if crit is "DIC" and ignored otherwise.
start_par: list with elements pmix (ignored if comp == 1), together with kappa1, kappa2, mu1 and mu2, for bivariate models, and kappa and mu for univariate models, all being vectors of length same as ncomp. These provides the starting values for the Markov chain; with \(j\)-th component of each vector corresponding to the \(j\)-th component of the mixture distribution. If missing, the data is first clustered into ncomp groups either via k-means (after projecting onto a unit sphere), or randomly, depending on rand_start, and then moment estimators for components are used as the starting points. Note that a very wrong starting point can potentially lead the chain to get stuck at a wrong solution for thousands of iterations. As such, we recommend using the default option, which is k-means followed by moment estimation.
prev_par: logical. Should the MAP estimated parameters from the model with ncomp = K be used in the model with ncomp = K+1 as the starting parameters, with the component with largest mixing proportion appearing twice in the bigger model?
logml_maxiter: maximum number of iterations (maxiter) passed to bridge_sampler for calculating LOGML. Ignored if crit is not LOGML.
return_all: logical. Should all angmcmc objects obtained during step-wise run be returned? *Warning*: depending on the sizes of n.iter, start_ncomp, max_ncomp and n.chains, this can be very memory intesive. In such cases, it is recommended that return_all be set to FALSE, and, if required, the intermediate fitted objects be saved to file by setting save_fits = TRUE.
save_fits: logical. Should the intermediate angmcmc objects obtained during step-wise run be saved to file using save? Defaults to TRUE. See save_file and save_dir.
save_file, save_dir: save_file is a list of size max_ncomp-start_ncomp+1, with k-th entry providing the file argument used to save the intermediate angmcmc object with ncomp = k (titled "fit_angmcmc"). If not provided, then k-th element of save_file[[k]] is taken to be paste(save_dir, "comp_k", sep="/"). Both are ignored if save_fits = FALSE.
silent: logical. Should the current status (such as what is the current component labels, which job is being done etc.) be printed? Defaults to TRUE.
return_llik_contri: passed to fit_angmix. By default, set to TRUE if crit is either "LOOIC" or "WAIC", and to FALSE otherwise.
use_best_chain: logical. Should only the "best" chain obtained during each intermediate fit be used during computation of model selection criterion? Here "best" means the chain with largest (mean over iterations) log-posterior density. This can be helpful if one of the chains gets stuck at local optima. Defaults to TRUE.
alpha: significance level used in the test H_0K: expected log predictive density (elpd) for the fitted model with K components >= elpd for the fitted model with K + 1 components if crit is "LOOIC" or "WAIC". Must be a scalar between 0 and 1. Defaults to 0.05. See Details. Ignored for any other crit.
bonferroni_alpha: logical. Should a Bonferroni correction be made on the test size alpha to adjust for multiplicity due to (max_ncomp - start_ncomp) possible hypothesis tests? Defaults to TRUE. Relevant only if crit is in c("LOOIC", "WAIC"), and ignored otherwise. See Details.
bonferroni_adj_type: character string. Denoting type of Bonferroni adjustment to make. Possible choices are "decreasing" (default) and "equal". Ignored if either bonferroni_alpha is FALSE, or crit is outside c("LOOIC", "WAIC"). See Details.
...: additional arguments passed to fit_angmix.

Details

The goal is to fit an angular mixture model with an optimally chosen component size K. To obtain an optimum K, mixture models with incremental component sizes between start_ncomp and max_ncomp are fitted incrementally using fit_angmix, starting from K = 1. If the model selection criterion crit is "LOOIC" or "WAIC", then a test of hypothesis H_0K: expected log predictive density (elpd) for the fitted model with K components >= elpd for the fitted model with K + 1 components, is performed at every K >= 1. The test-statistic used for the test is an approximate z-score based on the normalized estimated elpd difference between the two models obtained from compare, which provides estimated elpd difference along with its standard error estimate. Because the computed standard error of elpd difference can be overly optimistic when the elpd difference is small (in particular < 4), a conservative worst-case estimate (equal to twice of the computed standard error) is used in such cases. To account for multiplicity among the M = (max_ncomp - start_ncomp) possible sequential tests performed, by default a Bonferroni adjustment to the test level alpha is made. Set bonferroni_alpha = FALSE to remove the adjustment. To encourage parsimony in the final model, by default (bonferroni_adj_type = "decreasing") a decreasing sequence of adjusted alphas of the form alpha * (0.5)^(1:M) / sum((0.5)^(1:M)) is used. Set bonferroni_adj_type = "equal" to use equal sequence of adjusted alphas (i.e., alpha/M) instead.

The incremental fitting stops if H_0K cannot be rejected (at level alpha) for some K >= 1; this K is then regarded as the optimum number of components. If crit is not "LOOIC" or "WAIC" then mixture model with the first minimum value of the model selection criterion crit is taken as the best model.

Note that in each intermediate fitted model, the total number of components (instead of the number of "non-empty components") in the model is used to estimate of the true component size, and then the fitted model is penalized for model complexity (via the model selection criterion used). This approach of selecting an optimal K follows the perspective "let two component specific parameters be identical" for overfitting mixtures, and as such the Dirichlet prior hyper-parameters pmix.alpha (passed to fit_angmix) should be large. See Fruhwirth-Schnatter (2011) for more deltails.

Note that the stability of bridge_sampler used in marginal likelihood estimation heavily depends on stationarity of the chains. As such, while using this criterion, we recommending running the chain long engouh, and setting fix_label = TRUE for optimal performance.

References

Fruhwirth-Schnatter, S.: Label switching under model uncertainty. In: Mengerson, K., Robert, C., Titterington, D. (eds.) Mixtures: Estimation and Application, pp. 213-239. Wiley, New York (2011).

Examples

Run this code

# illustration only - more iterations needed for convergence
set.seed(1)
fit.vmsin.step.15 <- fit_incremental_angmix("vmsin", tim8, "BIC", start_ncomp = 1,
                                          max_ncomp = 3, n.iter = 15,
                                          n.chains = 1, save_fits=FALSE)
(fit.vmsin.best.15 <- bestmodel(fit.vmsin.step.15))
lattice::densityplot(fit.vmsin.best.15)

Run the code above in your browser using DataLab