boot.comp: Performs Parametric Bootstrap for Sequentially Testing the Number of Components in Various Mixture Models

Description

Performs a parametric bootstrap by producing B bootstrap realizations of the likelihood ratio statistic for testing the null hypothesis of a k-component fit versus the alternative hypothesis of a (k+1)-component fit to various mixture models. This is performed for up to a specified number of maximum components, k. A p-value is calculated for each test and once the p-value is above a specified significance level, the testing terminates. An optional histogram showing the distribution of the likelihood ratio statistic along with the observed statistic can also be produced.

Usage

boot.comp(y, x = NULL, N = NULL, max.comp = 2, B = 100,
          sig = 0.05, arbmean = TRUE, arbvar = TRUE,
          mix.type = c("logisregmix", "multmix", "mvnormalmix",
          "normalmix", "poisregmix", "regmix", "regmix.mixed", 
          "repnormmix"), hist = TRUE, ...)

Value

boot.comp returns a list with items:

p.values: The p-values for each test of k-components versus (k+1)-components.
log.lik: The B bootstrap realizations of the likelihood ratio statistic.
obs.log.lik: The observed likelihood ratio statistic for each test which is used in determining the p-values.

Arguments

y: The raw data for multmix, mvnormalmix, normalmix, and repnormmix and the response values for logisregmix, poisregmix, and regmix. See the documentation concerning their respective EM algorithms for specific structure of the raw data.
x: The predictor values required only for the regression mixtures logisregmix, poisregmix, and regmix. A column of 1s for the intercept term must not be included! See the documentation concerning their respective EM algorithms for specific structure of the predictor values.
N: An n-vector of number of trials for the logistic regression type logisregmix. If NULL, then N is an n-vector of 1s for binary logistic regression.
max.comp: The maximum number of components to test for. The default is 2. This function will perform a test of k-components versus (k+1)-components sequentially until we fail to reject the null hypothesis. This decision rule is governed by the calculated p-value and sig.
B: The number of bootstrap realizations of the likelihood ratio statistic to produce. The default is 100, but ideally, values of 1000 or more would be more acceptable.
sig: The significance level for which to compare the p-value against when performing the test of k-components versus (k+1)-components.
arbmean: If FALSE, then a scale mixture analysis can be performed for mvnormalmix, normalmix, regmix, or repnormmix. The default is TRUE.
arbvar: If FALSE, then a location mixture analysis can be performed for mvnormalmix, normalmix, regmix, or repnormmix. The default is TRUE.
mix.type: The type of mixture analysis you wish to perform. The data inputted for y and x depend on which type of mixture is selected. logisregmix corresponds to a mixture of logistic regressions. multmix corresponds to a mixture of multinomials with data determined by the cut-point method. mvnormalmix corresponds to a mixture of multivariate normals. normalmix corresponds to a mixture of univariate normals. poisregmix corresponds to a mixture of Poisson regressions. regmix corresponds to a mixture of regressions with normal components. regmix.mixed corresponds to a mixture of regressions with random or mixed effects. repnormmix corresponds to a mixture of normals with repeated measurements.
hist: An argument to provide a matrix plot of histograms for the boostrapped likelihood ratio statistic.
...: Additional arguments passed to the various EM algorithms for the mixture of interest.

References

McLachlan, G. J. and Peel, D. (2000) Finite Mixture Models, John Wiley and Sons, Inc.

Examples

Run this code

## Bootstrapping to test the number of components on the RTdata.

data(RTdata)
set.seed(100)
x <- as.matrix(RTdata[, 1:3])
y <- makemultdata(x, cuts = quantile(x, (1:9)/10))$y
a <- boot.comp(y = y, max.comp = 1, B = 5, mix.type = "multmix", 
               epsilon = 1e-3)
a$p.values