mclust (version 5.4.1)

MclustBootstrap: Resampling-based Inference for Gaussian finite mixture models

Description

Bootstrap or jackknife estimation of standard errors and percentile bootstrap confidence intervals for the parameters of a Gaussian mixture model.

Usage

MclustBootstrap(object, nboot = 999, type = c("bs", "wlbs", "pb", "jk"),
                max.nonfit = 10*nboot, verbose = interactive(),
                ...)

Arguments

object

An object of class 'Mclust' or 'densityMclust' providing an estimated Gaussian mixture model.

nboot

The number of bootstrap replications.

type

A character string specifying the type of resampling to use:

"bs" = nonparametric bootstrap
"wlbs" = weighted likelihood bootstrap
"pb" = parametric bootstrap
"jk" = jackknife

max.nonfit

The maximum number of non-estimable models allowed.

verbose

A logical controlling whether a text progress bar is displayed during the resampling procedure. The default is TRUE if the session is interactive, and FALSE otherwise.

...

Further arguments passed to or from other methods.
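
The type argument described above selects the resampling scheme. The following is a minimal sketch of switching between two of the schemes, assuming the faithful data set from base R and a deliberately small nboot to keep the run short; the choice of G = 2 and modelNames = "EEE" is only illustrative.

```r
library(mclust)

# Illustrative model: the data set, G and modelNames are assumptions for this sketch
data(faithful)
mod <- Mclust(faithful, G = 2, modelNames = "EEE", verbose = FALSE)

# A small nboot keeps the sketch fast; the default is 999
set.seed(1)
boot_bs   <- MclustBootstrap(mod, nboot = 50, type = "bs",   verbose = FALSE)
boot_wlbs <- MclustBootstrap(mod, nboot = 50, type = "wlbs", verbose = FALSE)

# The returned object records which scheme was used
boot_bs$type
boot_wlbs$type
```

In practice a much larger nboot is advisable; 50 replications are used here only for speed.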

Value

An object of class 'MclustBootstrap' with the following components:

n

The number of observations in the data.

d

The dimension of the data.

G

A value specifying the number of mixture components.

modelName

A character string specifying the covariance parameterisation of the mixture model (see mclustModelNames).

parameters

A list of estimated parameters for the mixture components with the following components:

pro

a vector of mixing proportions.

mean

a matrix of means for each component.

variance

an array of covariance matrices for each component.

nboot

The number of bootstrap replications if type = "bs", type = "wlbs" or type = "pb". The sample size if type = "jk".

type

The type of resampling approach used.

nonfit

The number of resamples that did not converge during the procedure.

pro

A matrix of dimension (nboot x G) containing the bootstrap distribution for the mixing proportions.

mean

An array of dimension (nboot x d x G), where d is the dimension of the data, containing the bootstrap distribution for the component means.

variance

An array of dimension (nboot x d x d x G), where d is the dimension of the data, containing the bootstrap distribution for the component covariances.
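
The pro, mean and variance components listed above can be inspected directly. The following is a minimal sketch, assuming the faithful data set (d = 2) and an illustrative two-component "EEE" model; the standard errors are computed by hand as standard deviations over the replication dimension, which is an illustration rather than a statement of what summary.MclustBootstrap does internally.

```r
library(mclust)

# Illustrative model: data set, G and modelNames are assumptions for this sketch
mod <- Mclust(faithful, G = 2, modelNames = "EEE", verbose = FALSE)
set.seed(1)
boot <- MclustBootstrap(mod, nboot = 50, type = "bs", verbose = FALSE)

dim(boot$pro)       # nboot x G
dim(boot$mean)      # nboot x d x G
dim(boot$variance)  # nboot x d x d x G

# Hand-computed standard errors: SD across replications (first dimension)
se_pro  <- apply(boot$pro,  2,       sd, na.rm = TRUE)  # one value per component
se_mean <- apply(boot$mean, c(2, 3), sd, na.rm = TRUE)  # d x G matrix
```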

Details

For a fitted Gaussian mixture model with object$G mixture components and covariance parameterisation object$modelName, this function returns either the bootstrap distribution or the jackknife distribution of the mixture parameters. In the former case, the nonparametric bootstrap or the weighted likelihood bootstrap approach can be used, so that the bootstrap procedure generates nboot bootstrap samples of the same size as the original data by resampling with replacement from the observed data. In the jackknife case, the procedure considers all the samples obtained by omitting one observation at a time.
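
The nonparametric bootstrap described above can be sketched by hand. This is a simplified illustration, not the package's internal implementation: it refits each resample from scratch with Mclust, holding G and modelName fixed, and it does not address possible relabelling of components between replications. The data set, the number of replications B and the names idx, fit and pro_boot are assumptions made for the sketch.

```r
library(mclust)

# Illustrative model: data set, G and modelNames are assumptions for this sketch
mod <- Mclust(faithful, G = 2, modelNames = "EEE", verbose = FALSE)

n <- mod$n   # number of observations
B <- 20      # hypothetical small number of replications, for speed
set.seed(1)

# One row per replication, one column per mixture component
pro_boot <- matrix(NA_real_, nrow = B, ncol = mod$G)

for (b in seq_len(B)) {
  # Resample rows with replacement, same size as the original data
  idx <- sample(n, size = n, replace = TRUE)
  # Refit with the same number of components and covariance structure
  fit <- Mclust(faithful[idx, ], G = mod$G, modelNames = mod$modelName,
                verbose = FALSE)
  # Leave the row as NA if the model could not be estimated
  if (!is.null(fit)) pro_boot[b, ] <- fit$parameters$pro
}

# Standard deviation over replications as a rough standard error
apply(pro_boot, 2, sd, na.rm = TRUE)
```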

The resulting resampling distribution can then be used to obtain standard errors and percentile confidence intervals with the summary.MclustBootstrap function.
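
As a minimal sketch of this step, the code below calls summary with what = "se" and what = "ci", then computes a 95% percentile interval for the mixing proportions directly from the pro component for comparison. The data set, the model settings and the 2.5% and 97.5% cut-offs are assumptions made for illustration.

```r
library(mclust)

# Illustrative model: data set, G and modelNames are assumptions for this sketch
mod <- Mclust(faithful, G = 2, modelNames = "EEE", verbose = FALSE)
set.seed(1)
boot <- MclustBootstrap(mod, nboot = 50, type = "bs", verbose = FALSE)

# Summaries provided by the package
summary(boot, what = "se")
summary(boot, what = "ci")

# Hand-computed percentile interval for the mixing proportions:
# the 2.5% and 97.5% quantiles of each column of the bootstrap distribution
ci_pro <- apply(boot$pro, 2, quantile, probs = c(0.025, 0.975), na.rm = TRUE)
ci_pro   # 2 x G matrix: lower bounds in row 1, upper bounds in row 2
```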

References

Davison, A. and Hinkley, D. (1997) Bootstrap Methods and Their Application. Cambridge University Press.

McLachlan, G.J. and Peel, D. (2000) Finite Mixture Models. Wiley.

O'Hagan A., Murphy T. B., Gormley I. C. and Scrucca L. (2015) On Estimation of Parameter Uncertainty in Model-Based Clustering. Submitted to Computational Statistics.

See Also

summary.MclustBootstrap, plot.MclustBootstrap, Mclust, densityMclust.

Examples

# NOT RUN {
library(mclust)

# Bootstrap inference for a clustering model
data(diabetes)
X <- diabetes[,-1]
modClust <- Mclust(X)
bootClust <- MclustBootstrap(modClust)
summary(bootClust, what = "se")
summary(bootClust, what = "ci")

# Bootstrap inference for a density estimate
data(acidity)
modDens <- densityMclust(acidity)
bootDens <- MclustBootstrap(modDens)
summary(bootDens, what = "se")
summary(bootDens, what = "ci")
# }