mclustModel: Best model based on BIC

Description

Determines the best model from clustering via mclustBIC for a given set of model parameterizations and numbers of components.

Usage

mclustModel(data, BICvalues, G, modelNames, ...)

Arguments

data

The matrix or vector of observations used to generate BICvalues.

BICvalues

An 'mclustBIC' object, which is the result of applying mclustBIC to data.

G

A vector of integers giving the numbers of mixture components (clusters) from which the best model according to BIC will be selected (as.character(G) must be a subset of the row names of BICvalues). The default is to select the best model over all numbers of mixture components used to obtain BICvalues.

modelNames

A vector of character strings giving the model parameterizations from which the best model according to BIC will be selected (modelNames must be a subset of the column names of BICvalues). The default is to select the best model over all parameterizations used to obtain BICvalues.

...

Not used. For generic/method consistency.
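
The effect of G and modelNames can be illustrated without fitting any model. The sketch below uses an invented BIC table laid out like an 'mclustBIC' object, with numbers of components as row names and model names as column names. The values and the helper pick_best are hypothetical and not part of mclust; the sketch also assumes the mclust convention that a larger BIC is better.

```r
# Toy BIC table: rows are numbers of components, columns are model names.
# The values are invented for illustration only.
bic_table <- matrix(
  c(-1800, -1750, -1700,
    -1600, -1580, -1560,
    -1650, -1590, NA),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("1", "2", "3"), c("EII", "VII", "VVV"))
)

# Hypothetical helper: restrict the table to the requested rows and columns,
# then return the (G, modelName) pair with the largest BIC.
pick_best <- function(bic, G = rownames(bic), modelNames = colnames(bic)) {
  G <- as.character(G)
  stopifnot(all(G %in% rownames(bic)), all(modelNames %in% colnames(bic)))
  sub <- bic[G, modelNames, drop = FALSE]
  # Missing entries (models that could not be fitted) are ignored.
  idx <- which(sub == max(sub, na.rm = TRUE), arr.ind = TRUE)[1, ]
  list(G = as.integer(rownames(sub)[idx["row"]]),
       modelName = colnames(sub)[idx["col"]],
       bic = sub[idx["row"], idx["col"]])
}

best_all <- pick_best(bic_table)
best_sub <- pick_best(bic_table, G = 1:2, modelNames = c("EII", "VII"))
```

Restricting the rows and columns can change the winner: here the unrestricted search picks a different parameterization than the search limited to "EII" and "VII".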

Value

A list giving the optimal (according to BIC) parameters, conditional probabilities z, and log-likelihood, together with the associated classification and its uncertainty.

The details of the output components are as follows:

modelName

A character string indicating the model. The help file for mclustModelNames describes the available models.

n

The number of observations in the data.

d

The dimension of the data.

G

The number of components in the Gaussian mixture model corresponding to the optimal BIC.

bic

The optimal BIC value.

loglik

The log-likelihood corresponding to the optimal BIC.

parameters

A list with the following components:

pro: A vector whose kth component is the mixing proportion for the kth component of the mixture model. If missing, equal proportions are assumed.
mean: The mean for each component. If there is more than one component, this is a matrix whose kth column is the mean of the kth component of the mixture model.
variance: A list of variance parameters for the model. The components of this list depend on the model specification. See the help file for mclustVariance for details.
Vinv: The estimate of the reciprocal hypervolume of the data region used in the computation when the input indicates the addition of a noise component to the model.

z

A matrix whose [i,k]th entry is the conditional probability that observation i in the data belongs to the kth class.
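
The classification and uncertainty mentioned at the start of this section can be derived from the conditional probabilities. The sketch below uses an invented probability matrix rather than real output: each observation is assigned to the component with the largest conditional probability, and its uncertainty is taken as one minus that probability.

```r
# Invented conditional probabilities for 4 observations and 3 components;
# each row sums to 1.
z <- matrix(
  c(0.90, 0.05, 0.05,
    0.10, 0.70, 0.20,
    0.30, 0.30, 0.40,
    0.02, 0.01, 0.97),
  ncol = 3, byrow = TRUE
)

# Maximum a posteriori classification: index of the largest entry in each row.
classification <- apply(z, 1, which.max)

# Uncertainty: probability mass not captured by the assigned component.
uncertainty <- 1 - apply(z, 1, max)
```

The third observation is assigned to component 3 but carries the highest uncertainty, because its probabilities are spread almost evenly across the components.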

References

C. Fraley and A. E. Raftery (2002). Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association 97:611-631.

C. Fraley, A. E. Raftery, T. B. Murphy and L. Scrucca (2012). mclust Version 4 for R: Normal Mixture Modeling for Model-Based Clustering, Classification, and Density Estimation. Technical Report No. 597, Department of Statistics, University of Washington.

Examples

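# NOT RUN {
library(mclust)
# BIC for the default numbers of components and model parameterizations
irisBIC <- mclustBIC(iris[, -5])
# Best model over everything in irisBIC
mclustModel(iris[, -5], irisBIC)
# Best model restricted to 1 to 6 components and three parameterizations
mclustModel(iris[, -5], irisBIC, G = 1:6, modelNames = c("VII", "VVI", "VVV"))
# }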
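
As a follow-up, the returned list can be inspected component by component. This is a minimal sketch that assumes the mclust package is installed (it does nothing otherwise); the variable name best is chosen here for illustration.

```r
# Assumes the mclust package is installed; the block is skipped otherwise.
if (requireNamespace("mclust", quietly = TRUE)) {
  library(mclust)
  irisBIC <- mclustBIC(iris[, -5])
  best <- mclustModel(iris[, -5], irisBIC)

  # Scalar summaries of the selected model.
  str(best[c("modelName", "n", "d", "G", "bic", "loglik")])

  # Conditional probabilities: one row per observation, one column per component.
  print(dim(best$z))

  # Mixing proportions of the selected model.
  print(best$parameters$pro)
}
```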