
RmixmodCombi (version 1.0)

mixmodCombi: Combining Mixture Components for Clustering

Description

Provides a hierarchy of combined clusterings, running from the EM/BIC mixture solution provided by Rmixmod down to a single class, following the methodology proposed in the article cited in the references.

Usage

mixmodCombi(data = NULL, nbCluster = NULL, mixmodOutput = NULL, criterion = c("BIC", "ICL"), ...)

Arguments

data
matrix or data frame containing quantitative or qualitative data. Rows correspond to observations and columns correspond to variables.
nbCluster
numeric listing the numbers of clusters to consider.
mixmodOutput
[MixmodCluster] object, as returned by the mixmodCluster function, containing the optimal mixture (according to BIC) associated with the data in data. Please see the Rmixmod documentation for details of its components. The default value is NULL, in which case mixmodCluster is called.
criterion
character vector naming the criteria used to select the best model, as for the mixmodCluster function. The best model is the one with the lowest criterion value. Possible values: "BIC", "ICL", "NEC", c("BIC", "ICL", "NEC"). Unlike in the mixmodCluster function, the default value is c("BIC", "ICL"); it should only be modified with care, because the plot and print functions may then wrongly refer to the "BIC" and "ICL" solutions.
...
any optional argument to be passed to the mixmodCluster function, for example the list of models to consider. Please see the mixmodCluster function documentation.

Value

[MixmodCombi] object:
mixmodOutput
[MixmodCluster] object. EM/BIC solution from which the hierarchy is computed. Either provided by the user or computed by a call to the mixmodCluster function.
hierarchy
a list of MixmodCombiSol objects, each of which is the solution for the corresponding number of clusters, obtained by hierarchically combining the EM/BIC solution according to the method proposed in the article in the references. Each one contains: the number of clusters, the partition of the data, the posterior probabilities of each class for each observation, the entropy value for the solution, and a "combining matrix" combiM, which makes it possible to obtain the K-cluster solution from the (K+1)-cluster solution (please see the combMat function documentation about combining matrices and how to use them).
ICLNbCluster
number of clusters selected by ICL, according to the mixmodOutput solution (if the criterion option has not been changed).
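
The role of a combining matrix can be illustrated with a small base R sketch. It does not call RmixmodCombi: the helper make_combining_matrix is hypothetical, the posterior probabilities are made-up numbers, and the K x (K+1) layout is an assumption chosen for illustration (the combMat documentation describes the layout actually stored in combiM).

```r
# Posterior probabilities of 4 observations under a 3-cluster solution
# (made-up numbers for illustration; each row sums to 1).
tau3 <- matrix(c(0.70, 0.20, 0.10,
                 0.10, 0.60, 0.30,
                 0.05, 0.15, 0.80,
                 0.30, 0.30, 0.40),
               ncol = 3, byrow = TRUE)

# Hypothetical helper (not part of RmixmodCombi): builds a 0/1 matrix with
# one row per combined cluster and one column per original cluster, merging
# original clusters l1 and l2.
make_combining_matrix <- function(n_clusters, l1, l2) {
  lo <- min(l1, l2)
  hi <- max(l1, l2)
  M <- diag(n_clusters)   # identity: every cluster kept as is
  M[lo, hi] <- 1          # cluster `hi` is absorbed into cluster `lo`
  M[-hi, , drop = FALSE]  # drop the row of the absorbed cluster
}

# Merge clusters 2 and 3 of the 3-cluster solution.
M <- make_combining_matrix(3, 2, 3)

# Posterior probabilities of the 2-cluster solution: the probabilities of
# the merged clusters are summed.
tau2 <- tau3 %*% t(M)

# Partition of the combined solution, by maximum posterior probability.
partition2 <- apply(tau2, 1, which.max)
```

Under this assumed layout, moving from the (K+1)-cluster solution to the K-cluster solution is a single matrix product, and each row of the result still sums to one.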

Details

mixmodCluster provides a mixture fitted to the data by maximum likelihood through the EM algorithm, for the model and number of components selected according to BIC. The corresponding components are hierarchically combined according to an entropy criterion, following the methodology described in the article cited in the references section. The combined clusterings with numbers of classes between the one selected by BIC and one are returned as a [MixmodCombi] object.
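
The combining step described above can be sketched in base R. This is a simplified illustration under stated assumptions, not the package's implementation: the function names clustering_entropy and combine_once are hypothetical, the posterior probabilities are made up, and the sketch works directly on a matrix of posterior probabilities rather than on a fitted [MixmodCluster] object. At each step it merges the pair of clusters whose combination gives the lowest entropy, which is the same as the largest entropy decrease.

```r
# Entropy of a soft clustering: -sum_i sum_k t_ik * log(t_ik),
# with the convention 0 * log(0) = 0.
clustering_entropy <- function(tau) {
  -sum(ifelse(tau > 0, tau * log(tau), 0))
}

# One combining step (hypothetical helper): try every pair of clusters and
# keep the merge that yields the lowest entropy.
combine_once <- function(tau) {
  K <- ncol(tau)
  stopifnot(K >= 2)
  pairs <- utils::combn(K, 2)  # one candidate pair per column
  ent <- apply(pairs, 2, function(p) {
    merged <- cbind(tau[, -p, drop = FALSE], tau[, p[1]] + tau[, p[2]])
    clustering_entropy(merged)
  })
  best <- pairs[, which.min(ent)]
  list(pair = best,
       tau = cbind(tau[, -best, drop = FALSE],
                   tau[, best[1]] + tau[, best[2]]),
       entropy = min(ent))
}

# Made-up posterior probabilities: clusters 1 and 2 overlap strongly,
# cluster 3 is well separated.
tau <- matrix(c(0.50, 0.45, 0.05,
                0.40, 0.55, 0.05,
                0.02, 0.03, 0.95,
                0.05, 0.05, 0.90),
              ncol = 3, byrow = TRUE)

# Build the hierarchy from the initial solution down to one class.
hierarchy <- list(tau)
while (ncol(tau) > 1) {
  tau <- combine_once(tau)$tau
  hierarchy <- c(hierarchy, list(tau))
}

# Entropy of each solution in the hierarchy, from 3 clusters to 1.
entropies <- sapply(hierarchy, clustering_entropy)
```

Because merging two clusters can never increase the entropy, the values in entropies are non-increasing and reach zero for the one-class solution.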

References

J.-P. Baudry, A. E. Raftery, G. Celeux, K. Lo and R. Gottardo (2010). Combining mixture components for clustering. Journal of Computational and Graphical Statistics, 19(2):332-353.

Examples


##### Example of quantitative data #####

set.seed(1)

data(Baudry_etal_2010_JCGS_examples)
res <- mixmodCombi(ex4.1, nbCluster = 1:8)

res # is of class MixmodCombi

res@mixmodOutput # is the initial EM/BIC solution (provided by mixmodCluster or by the user as a
# [MixmodCluster] object) from which the hierarchy is computed

res@hierarchy[[3]] # is the 3-cluster solution obtained by hierarchically combining the initial
# EM/BIC solution

## Not run:
# plot(res)
# hist(res, nbCluster = 4)
## End(Not run)

##### Example of qualitative data #####

set.seed(1)

data(car)
res <- mixmodCombi(car[1:300,], nbCluster = 1:10) # only the first 300 observations, for a
# quick example

res # is of class MixmodCombi

res@mixmodOutput # is the initial EM/BIC solution (provided by mixmodCluster or by the user as a
# [MixmodCluster] object) from which the hierarchy is computed

res@hierarchy[[res@ICLNbCluster]] # is the solution obtained by hierarchically combining the initial
# EM/BIC solution for the number of clusters selected with ICL

## Not run:
# plot(res)
# barplot(res)
## End(Not run)
