mclust (version 2.1-14)

EMclustN: BIC for Model-Based Clustering with Poisson Noise

Description

BIC for EM initialized by hierarchical clustering for parameterized Gaussian mixture models with Poisson noise.

Usage

EMclustN(data, G, emModelNames, noise, hcPairs, eps, tol, itmax, 
         equalPro, warnSingular=FALSE, Vinv, ...)

Arguments

data
A numeric vector, matrix, or data frame of observations. Categorical variables are not allowed. If a matrix or data frame, rows correspond to observations and columns correspond to variables.
G
An integer vector specifying the numbers of MVN (Gaussian) mixture components (clusters) for which the BIC is to be calculated. The default is 0:9 where 0 indicates only a noise component.
emModelNames
A vector of character strings indicating the models to be fitted in the EM phase of clustering. Possible models: "E" for spherical, equal variance (one-dimensional) "V" for spherical, variable variance (one-dimensional) "EII": spherical, equal vo
noise
A logical or numeric vector indicating which observations are initially estimated to be noise in the data. If there is no noise, EMclust should be used rather than EMclustN.
hcPairs
A matrix of merge pairs for hierarchical clustering, such as that produced by function hc. The default is to compute a hierarchical clustering tree by applying function hc with modelName = .Mclust$hcModelName[1].
eps
A scalar tolerance for deciding when to terminate computations due to computational singularity in covariances. Smaller values of eps allow computations to proceed nearer to singularity. The default is .Mclust$eps.
tol
A scalar tolerance for relative convergence of the loglikelihood. The default is .Mclust$tol.
itmax
An integer limit on the number of EM iterations. The default is .Mclust$itmax.
equalPro
Logical variable indicating whether or not the mixing proportions are equal in the model. The default is .Mclust$equalPro.
Vinv
An estimate of the reciprocal hypervolume of the data region. The default is determined by applying function hypvol to the data.
warnSingular
A logical value indicating whether or not a warning should be issued whenever a singularity is encountered. The default is warnSingular=FALSE.
...
Provided so that lists with elements other than the arguments can be passed in indirect or list calls with do.call.
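
To make the Vinv argument concrete, the following is a minimal base R sketch of one way to obtain a reciprocal hypervolume: take the axis-aligned bounding box of the data and invert its volume. The helper name boundingBoxVinv is hypothetical, and the sketch is an illustrative simplification; it does not claim to reproduce what hypvol computes.

```r
# Sketch: reciprocal hypervolume of the axis-aligned bounding box of the data.
# boundingBoxVinv is a hypothetical helper, not part of mclust.
boundingBoxVinv <- function(data) {
  data <- as.matrix(data)
  ranges <- apply(data, 2, range)           # 2 x d matrix: row 1 = min, row 2 = max
  sideLengths <- ranges[2, ] - ranges[1, ]  # extent along each variable
  1 / prod(sideLengths)                     # reciprocal of the box volume
}

# Three observations in two dimensions; the bounding box is 2 by 5.
x <- cbind(c(0, 2, 1), c(0, 1, 5))
vinvSketch <- boundingBoxVinv(x)
vinvSketch
```

A value computed this way could in principle be supplied as Vinv, but the default based on hypvol is the documented behaviour.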

Value

  • Bayesian Information Criterion for the specified mixture models and numbers of clusters. Auxiliary information is returned as attributes.
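
As a rough illustration of the quantity being returned, the sketch below evaluates a BIC of the form 2 * loglik - nPar * log(nObs) for a mixture of one univariate Gaussian and a uniform noise term with density Vinv. The function names, the plug-in parameter values (no EM is run), and the parameter count of 3 are assumptions made for the example; the package may count parameters differently.

```r
# BIC in the "larger is better" form: 2 * loglik - nPar * log(nObs).
# bicSketch and noiseMixLoglik are hypothetical helpers, not mclust functions.
bicSketch <- function(loglik, nObs, nPar) {
  2 * loglik - nPar * log(nObs)
}

# Log-likelihood of one univariate Gaussian plus uniform noise.
# pNoise is the noise mixing proportion; Vinv is the reciprocal data-region length.
noiseMixLoglik <- function(x, mean, sd, pNoise, Vinv) {
  sum(log(pNoise * Vinv + (1 - pNoise) * dnorm(x, mean, sd)))
}

set.seed(1)
x <- c(rnorm(100), runif(20, -10, 10))  # 100 Gaussian points, 20 noise points
Vinv <- 1 / diff(range(x))              # one-dimensional data region
ll <- noiseMixLoglik(x, mean = 0, sd = 1, pNoise = 20 / 120, Vinv = Vinv)

# Assumed count: mean, variance, and one free mixing proportion.
bicValue <- bicSketch(ll, nObs = length(x), nPar = 3)
bicValue
```

Because the parameters are plugged in rather than estimated, the resulting number only shows how the pieces combine.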

References

C. Fraley and A. E. Raftery (2002a). Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association 97:611-631. See http://www.stat.washington.edu/mclust.

C. Fraley and A. E. Raftery (2002b). MCLUST: Software for model-based clustering, density estimation and discriminant analysis. Technical Report, Department of Statistics, University of Washington. See http://www.stat.washington.edu/mclust.

See Also

summary.EMclustN, EMclust, hc, me, mclustOptions

Examples

library(mclust)  # EMclustN is provided by mclust (version 2.1-14)
data(iris)
irisMatrix <- as.matrix(iris[,1:4])
irisClass <- iris[,5]

b <- apply( irisMatrix, 2, range)
n <- 450
set.seed(0)
# Draw n uniform noise points per variable, slightly beyond the observed range
poissonNoise <- apply(b, 2, function(x, n) 
                      runif(n, min = x[1] - 0.1, max = x[2] + 0.1), n = n)
set.seed(0)
noiseInit <- sample(c(TRUE,FALSE),size=150+450,replace=TRUE,prob=c(3,1))
Bic <-  EMclustN(data=rbind(irisMatrix, poissonNoise), noise = noiseInit)
Bic
plot(Bic)
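
The BIC values are typically compared across numbers of clusters and models, with the largest value preferred. The base R sketch below picks the best combination from a small table laid out with one row per number of clusters and one column per model. The table is hypothetical, with numbers invented purely for illustration; that layout is an assumption and is not taken from actual EMclustN output.

```r
# Hypothetical BIC table: rows are numbers of clusters, columns are models.
# The values are invented for illustration only.
bicTable <- matrix(c(-900, -850, -870,
                     -910, -820, -860),
                   nrow = 3,
                   dimnames = list(c("0", "1", "2"), c("EII", "VVV")))

# Locate the largest BIC; arr.ind = TRUE returns (row, column) positions.
best <- which(bicTable == max(bicTable, na.rm = TRUE), arr.ind = TRUE)
bestG <- rownames(bicTable)[best[1, 1]]
bestModel <- colnames(bicTable)[best[1, 2]]
c(G = bestG, model = bestModel)
```

Using na.rm = TRUE guards against entries that are missing, for example where a model could not be fitted.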