
Note: a newer version (1.5) of this package is available.

clusterMI (version 1.4.0)

Cluster Analysis with Missing Values by Multiple Imputation

Description

Allows clustering of incomplete observations by addressing missing values using multiple imputation. The methodology consists of three steps, following Audigier and Niang (2022).

I) Missing data imputation using dedicated models. Four multiple imputation methods are proposed: two are based on joint modelling and two are fully sequential methods, as discussed in Audigier et al. (2021).

II) Cluster analysis of the imputed data sets. Six clustering methods are available (distance-based or model-based), but custom methods can also be easily used.

III) Partition pooling. The set of partitions is aggregated using a method based on non-negative matrix factorization. An associated instability measure is computed by bootstrap (see Fang and Wang, 2012). Among other applications, this instability measure can be used to choose the number of clusters in the presence of missing values.

The package also provides several diagnostic tools to tune the number of imputed data sets, to tune the number of iterations in fully sequential imputation, to check the fit of imputation models, etc.
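The three steps above can be sketched as follows. This is a minimal sketch rather than a reference workflow: the names prodna, imputedata, clusterMI and the wine data set come from the function index on this page, while the argument names (data.na, nb.clust, m, instability), the `cult` column and the `part` element of the result are assumptions recalled from the package help pages and should be checked against ?imputedata and ?clusterMI.

```r
library(clusterMI)

set.seed(123456)
data(wine)

# Drop the cultivar label so that only the chemical measurements are clustered
# (the column name "cult" is an assumption; this line is a no-op if it is absent)
wine.na <- wine[, names(wine) != "cult"]

# Introduce missing values completely at random, using the function's default settings
wine.na <- prodna(wine.na)

nb.clust <- 3  # the wine data describe three cultivars

# Step I: multiple imputation (m = 5 imputed data sets keeps the sketch fast;
# argument names are assumptions, see the note above)
res.imp <- imputedata(data.na = wine.na, nb.clust = nb.clust, m = 5)

# Steps II and III: cluster each imputed data set, then pool the partitions;
# the bootstrap instability measure is skipped here to save time
res.pool <- clusterMI(res.imp, instability = FALSE)

# Consensus partition: one cluster label per observation (assumed element name)
part <- res.pool$part
table(part)
```

In practice, the diagnostic functions listed below (for example choosem and choosenbclust) are intended for tuning the number of imputed data sets and the number of clusters rather than fixing them by hand as done here.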



Install

install.packages('clusterMI')

Monthly Downloads

491

Version

1.4.0

License

GPL-2 | GPL-3

Maintainer

Vincent Audigier

Last Published

February 12th, 2025

Functions in clusterMI (1.4.0)

onefold.chooser: One-fold cross-validation for specifying threshold r
overimpute: Overimputation diagnostic plot
myem.mix: Internal function
choosem: Graphical investigation for the number of datasets generated by multiple imputation
prodna: Introduce missing values using a missing completely at random mechanism
imputedata: Multiple imputation methods for cluster analysis
choosemaxit: Diagnostic plot for the number of iterations used in sequential imputation methods
initfastnmf.intern: Initialize fastnmf
varselbest: Variable selection for specifying conditional imputation models
wine: Chemical analysis of wines from three different cultivars
clusterMI: Cluster analysis and pooling after multiple imputation
fastnmf: Consensus clustering using non-negative matrix factorization
choosenbclust: Tune the number of clusters according to the partition instability
Rcpp_modelobject-class: Class "Rcpp_modelobject"
chooser: K-fold cross-validation for specifying threshold r
clusterMI-package: clusterMI: Cluster Analysis with Missing Values by Multiple Imputation
chooseB: Diagnostic plot for the number of iterations used in the varselbest function
chooseB.intern: Tune the number of iterations for variable selection using varselbest
cluster.intern: Apply clustering method after multiple imputation
mclustboot.intern: MclustBootstrap with nboot = 1 and the same output as Mclust
Silhouette.intern: Compute Silhouette index