missCompare (version 1.0.3)

Intuitive Missing Data Imputation Framework

Description

Offers a convenient pipeline to test and compare various missing data imputation algorithms on simulated and real data. These include simple methods, such as mean and median imputation and random replacement, as well as more sophisticated algorithms already implemented in popular R packages, such as 'mi', described by Su et al. (2011); 'mice', described by van Buuren and Groothuis-Oudshoorn (2011); 'missForest', described by Stekhoven and Bühlmann (2012); 'missMDA', described by Josse and Husson (2016); and 'pcaMethods', described by Stacklies et al. (2007). The central assumption behind 'missCompare' is that structurally different datasets (e.g. larger datasets with many correlated variables vs. smaller datasets with uncorrelated variables) will benefit differently from different missing data imputation algorithms. 'missCompare' extracts the characteristics of your dataset, sets up a sandbox in which a curated list of standard and sophisticated missing data imputation algorithms is run on simulated data, and compares their performance under custom missingness patterns. After the best performing algorithm has been selected in the simulations, 'missCompare' can also impute your real-life dataset. The package also provides various post-imputation diagnostics and visualizations to help you assess imputation performance.
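
The idea of comparing algorithms on data with spiked-in missingness can be sketched in a few lines of base R. This is only an illustration of the general principle, not the package's implementation: the helper names `spike_in_mcar`, `impute_with`, and `rmse_at_missing` are hypothetical, the simulated variables are arbitrary, and only mean and median imputation under an MCAR pattern are compared.

```r
# Illustrative sketch in base R only; this is not missCompare's implementation.
# All helper names below are hypothetical and chosen for this example.
set.seed(42)

# Simulate a complete numeric dataset (200 rows, 3 variables).
n <- 200
complete <- data.frame(
  x1 = rnorm(n, mean = 10, sd = 2),
  x2 = rexp(n, rate = 0.5),
  x3 = rnorm(n, mean = 0, sd = 1)
)

# Spike in missingness completely at random (MCAR) at a chosen fraction.
spike_in_mcar <- function(df, fraction) {
  mask <- matrix(runif(nrow(df) * ncol(df)) < fraction, nrow = nrow(df))
  df[mask] <- NA
  df
}

# Replace each column's missing values with a summary of its observed values.
impute_with <- function(df, summary_fun) {
  as.data.frame(lapply(df, function(col) {
    col[is.na(col)] <- summary_fun(col, na.rm = TRUE)
    col
  }))
}

# RMSE between true and imputed values, computed at the spiked-in cells only.
rmse_at_missing <- function(truth, imputed, with_na) {
  mask <- is.na(as.matrix(with_na))
  sqrt(mean((as.matrix(truth)[mask] - as.matrix(imputed)[mask])^2))
}

with_na <- spike_in_mcar(complete, fraction = 0.2)
results <- c(
  mean   = rmse_at_missing(complete, impute_with(with_na, mean), with_na),
  median = rmse_at_missing(complete, impute_with(with_na, median), with_na)
)
print(results)
```

Because the true values are known before the spike-in, the error of each method can be measured directly; repeating the procedure over several iterations and missingness patterns would give a more stable comparison.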

Install

install.packages('missCompare')

Monthly Downloads

256

Version

1.0.3

License

MIT + file LICENSE

Maintainer

Tibor V. Varga

Last Published

December 1st, 2020

Functions in missCompare (1.0.3)

MNAR
Missing data spike-in in MNAR pattern

MCAR
Missing data spike-in in MCAR pattern

get_data
Extraction of metadata from dataframes

impute_data
Missing data imputation with various methods

impute_simulated
Imputation algorithm tester on simulated data

clindata_miss
Clinical dataset with missingness

clean
Dataframe cleaning for missing data handling

MAP
Missing data spike-in in MAP pattern

all_patterns
Missing data spike-in in various missing data patterns

MAR
Missing data spike-in in MAR pattern

test_aregImpute
Testing the 'Hmisc' aregImpute missing data imputation algorithm

test_kNN
Testing the 'VIM' kNN missing data imputation algorithm

test_pcaMethods_BPCA
Testing the 'pcaMethods' BPCA missing data imputation algorithm

test_missMDA_reg
Testing the 'missMDA' regularized missing data imputation algorithm

simulate
Simulation of matrix with no missingness

test_median_imp
Testing the median imputation algorithm

test_mean_imp
Testing the mean imputation algorithm

test_pcaMethods_Nipals
Testing the 'pcaMethods' NIPALS missing data imputation algorithm

test_pcaMethods_PPCA
Testing the 'pcaMethods' PPCA missing data imputation algorithm

test_pcaMethods_NLPCA
Testing the 'pcaMethods' NLPCA missing data imputation algorithm

test_mi
Testing the 'mi' missing data imputation algorithm

test_mice_mixed
Testing the 'mice' mixed missing data imputation algorithm

test_pcaMethods_svdImpute
Testing the 'pcaMethods' svdImpute missing data imputation algorithm

test_AmeliaII
Testing the 'Amelia II' missing data imputation algorithm

test_random_imp
Testing the random replacement imputation algorithm

post_imp_diag
Post imputation diagnostics

test_missMDA_EM
Testing the 'missMDA' EM missing data imputation algorithm

missCompare
'missCompare': Missing Data Imputation Comparison Framework

test_missForest
Testing the 'missForest' missing data imputation algorithm