
mdatools (version 0.14.1)

pca.mvreplace: Replace missing values in data

Description

pca.mvreplace replaces missing values in a data matrix with values approximated by iterative PCA decomposition.

Usage

pca.mvreplace(
  x,
  center = TRUE,
  scale = FALSE,
  maxncomp = 10,
  expvarlim = 0.95,
  covlim = 10^-6,
  maxiter = 100
)

Value

Returns the same matrix x with missing values replaced by approximated values.

Arguments

x

a matrix with data, containing missing values.

center

logical, whether to center the data values.

scale

logical, whether to standardize the data values.

maxncomp

maximum number of components in PCA model.

expvarlim

minimum fraction of variance that the chosen components must explain (used to select the optimal number of components in the PCA model).

covlim

convergence criterion.

maxiter

maximum number of iterations if convergence criterion is not met.
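
One plausible reading of how expvarlim and maxncomp interact is sketched below in base R. This is an illustration under stated assumptions, not the mdatools implementation: the helper name select_ncomp_sketch is hypothetical, and the rule (take the smallest number of components whose cumulative explained variance reaches expvarlim, capped at maxncomp) is an assumption based on the argument descriptions above.

```r
# Hypothetical helper, NOT part of mdatools: pick the number of components
# from the cumulative explained variance of a complete (no NA) matrix.
select_ncomp_sketch <- function(x, expvarlim = 0.95, maxncomp = 10) {
  # mean-center the columns before the decomposition
  xc <- scale(x, center = TRUE, scale = FALSE)

  # squared singular values are proportional to the variance per component
  d <- svd(xc, nu = 0, nv = 0)$d
  cumvar <- cumsum(d^2) / sum(d^2)

  # smallest k whose cumulative explained variance reaches the limit
  k <- which(cumvar >= expvarlim)[1]
  if (is.na(k)) k <- length(d)

  # never exceed the user-supplied cap
  min(k, maxncomp)
}

# the three columns are exact multiples of each other, so the data is rank one
s <- 1:6
ncomp <- select_ncomp_sketch(cbind(s, 2 * s, 4 * s))
```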

Author

Sergey Kucheryavskiy (svkucheryavski@gmail.com)

Details

The function uses iterative PCA modeling of the data to approximate and impute missing values. Results are best for data sets with a low or moderate level of noise and with fewer than 10% missing values for small data sets, or up to 20% for large ones.
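
The general idea of iterative PCA imputation can be illustrated with a minimal base R sketch. It is not the mdatools implementation: the helper name impute_pca_sketch, the fixed number of components ncomp, the column-mean starting guess, and the stopping rule based on the squared change of the imputed values are all assumptions made for illustration.

```r
# Hypothetical helper, NOT part of mdatools: a bare-bones iterative PCA
# imputation loop with a fixed number of components.
impute_pca_sketch <- function(x, ncomp = 1, tol = 1e-6, maxiter = 100) {
  missing <- is.na(x)

  # starting guess (assumption): fill each gap with its column mean
  col_means <- colMeans(x, na.rm = TRUE)
  x[missing] <- col_means[col(x)[missing]]

  for (i in seq_len(maxiter)) {
    # center the currently filled-in matrix
    mu <- colMeans(x)
    xc <- sweep(x, 2, mu)

    # rank-ncomp reconstruction from the truncated SVD
    s <- svd(xc, nu = ncomp, nv = ncomp)
    xhat <- s$u %*% diag(s$d[seq_len(ncomp)], ncomp) %*% t(s$v)
    xhat <- sweep(xhat, 2, mu, "+")

    # update only the missing cells; observed values are never touched
    change <- sum((x[missing] - xhat[missing])^2)
    x[missing] <- xhat[missing]

    # stop once the imputed values no longer move appreciably
    if (change < tol) break
  }
  x
}

# same toy data as in the Examples section
s <- 1:6
odata <- cbind(s, 2 * s, 4 * s)
mdata <- odata
mdata[5, 2] <- NA
mdata[2, 3] <- NA
rdata_sketch <- impute_pca_sketch(mdata, ncomp = 1)
```

Each pass refits the PCA model on the filled-in matrix and overwrites only the cells that were originally missing, which is why the observed values are preserved exactly. In this sketch, tol and maxiter play roles analogous to covlim and maxiter above.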

References

Philip R.C. Nelson, Paul A. Taylor, John F. MacGregor. Missing data methods in PCA and PLS: Score calculations with incomplete observations. Chemometrics and Intelligent Laboratory Systems, 35 (1), 1996.

Examples

library(mdatools)

## A very simple example of imputing missing values in data with no noise

# generate a matrix with values
s = 1:6
odata = cbind(s, 2*s, 4*s)

# make a matrix with missing values
mdata = odata
mdata[5, 2] = mdata[2, 3] = NA

# replace missing values with approximated values
rdata = pca.mvreplace(mdata, scale = TRUE)

# show all matrices together
show(cbind(odata, mdata, round(rdata, 2)))
