impKNNa(x, method = "knn", k = 3, metric = "Aitchison", agg = "median",
        primitive = FALSE, normknn = TRUE, das = FALSE, adj = "median")
The Aitchison metric should be chosen when dealing with compositional data, the Euclidean metric otherwise.

If primitive is FALSE, a sequential search for the k nearest neighbors is applied for every missing value where all information corresponding to the non-missing cells, plus the information in the variable to be imputed, plus some additional information is available. If primitive is TRUE, the k nearest neighbors are searched among observations where, in addition to the variable to be imputed, any further cells are non-missing.

If normknn is TRUE (the preferred option), the cells imputed by the nearest neighbor method are adjusted with special adjustment factors (for details, see the reference below).
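The advice on the metric can be illustrated with a small base R sketch. It is not the package's implementation: the helpers `clr`, `aitchison_dist` and `euclid_dist` and the vectors `a` and `b` are hypothetical names introduced here, and the sketch assumes all parts are strictly positive. It uses the standard identity that the Aitchison distance equals the Euclidean distance between centred log-ratio (clr) transformed compositions.

```r
# Centred log-ratio (clr) transform of one composition (all parts > 0).
clr <- function(x) log(x) - mean(log(x))

# Aitchison distance: Euclidean distance between the clr-transformed vectors.
aitchison_dist <- function(x, y) sqrt(sum((clr(x) - clr(y))^2))

# Ordinary Euclidean distance on the raw values, for comparison.
euclid_dist <- function(x, y) sqrt(sum((x - y)^2))

a <- c(10, 30, 60)  # hypothetical composition
b <- 2 * a          # same ratios between parts, different total

aitchison_dist(a, b)  # zero up to rounding: rescaling leaves the ratios unchanged
euclid_dist(a, b)     # clearly non-zero, although the relative information is identical
```

Because only the ratios between parts carry information in compositional data, a neighbor search based on the Euclidean distance of the raw values could rank two observations with identical ratios as far apart, whereas the Aitchison distance treats them as the same composition.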
Hron, K., Templ, M. and Filzmoser, P. (2010). Imputation of missing values for compositional data using classical and robust methods. Computational Statistics and Data Analysis, 54(12), 3095-3107.
impCoda
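library(robCompositions)  # provides impKNNa() and the expenditures data
data(expenditures)
x <- expenditures
x[1, 3]                   # original value, kept for comparison
x[1, 3] <- NA             # set one cell to missing
xi <- impKNNa(x)$xImp     # impute with the default settings
xi[1, 3]                  # imputed value for the cell that was set to NA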