mice.impute.midastouch(y, ry, x, ridge = 1e-05,
midas.kappa = NULL, outout = TRUE, neff = NULL, debug = NULL, ...)
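y: The incomplete vector to be imputed.
ry: Response indicator for y (TRUE = observed, FALSE = missing).
x: Design matrix with length(y) rows and p columns containing complete covariates.
ridge: Ridge penalty applied to guard against multicollinearity. The default is ridge = 1e-05, which means that 0.001 percent of the diagonal is added to the cross-product. Larger ridges may result in more biased estimates. For highly noisy data (e.g. many junk variables), set ridge = 1e-06 or even lower to reduce bias. For highly collinear data, set ridge = 1e-04 or higher.
midas.kappa: Scalar. If NULL (default), the optimal kappa is selected automatically. Alternatively, the user may specify a scalar. Siddique and Belin (2008) find midas.kappa = 3 to be sensible.
outout: Logical. If TRUE (default), one model is estimated for each donor (leave-one-out principle). For a speedup, choose outout = FALSE, which estimates one model for all observations, leading to in-sample predictions for the donors and out-of-sample predictions for the recipients. Bear in mind that mixing in-sample and out-of-sample predictions in this way is not strictly appropriate.
neff: For experts. NULL or the name of an existing environment to which the effective sample size of the donors is written; it is stored there under the name .midastouch.neff.
debug: For experts. NULL or the name of an existing environment to which the input is written; it is stored there under the name .midastouch.inputlist.
...: Other named arguments.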
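As a rough illustration of what the ridge argument does, the base R sketch below adds a fraction of the cross-product diagonal back onto that diagonal before solving the normal equations. The helper ridge_coef and the simulated data are hypothetical and not part of the package; the sketch assumes the penalty is applied as ridge times the diagonal of X'X, which is how the description above reads.

```r
# Hypothetical helper (not part of mice or midastouch): least squares with a
# ridge penalty expressed as a fraction of the cross-product diagonal.
ridge_coef <- function(x, y, ridge = 1e-05) {
  xtx <- crossprod(x)                 # X'X
  pen <- ridge * diag(xtx)            # ridge = 1e-05 adds 0.001 percent of the diagonal
  solve(xtx + diag(pen, nrow = ncol(x)), crossprod(x, y))
}

# Simulated data with two nearly collinear predictors
set.seed(1)
n  <- 50
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 1e-3)
x  <- cbind(1, x1, x2)
y  <- 1 + x1 + x2 + rnorm(n)

b_small <- ridge_coef(x, y, ridge = 1e-05)  # default-sized penalty
b_large <- ridge_coef(x, y, ridge = 1e-02)  # much heavier penalty
cbind(b_small, b_large)                     # compare the two coefficient vectors
```

Comparing the two columns gives a feel for the trade-off described above: a heavier penalty stabilises the coefficients of the collinear predictors at the cost of some bias.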
Returns a vector of length sum(!ry) with imputations.
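Imputation of y by predictive mean matching, based on Rubin (1987, p. 168, formulas a and b) and Siddique and Belin (2008). The procedure is as follows:
1. Draw a bootstrap sample from the donor pool.
2. Estimate a beta matrix on the bootstrap sample by the leave-one-out principle.
3. Compute type II predicted values for yobs (nobs x 1) and ymis (nmis x nobs).
4. Calculate the distance between all yobs and the corresponding ymis.
5. Convert the distances into drawing probabilities.
6. For each recipient, draw a donor from the entire pool while considering the probabilities from the model.
7. Take its observed value in y as the imputation.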
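The procedure can be sketched in base R under simplifying assumptions. The function pmm_distance_sketch below is hypothetical and is not the package implementation: it fits a single model on the bootstrap sample instead of one model per donor (closer in spirit to outout = FALSE), uses a fixed kappa, and assumes that drawing probabilities are proportional to distance^(-kappa), in the style of distance-based donor selection.

```r
# Hypothetical, simplified sketch of distance-weighted predictive mean matching.
pmm_distance_sketch <- function(y, ry, x, kappa = 3) {
  xd  <- cbind(1, x)                       # design matrix with intercept
  obs <- which(ry)                         # donors
  mis <- which(!ry)                        # recipients
  # Draw a bootstrap sample from the donor pool
  boot <- sample(obs, length(obs), replace = TRUE)
  # Estimate coefficients on the bootstrap sample (one model, no leave-one-out)
  beta <- qr.coef(qr(xd[boot, , drop = FALSE]), y[boot])
  beta[is.na(beta)] <- 0                   # guard against rank deficiency
  # Predicted values for donors and recipients
  yhat_obs <- drop(xd[obs, , drop = FALSE] %*% beta)
  yhat_mis <- drop(xd[mis, , drop = FALSE] %*% beta)
  # Distances: one row per recipient, one column per donor
  d <- abs(outer(yhat_mis, yhat_obs, "-"))
  d[d < 1e-10] <- 1e-10                    # avoid division by zero
  # Convert distances into drawing probabilities (assumed form)
  w <- d^(-kappa)
  p <- w / rowSums(w)
  # Draw one donor per recipient and take its observed value
  donor <- apply(p, 1, function(pr) obs[sample.int(length(obs), 1, prob = pr)])
  y[donor]
}

# Simulated data: TRUE = observed, FALSE = missing
set.seed(123)
n  <- 100
x  <- matrix(rnorm(2 * n), ncol = 2)
y  <- drop(1 + x %*% c(2, -1) + rnorm(n))
ry <- runif(n) > 0.3
y[!ry] <- NA

imp <- pmm_distance_sketch(y, ry, x)
length(imp) == sum(!ry)                    # one imputation per missing entry
```

Because every imputation is an observed value taken from a donor, the imputed values always lie within the range of the observed data. A larger kappa concentrates the drawing probabilities on the closest donors.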
Little, R.J.A. (1988), Missing data adjustments in large surveys (with discussion), Journal of Business & Economic Statistics, 6, 287--301.
Parzen, M., Lipsitz, S.R., Fitzmaurice, G.M. (2005), A note on reducing the bias of the approximate Bayesian bootstrap imputation variance estimator. Biometrika, 92, 4, 971--974.
Rubin, D.B. (1987), Multiple imputation for nonresponse in surveys. New York: Wiley.
Siddique, J., Belin, T.R. (2008), Multiple imputation using an iterative hot-deck with distance-based donor selection. Statistics in Medicine, 27, 1, 83--102.
Van Buuren, S., Brand, J.P.L., Groothuis-Oudshoorn, C.G.M., Rubin, D.B. (2006), Fully conditional specification in multivariate imputation. Journal of Statistical Computation and Simulation, 76, 12, 1049--1064.
Van Buuren, S., Groothuis-Oudshoorn, K. (2011), mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45, 3, 1--67. http://www.jstatsoft.org/v45/i03/
## from R:: mice, slightly adapted ##
# do default multiple imputation on a numeric matrix
library(midastouch)
library(mice)
imp <- mice(nhanes, method = 'midastouch')
imp
# list the actual imputations for BMI
imp$imp$bmi
# first completed data matrix
complete(imp)
# imputation on mixed data with a different method per column
mice(nhanes2, method = c('sample','midastouch','logreg','norm'))