
mice (version 2.7)

mice.impute.polyreg: Imputation by Polytomous Regression

Description

Imputes missing data in a categorical variable using polytomous regression.

Usage

mice.impute.polyreg(y, ry, x, nnet.maxit=100, nnet.trace=FALSE, nnet.maxNWts=1500, ...)
mice.impute.polr(y, ry, x, nnet.maxit=100, nnet.trace=FALSE, nnet.maxNWts=1500, ...)

Arguments

y
Incomplete data vector of length n.
ry
Vector of missing data pattern (FALSE=missing, TRUE=observed).
x
Matrix (n x p) of complete covariates.
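nnet.maxit
Tuning parameter for nnet(): the maximum number of iterations (maxit in nnet()).
nnet.trace
Tuning parameter for nnet(): switch for tracing the optimization (trace in nnet()).
nnet.maxNWts
Tuning parameter for nnet(): the maximum allowable number of weights (MaxNWts in nnet()).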
...
Other named arguments.

Value

A vector of length nmis with imputations, where nmis is the number of missing entries in y, i.e. sum(!ry).

Details

By default, factors with more than two levels are imputed by mice.impute.polyreg (for unordered factors) and mice.impute.polr (for ordered factors). The function mice.impute.polyreg imputes categorical response variables using the Bayesian polytomous regression model. See J.P.L. Brand (1999), Chapter 4, Appendix B. The method consists of the following steps:
  1. Fit the categorical response as a multinomial model.
  2. Compute predicted categories.
  3. Add appropriate noise to predictions.
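The three steps above can be sketched with multinom() directly. This is only an illustration under stated assumptions, not the code used inside mice.impute.polyreg: the data are simulated, the variable names (x1, x2, post, imp) are made up for the example, step 3 is interpreted here as drawing one category per case from the predicted probabilities, and the data augmentation of White, Daniel and Royston (2010) is left out.

```r
library(nnet)  # provides multinom(); a recommended package shipped with R

set.seed(1)
n <- 200
x <- cbind(x1 = rnorm(n), x2 = rnorm(n))

# Simulated three-level unordered outcome that depends on x1 (assumption)
eta <- cbind(0, 1.5 * x[, "x1"], -1.5 * x[, "x1"])
p <- exp(eta) / rowSums(exp(eta))
y <- factor(apply(p, 1, function(pr) sample(c("a", "b", "c"), 1, prob = pr)))

ry <- runif(n) > 0.25   # TRUE = observed, FALSE = missing
y[!ry] <- NA

# Step 1: fit a multinomial model to the observed cases
obs <- data.frame(y = y[ry], x[ry, , drop = FALSE])
fit <- multinom(y ~ ., data = obs, maxit = 100, trace = FALSE, MaxNWts = 1500)

# Step 2: predicted category probabilities for the cases with missing y
mis <- data.frame(x[!ry, , drop = FALSE])
post <- predict(fit, newdata = mis, type = "probs")

# Step 3: add noise by drawing one category per row from those probabilities
imp <- apply(post, 1, function(pr) sample(colnames(post), 1, prob = pr))
imp <- factor(imp, levels = levels(y))
```

Here imp has one entry per missing value of y, which matches the shape of the value described above (a vector of length nmis).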
The algorithm of mice.impute.polyreg uses the function multinom() from the nnet package.

The function mice.impute.polr imputes ordered categorical response variables using the proportional odds logistic regression (polr) model. The function repeatedly applies logistic regression on the successive splits. The model is also known as the cumulative link model. The algorithm of mice.impute.polr uses the function polr() from the MASS package.

In order to avoid bias due to perfect prediction, both algorithms augment the data according to the method of White, Daniel and Royston (2010). The call to polr() might fail, usually because the data are very sparse. In that case, multinom() is tried as a fallback, and a record is written to the loggedEvents component of the mids object.
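The fallback from polr() to multinom() can be sketched with tryCatch(). The helper fit_ordered() below is hypothetical and not part of mice; it only illustrates the pattern on simulated data and does not write anything to loggedEvents.

```r
library(MASS)  # provides polr()
library(nnet)  # provides multinom()

# Hypothetical helper (not a mice function): try the proportional odds
# model first, and fall back to a multinomial model if polr() errors.
fit_ordered <- function(formula, data) {
  tryCatch(
    list(fit = polr(formula, data = data), method = "polr"),
    error = function(e) {
      list(fit = multinom(formula, data = data, trace = FALSE),
           method = "multinom")
    }
  )
}

# Simulated ordered outcome with three levels (assumption)
set.seed(2)
n <- 300
x1 <- rnorm(n)
latent <- x1 + rlogis(n)
y <- cut(latent, breaks = c(-Inf, -1, 1, Inf),
         labels = c("low", "mid", "high"), ordered_result = TRUE)
dat <- data.frame(y = y, x1 = x1)

res <- fit_ordered(y ~ x1, dat)
res$method  # which model was actually fitted

# Both model classes accept type = "probs" in predict()
post <- predict(res$fit, newdata = data.frame(x1 = c(-2, 0, 2)),
                type = "probs")
```

Returning the method name alongside the fit lets the caller see which model was used, which is the role the loggedEvents record plays in mice.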

References

Van Buuren, S., Groothuis-Oudshoorn, K. (2010) MICE: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, forthcoming. http://www.stefvanbuuren.nl/publications/MICE in R - Draft.pdf

Brand, J.P.L. (1999) Development, implementation and evaluation of multiple imputation strategies for the statistical analysis of incomplete data sets. Dissertation. Rotterdam: Erasmus University.

White, I.R., Daniel, R., Royston, P. (2010) Avoiding bias due to perfect prediction in multiple imputation of incomplete categorical variables. Computational Statistics and Data Analysis, 54, 2267-2275.

Venables, W.N., Ripley, B.D. (2002) Modern applied statistics with S-Plus (4th ed). Springer, Berlin.

See Also

mice, multinom, polr