Generates a synthetic categorical variable using unordered polytomous regression (with or without bootstrap).
syn.polyreg(y, x, xp, proper = FALSE, maxit = 1000, trace = FALSE,
MaxNWts = 10000, ...)
A list with two components:
res: a vector of length k with synthetic values of y.
fit: a summary of the model fitted to the observed data and used to produce synthetic values.
y: an original data vector of length n.
x: a matrix (n x p) of original covariates.
xp: a matrix (k x p) of synthesised covariates.
proper: for proper synthesis (proper = TRUE) a multinomial model is fitted to a bootstrapped sample of the original data.
maxit: the maximum number of iterations for nnet.
trace: switch for tracing optimization for nnet.
MaxNWts: the maximum allowable number of weights for nnet.
...: additional parameters passed to nnet.
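The effect of proper = TRUE and of the nnet-related arguments can be illustrated with a minimal sketch. It is not the package's internal code: the data frame obs, its columns and the seed are made up for illustration, and the bootstrap is assumed to be a simple resampling of observed rows with replacement before the model is fitted.

```r
library(nnet)  # provides multinom(), which forwards extra arguments to nnet()

set.seed(42)
n <- 200
# Hypothetical observed data: a three-level response and two covariates
obs <- data.frame(
  y  = factor(sample(c("a", "b", "c"), n, replace = TRUE)),
  x1 = runif(n),
  x2 = factor(sample(c("u", "v"), n, replace = TRUE))
)

proper <- TRUE
if (proper) {
  # Assumed bootstrap step: resample observed rows with replacement
  idx <- sample(n, replace = TRUE)
  obs_fit <- obs[idx, , drop = FALSE]
} else {
  obs_fit <- obs
}

# maxit, trace and MaxNWts are passed through multinom() to nnet()
fit <- multinom(y ~ x1 + x2, data = obs_fit,
                maxit = 1000, trace = FALSE, MaxNWts = 10000)

# nnet reports 1 if maxit was reached, otherwise 0
fit$convergence
```

With proper = FALSE the model in this sketch is fitted to the observed rows unchanged.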
Generates synthetic categorical variables by the polytomous regression model. The method consists of the following steps:
1. Fit categorical response as a multinomial model.
2. Compute predicted categories.
3. Add appropriate noise to predictions.
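The three steps above can be sketched as follows. This is an illustrative reading rather than the function's actual implementation: the simulated data, the variable names and the seed are hypothetical, and the "noise" step is assumed here to mean drawing each synthetic category at random from its predicted probabilities instead of taking the most likely category.

```r
library(nnet)

set.seed(1)
n <- 300  # number of observed records
k <- 100  # number of synthetic records

# Hypothetical observed covariates and a response that depends on x1
x <- data.frame(x1 = runif(n), x2 = runif(n))
score <- x$x1 + rnorm(n, sd = 0.2)
y <- cut(score, breaks = c(-Inf, 0.33, 0.66, Inf),
         labels = c("low", "mid", "high"))

# Hypothetical synthesised covariates
xp <- data.frame(x1 = runif(k), x2 = runif(k))

# Step 1: fit the categorical response as a multinomial model
fit <- multinom(y ~ x1 + x2, data = data.frame(y = y, x), trace = FALSE)

# Step 2: predicted category probabilities for the synthesised covariates
probs <- predict(fit, newdata = xp, type = "probs")  # k x 3 matrix

# Step 3 (assumed): draw one category per row from its probabilities
draw_one <- function(p) sample(levels(y), size = 1, prob = p)
y_syn <- factor(apply(probs, 1, draw_one), levels = levels(y))

table(y_syn)
```

Drawing from the probabilities, rather than using predict(fit, type = "class"), keeps the variability of the response in the synthetic values.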
The algorithm of syn.polyreg uses the function multinom from the nnet package. Any numerical variables are scaled to cover the range (0,1) before fitting. Warnings are printed if the algorithm fails to converge in maxit iterations and also if the synthesised data has only one category. The latter may occur if the variable being synthesised is sparse so that the algorithm fails to iterate.
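The scaling of numerical variables could look like the sketch below. The covariates are made up, and it is an assumption that the minimum and maximum are taken from the observed covariates and then applied to the synthesised ones; the package may compute the ranges differently.

```r
set.seed(7)
# Hypothetical observed (x) and synthesised (xp) numeric covariates
x  <- data.frame(age = round(runif(50, 18, 90)), income = rlnorm(50, 8, 1))
xp <- data.frame(age = round(runif(20, 18, 90)), income = rlnorm(20, 8, 1))

# Column ranges of the observed data: row 1 = minimum, row 2 = maximum
rng <- sapply(x, range)

# Min-max scaling: (value - min) / (max - min), column by column
scale01 <- function(d, rng) {
  as.data.frame(scale(d, center = rng[1, ], scale = rng[2, ] - rng[1, ]))
}

x_sc  <- scale01(x, rng)   # every column spans 0 to 1
xp_sc <- scale01(xp, rng)  # may fall slightly outside 0 to 1

sapply(x_sc, range)
```

Putting all numeric covariates on a common scale keeps the starting weights of the nnet optimiser comparable across variables, which tends to help convergence.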
In order to avoid bias due to perfect prediction, the data are augmented by the method of White, Daniel and Royston (2010).
NOTE that when the function is called by setting elements of method in syn() to "polyreg", the parameters maxit, trace and MaxNWts can be supplied to syn() as e.g. polyreg.maxit.
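A call of this kind might look like the sketch below. The data frame dat and its columns are invented for illustration, and the example assumes that the synthpop package is installed; it is skipped otherwise.

```r
if (requireNamespace("synthpop", quietly = TRUE)) {
  set.seed(123)
  n <- 300
  # Hypothetical data: edu is the three-level factor synthesised by polyreg
  dat <- data.frame(
    age = round(runif(n, 18, 80)),
    sex = factor(sample(c("F", "M"), n, replace = TRUE)),
    edu = factor(sample(c("low", "mid", "high"), n, replace = TRUE))
  )

  syn_out <- synthpop::syn(
    dat,
    method = c("sample", "sample", "polyreg"),  # one method per column
    polyreg.maxit = 2000,   # forwarded to syn.polyreg as maxit
    polyreg.trace = FALSE,  # forwarded to syn.polyreg as trace
    seed = 123
  )

  table(syn_out$syn$edu)
}
```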
White, I.R., Daniel, R. and Royston, P. (2010). Avoiding bias due to perfect prediction in multiple imputation of incomplete categorical variables. Computational Statistics and Data Analysis, 54, 2267-2275.