Impute missing values in a data frame or a matrix using a hot deck within imputation classes
impute_hot_deck_in_classes(
  ds,
  cols_class,
  type = "cols_seq",
  breaks = Inf,
  use_quantiles = FALSE,
  min_objs_in_class = 1,
  min_obs_comp = 0,
  min_obs_per_col = 1,
  donor_limit = Inf,
  add_imputation_classes = FALSE
)
ds: A data frame or matrix with missing values.
cols_class: Columns that are used for constructing the imputation classes.
type: The type of hot deck (for details, see impute_sRHD()).
breaks: Number of intervals / levels a column is broken into (see cut(), which is used internally for cutting numeric columns). If breaks = Inf (the default), every unique value of a column can be in a separate class (if no other restrictions apply).
use_quantiles: Should quantiles be used for cutting numeric vectors? Normally, cut() divides the range of a vector into equally spaced intervals. If use_quantiles = TRUE, the classes will be of roughly equal content.
min_objs_in_class: Minimum number of objects (rows) in an imputation class.
min_obs_comp: Minimum number of completely observed objects (rows) in an imputation class.
min_obs_per_col: Minimum number of observed values in every column of an imputation class.
donor_limit: Minimum odds between incomplete and complete values in a column, if type = "cols_seq". If type = "sim_comp", minimum odds between incomplete and complete rows. For type = "sim_part", the donor limit option is not implemented and donor_limit should be Inf.
add_imputation_classes: Should imputation classes be added as attributes to the imputed dataset?
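The difference between equally spaced intervals and quantile-based intervals can be seen with base R alone. The following is a minimal sketch on made-up data; the vector x and the way the quantile breakpoints are computed are assumptions chosen for illustration and are not necessarily how the package derives its breakpoints internally.

```r
# Ten hypothetical values with a strongly skewed distribution
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 50, 100)

# Equally spaced intervals: cut() splits the range of x into 2 intervals
# of the same width, so most values end up in the first interval
equal_width <- cut(x, breaks = 2)
table(equal_width)

# Quantile-based intervals: use the 0%, 50% and 100% quantiles as
# breakpoints, so both intervals hold the same number of values here
q <- quantile(x, probs = seq(0, 1, length.out = 3))
equal_content <- cut(x, breaks = q, include.lowest = TRUE)
table(equal_content)
```

With skewed data like this, the equally spaced split puts nine values in one interval and one in the other, while the quantile-based split yields five values in each.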
An object of the same class as ds with imputed missing values.
This function is a combination of impute_in_classes() and impute_sRHD(). It applies impute_sRHD() inside of imputation classes (adjustment cells), which are constructed via impute_in_classes(). More details can be found in the documentation of these two functions.
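The general idea of a random hot deck inside imputation classes can be sketched in a few lines of base R. This is only an illustration under simplifying assumptions (a single column to impute, classes given by one fully observed column, no donor limit and no minimum class sizes); the helper hot_deck_within_classes() is hypothetical and is not the implementation used by impute_in_classes() or impute_sRHD().

```r
set.seed(1)  # only to make the random donor draws reproducible

# Hypothetical helper: impute one vector by drawing donors at random
# from the observed values that belong to the same class
hot_deck_within_classes <- function(values, classes) {
  for (cl in unique(classes)) {
    in_class <- classes == cl
    missing <- in_class & is.na(values)
    donors <- values[in_class & !is.na(values)]
    if (any(missing) && length(donors) > 0) {
      # Sample donor positions (with replacement, i.e. no donor limit);
      # sampling indices avoids sample()'s special case for length-1 input
      picked <- sample.int(length(donors), size = sum(missing), replace = TRUE)
      values[missing] <- donors[picked]
    }
  }
  values
}

ds <- data.frame(
  X = c(rep("A", 10), rep("B", 10)),
  Y = c(rep(NA, 5), 106:120)
)
ds$Y <- hot_deck_within_classes(ds$Y, ds$X)
ds
```

In this sketch, the five missing values of Y all lie in class "A", so they can only receive values from the observed part of that class (106 to 110); values from class "B" are never used as donors.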
Andridge, R.R. and Little, R.J.A. (2010), A Review of Hot Deck Imputation for Survey Non-response. International Statistical Review, 78: 40-64. doi:10.1111/j.1751-5823.2010.00103.x
impute_in_classes(), which is used for the construction of the imputation classes.
impute_sRHD(), which is used for the imputation.
# Impute Y inside the classes defined by X; donor_limit = 1 is passed as in
# the Arguments description above
impute_hot_deck_in_classes(
  data.frame(
    X = c(rep("A", 10), rep("B", 10)),
    Y = c(rep(NA, 5), 106:120)
  ),
  "X",
  donor_limit = 1
)