Create missing not at random (MNAR) values using a censoring mechanism in a data frame or a matrix
delete_MNAR_censoring(
  ds,
  p,
  cols_mis,
  n_mis_stochastic = FALSE,
  where = "lower",
  sorting = TRUE,
  miss_cols
)
An object of the same class as ds with missing values.
A data frame or matrix in which missing values will be created.
A numeric vector of length one or of the same length as cols_mis; the probability that a value is missing.
A vector of column names or indices of columns in which missing values will be created.
Logical; should the number of missing values be stochastic? If n_mis_stochastic = TRUE, the number of missing values for a column with missing values cols_mis[i] is a random variable with expected value nrow(ds) * p[i]. If n_mis_stochastic = FALSE, the number of missing values will be deterministic. Normally, the number of missing values for a column with missing values cols_mis[i] is round(nrow(ds) * p[i]). Possible deviations from this value, if any exist, are documented in Details.
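The difference between the two settings can be sketched in base R. This is only an illustration, not the package's code: n and p are stand-ins for nrow(ds) and p[i], and the binomial draw is an assumption chosen because its expected value is n * p; the package may determine the random count differently.

```r
set.seed(1)   # only to make the random draws reproducible
n <- 20       # stands in for nrow(ds)
p <- 0.2      # stands in for p[i]

# Deterministic case: the count is fixed in advance.
n_mis_fixed <- round(n * p)

# Stochastic case (ASSUMPTION: a single binomial draw, whose
# expected value is n * p).
n_mis_random <- rbinom(1, size = n, prob = p)

# Averaged over many draws, the random count is close to n * p.
mean_n_mis <- mean(rbinom(10000, size = n, prob = p))
```

With a deterministic count, repeated calls on the same data always delete the same number of values; with a stochastic count, only the average over many calls matches nrow(ds) * p[i].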
Controls where missing values are created; one of "lower", "upper" or "both" (see details).
Logical; should sorting be used, or a quantile as a threshold?
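To make the interplay of the where and sorting arguments more concrete, here is a rough base R sketch of a censoring step on a single numeric vector. censor_sketch is a hypothetical helper, not part of the package. The even split between both ends for "both" and the use of <= and >= against the quantile are assumptions; the package's exact rules are described in the details of delete_MAR_censoring.

```r
# Hypothetical helper for illustration only; not the package implementation.
censor_sketch <- function(x, p, where = c("lower", "upper", "both"),
                          sorting = TRUE) {
  where <- match.arg(where)
  n <- length(x)
  if (sorting) {
    # Sort the values and mask a fixed number of them.
    n_mis <- round(n * p)
    ord <- order(x)
    n_low <- n_mis %/% 2   # ASSUMPTION: "both" splits the count evenly
    idx <- switch(where,
      lower = head(ord, n_mis),
      upper = tail(ord, n_mis),
      both  = c(head(ord, n_low), tail(ord, n_mis - n_low))
    )
  } else {
    # Mask everything beyond a quantile threshold.
    idx <- switch(where,
      lower = which(x <= quantile(x, p)),
      upper = which(x >= quantile(x, 1 - p)),
      both  = which(x <= quantile(x, p / 2) | x >= quantile(x, 1 - p / 2))
    )
  }
  x[idx] <- NA
  x
}

x <- 1:20
censor_sketch(x, 0.2, where = "lower")  # masks the 4 smallest values
censor_sketch(x, 0.2, where = "both")   # masks the 2 smallest and 2 largest
```

In this sketch, sorting always masks exactly round(length(x) * p) values, whereas the quantile threshold can mask more or fewer when x contains ties.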
Deprecated, use cols_mis instead.
The functions delete_MNAR_censoring and delete_MAR_censoring are sisters. The only difference between these two functions is the column that controls the generation of missing values. In delete_MAR_censoring a separate column cols_ctrl[i] controls the generation of missing values in cols_mis[i]. In contrast, in delete_MNAR_censoring the generation of missing values in cols_mis[i] is controlled by cols_mis[i] itself. All other aspects are identical for both functions. Therefore, further details can be found in delete_MAR_censoring.
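The contrast between the two functions can be illustrated without the package. The following base R sketch is a simplified stand-in, assuming the default "lower" censoring with a deterministic count; the small data frame and the variable names mnar and mar are made up for the illustration.

```r
# Toy data: X will receive missing values, Y is a fully observed column.
ds <- data.frame(X = c(5, 3, 9, 1, 7), Y = c(10, 50, 20, 40, 30))
n_mis <- round(nrow(ds) * 0.4)   # 2 values to delete

# MNAR-style censoring: X is controlled by X itself,
# so the smallest X values go missing.
mnar <- ds
mnar$X[head(order(ds$X), n_mis)] <- NA

# MAR-style censoring: X is controlled by the separate column Y,
# so the rows with the smallest Y values lose their X value.
mar <- ds
mar$X[head(order(ds$Y), n_mis)] <- NA

mnar$X   # NA in rows 2 and 4 (the X values 3 and 1)
mar$X    # NA in rows 1 and 3 (the rows with Y = 10 and Y = 20)
```

In the MNAR case the missingness depends on the unobserved values themselves, which is why the remaining X values no longer include the low end of the distribution.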
Santos, M. S., Pereira, R. C., Costa, A. F., Soares, J. P., Santos, J., & Abreu, P. H. (2019). Generating Synthetic Missing Data: A Review by Missing Mechanism. IEEE Access, 7, 11651-11667
delete_MAR_censoring
Other functions to create MNAR: delete_MNAR_1_to_x(), delete_MNAR_one_group(), delete_MNAR_rank()
ds <- data.frame(X = 1:20, Y = 101:120)
delete_MNAR_censoring(ds, 0.2, "X")