
missMethods (version 0.2.0)

apply_imputation: Apply a function for imputation

Description

Apply a function for imputation over rows, columns, or combinations of both.

Usage

apply_imputation(ds, FUN = mean, type = "columnwise", ...)

Arguments

ds

A data frame or matrix with missing values.

FUN

The function to be applied for imputation.

type

A string specifying the values used for imputation (see details).

...

Further arguments passed to FUN.

Value

An object of the same class as ds with imputed missing values.

A Note for tibble users

If you use tibbles and an error like ‘Lossy cast from `value` double to integer’ occurs, first convert all integer columns with missing values to double. Alternatively, convert the tibble to a data frame with as.data.frame(). A data frame automatically converts integer columns to double columns if needed.
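The first workaround can be sketched in base R as follows. The helper name integer_na_to_double is hypothetical and not part of missMethods. The example uses a plain data frame so that it runs without the tibble package; the same column assignment is assumed, not verified here, to work on a tibble.

```r
# Hypothetical helper (not part of missMethods): cast integer columns
# that contain missing values to double, leave all other columns alone.
integer_na_to_double <- function(ds) {
  needs_cast <- vapply(
    ds,
    function(col) is.integer(col) && anyNA(col),
    logical(1)
  )
  ds[needs_cast] <- lapply(ds[needs_cast], as.double)
  ds
}

ds <- data.frame(X = c(1L, NA, 3L), Y = c(1.5, 2.5, NA), Z = 1:3)
ds_cast <- integer_na_to_double(ds)

# Column types after the cast: only X changed from integer to double
col_types <- vapply(ds_cast, typeof, character(1))
print(col_types)
```

After the cast, a non-integer imputation value such as a column mean can be stored in X without a type conflict.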

Details

The functionality of apply_imputation is inspired by the apply function. It applies a function FUN to impute the missing values in ds. FUN must be a function that takes a vector as input and returns exactly one value.

The argument type is comparable to apply's MARGIN argument. It specifies which values are used to calculate the imputation values. For example, type = "columnwise" and FUN = mean will impute the mean of the observed values in a column for all missing values in this column. In contrast, type = "rowwise" and FUN = mean will impute the mean of the observed values in a row for all missing values in this row.
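The analogy to apply's MARGIN argument can be illustrated with base R alone. The following is a minimal sketch of the idea on a toy matrix, not the package's implementation; the object names are chosen for illustration only.

```r
# Toy matrix with two missing cells: (row 2, X) and (row 3, Y)
m <- matrix(c(1, NA, 3, 5,
              10, 20, NA, 30),
            ncol = 2, dimnames = list(NULL, c("X", "Y")))

# FUN = mean, applied to the observed values only
col_values <- apply(m, 2, function(x) mean(x[!is.na(x)]))  # like "columnwise"
row_values <- apply(m, 1, function(x) mean(x[!is.na(x)]))  # like "rowwise"

# Fill every missing cell with the value of its column
imp_col <- m
for (j in seq_len(ncol(m))) {
  imp_col[is.na(m[, j]), j] <- col_values[j]
}

# Fill every missing cell with the value of its row
imp_row <- m
for (i in seq_len(nrow(m))) {
  imp_row[i, is.na(m[i, ])] <- row_values[i]
}

print(imp_col)
print(imp_row)
```

The same missing cell receives different values depending on the margin: the cell in row 2 of X gets the mean of column X in imp_col and the mean of row 2 in imp_row.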

List of all implemented types:

  • "columnwise" (the default): imputes column by column; all observed values of a column are given to FUN and the returned value is used as the imputation value for all missing values of the column.

  • "rowwise": imputes row by row; all observed values of a row are given to FUN and the returned value is used as the imputation value for all missing values of the row.

  • "total": All observed values of ds are given to FUN and the returned value is used as the imputation value for all missing values of ds.

  • "Winer": The mean value from "columnwise" and "rowwise" is used as the imputation value.

  • "Two-way": The sum of the values from "columnwise" and "rowwise" minus "total" is used as the imputation value.

If no value can be given to FUN (for example, if no value in a column is observed and type = "columnwise"), then a warning will be issued and no value will be imputed in the corresponding column or row.
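The arithmetic behind the five types can be sketched for a single missing cell. The helper imputation_value_sketch and its arguments i and j are hypothetical and only follow the descriptions in the list above; the sketch omits the warning for margins without observed values and is not the package's code.

```r
# Keep only the observed (non-missing) values of a vector or matrix
obs <- function(x) x[!is.na(x)]

# Hypothetical helper: imputation value for the missing cell m[i, j]
imputation_value_sketch <- function(m, i, j, FUN = mean,
                                    type = "columnwise") {
  col_val <- FUN(obs(m[, j]))  # observed values of column j
  row_val <- FUN(obs(m[i, ]))  # observed values of row i
  tot_val <- FUN(obs(m))       # all observed values
  switch(type,
    columnwise = col_val,
    rowwise = row_val,
    total = tot_val,
    Winer = (col_val + row_val) / 2,
    "Two-way" = col_val + row_val - tot_val,
    stop("unknown type: ", type)
  )
}

# Columns: X = 1, NA, 3 and Y = 4, 6, 8; the missing cell is m[2, 1]
m <- matrix(c(1, NA, 3, 4, 6, 8), nrow = 3,
            dimnames = list(NULL, c("X", "Y")))

types <- c("columnwise", "rowwise", "total", "Winer", "Two-way")
values <- vapply(types, function(tp) imputation_value_sketch(m, 2, 1, type = tp),
                 numeric(1))
print(values)
```

For this cell the column mean is 2, the row mean is 6, and the total mean is 4.4, so "Winer" gives (2 + 6) / 2 = 4 and "Two-way" gives 2 + 6 - 4.4 = 3.6.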

References

Beland, S., Pichette, F., & Jolani, S. (2016). Impact on Cronbach's α of simple treatment methods for missing data. The Quantitative Methods for Psychology, 12(1), 57-73.

See Also

A convenient interface exists for common cases like mean imputation: impute_mean, impute_median, impute_mode. All these functions call apply_imputation.

Examples

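library(missMethods)
set.seed(123) # delete_MCAR() deletes values at random; a seed makes the run reproducible
ds <- data.frame(X = 1:20, Y = 101:120)
# delete about 20 % of the values completely at random
ds_mis <- delete_MCAR(ds, 0.2)
# impute the mean of all observed values of ds
ds_imp_app <- apply_imputation(ds_mis, FUN = mean, type = "total")
# the same result can be achieved via impute_mean():
ds_imp_mean <- impute_mean(ds_mis, type = "total")
all.equal(ds_imp_app, ds_imp_mean)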
