missMethods (version 0.4.0)

apply_imputation: Apply a function for imputation

Description

Apply a function for imputation over rows, columns, or combinations of both.

Usage

apply_imputation(
  ds,
  FUN = mean,
  type = "columnwise",
  convert_tibble = TRUE,
  ...
)

Value

An object of the same class as ds with imputed missing values.

Arguments

ds

A data frame or matrix with missing values.

FUN

The function to be applied for imputation.

type

A string specifying the values used for imputation; one of: "columnwise", "rowwise", "total", "Two-Way" or "Winer" (see details).

convert_tibble

If ds is a tibble, should it be converted before imputation? (See the section A note for tibble users.)

...

Further arguments passed to FUN.

A note for tibble users

If ds is a tibble and convert_tibble is TRUE, the tibble is first converted to a data frame, then imputed, and finally converted back to a tibble. If convert_tibble is FALSE, no conversion is done. However, depending on the tibble and the version of the tibble package in use, imputation may then not be possible and errors may be thrown.

Details

The functionality of apply_imputation is inspired by the apply function. It applies a function FUN to impute the missing values in ds. FUN must be a function that takes a vector as input and returns exactly one value. The argument type is comparable to apply's MARGIN argument: it specifies which values are used to calculate the imputation values. For example, type = "columnwise" and FUN = mean will impute the mean of the observed values in a column for all missing values in this column. In contrast, type = "rowwise" and FUN = mean will impute the mean of the observed values in a row for all missing values in this row.

List of all implemented types:

  • "columnwise" (the default): imputes column by column; all observed values of a column are given to FUN and the returned value is used as the imputation value for all missing values of the column.

  • "rowwise": imputes row by row; all observed values of a row are given to FUN and the returned value is used as the imputation value for all missing values of the row.

  • "total": All observed values of ds are given to FUN and the returned value is used as the imputation value for all missing values of ds.

  • "Winer": The mean value from "columnwise" and "rowwise" is used as the imputation value.

  • "Two-Way": The sum of the values from "columnwise" and "rowwise" minus "total" is used as the imputation value.

If no value can be given to FUN (for example, if no value in a column is observed and type = "columnwise"), then a warning will be issued and no value will be imputed in the corresponding column or row.
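
The arithmetic behind the five types can be sketched in base R for a single missing cell. This is only an illustration of the definitions listed above, not the package's implementation; the toy matrix m and all variable names are assumptions made for the example.

```r
# Toy matrix with a single missing cell in row 2, column 1
m <- matrix(c(1, NA, 3,
              4, 5, 6), ncol = 2)

obs <- !is.na(m)  # TRUE where a value is observed

col_val   <- mean(m[obs[, 1], 1])  # observed values of column 1: 1, 3
row_val   <- mean(m[2, obs[2, ]])  # observed values of row 2: 5
total_val <- mean(m[obs])          # all observed values: 1, 3, 4, 5, 6

winer_val   <- mean(c(col_val, row_val))      # "Winer"
two_way_val <- col_val + row_val - total_val  # "Two-Way"

c(columnwise = col_val, rowwise = row_val, total = total_val,
  Winer = winer_val, `Two-Way` = two_way_val)
```

With FUN = mean, the candidate imputation values for the missing cell are 2 ("columnwise"), 5 ("rowwise"), 3.8 ("total"), 3.5 ("Winer") and 3.2 ("Two-Way").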

References

Beland, S., Pichette, F., & Jolani, S. (2016). Impact on Cronbach's α of simple treatment methods for missing data. The Quantitative Methods for Psychology, 12(1), 57-73.

See Also

A convenient interface exists for common cases like mean imputation: impute_mean, impute_median, impute_mode. All these functions call apply_imputation.

Examples

library(missMethods)

ds <- data.frame(X = 1:20, Y = 101:120)
# delete 20% of the values completely at random (MCAR)
ds_mis <- delete_MCAR(ds, 0.2)
# impute the mean of all observed values of ds
ds_imp_app <- apply_imputation(ds_mis, FUN = mean, type = "total")
# the same result can be achieved via impute_mean():
ds_imp_mean <- impute_mean(ds_mis, type = "total")
all.equal(ds_imp_app, ds_imp_mean)
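
As a further sketch, the following shows how an extra argument might be forwarded to FUN through ... and how a user-defined function could be supplied. It assumes missMethods is installed. The missing values are set by hand so the example is deterministic, and mid_range is a hypothetical helper that is not part of the package.

```r
library(missMethods)

ds <- data.frame(X = 1:20, Y = 101:120)
ds_mis <- ds
ds_mis[c(2, 5), "X"] <- NA  # two missing values in X
ds_mis[7, "Y"] <- NA        # one missing value in Y

# trim = 0.1 is passed on to mean() via ...
ds_imp_trim <- apply_imputation(ds_mis, FUN = mean, type = "columnwise",
                                trim = 0.1)

# hypothetical user-defined FUN: takes a vector, returns exactly one value
mid_range <- function(x) (min(x) + max(x)) / 2
ds_imp_mid <- apply_imputation(ds_mis, FUN = mid_range, type = "rowwise")

anyNA(ds_imp_trim)
anyNA(ds_imp_mid)
```

Any function can be used as FUN as long as it accepts the vector of observed values and returns a single value.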
