
missMethods (version 0.4.0)

impute_mean: Mean imputation

Description

Impute an observed mean for the missing values.

Usage

impute_mean(ds, type = "columnwise", convert_tibble = TRUE)

Value

An object of the same class as ds with imputed missing values.

Arguments

ds

A data frame or matrix with missing values.

type

A string specifying the values used for imputation; one of: "columnwise", "rowwise", "total", "Two-way" or "Winer" (see details).

convert_tibble

If ds is a tibble, should it be converted? (See section A note for tibble users.)

A note for tibble users

If you use tibbles and convert_tibble is TRUE, the tibble is first converted to a data frame, then imputed, and finally converted back to a tibble. If convert_tibble is FALSE, no conversion is done. However, depending on the tibble and the version of the tibble package you use, imputation may then not be possible and errors may be thrown.

Details

For every missing value the mean of some observed values is imputed. The observed values to be used are specified via type. For example, type = "columnwise" (the default) replaces every missing value in a column with the mean of the observed values in that column. This is what is usually meant by "imputing the mean" or "mean imputation".
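The columnwise rule can be illustrated with a few lines of base R. This is only a minimal sketch on made-up data, not the package's implementation; the helper name impute_colmean_sketch is hypothetical and assumes a data frame with numeric columns.

```r
# Made-up data frame with missing values (illustration only)
ds_mis <- data.frame(X = c(1, NA, 3, 4), Y = c(10, 20, NA, NA))

# Hypothetical helper: fill each NA with the observed mean of its column
impute_colmean_sketch <- function(ds) {
  for (j in seq_len(ncol(ds))) {
    missing_j <- is.na(ds[[j]])
    # mean() with na.rm = TRUE uses only the observed values of column j
    ds[[j]][missing_j] <- mean(ds[[j]], na.rm = TRUE)
  }
  ds
}

ds_imp <- impute_colmean_sketch(ds_mis)
ds_imp
```

Here the missing value in X is replaced by (1 + 3 + 4) / 3 and both missing values in Y by (10 + 20) / 2 = 15. With the package loaded, impute_mean(ds_mis) should give the same columnwise result.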

Other options for type are: "rowwise", "total", "Winer" and "Two-way". The option "rowwise" imputes all missing values in a row with the mean of the observed values in the same row. "total" imputes every missing value with the mean of all observed values in ds. "Winer" imputes the average of the rowwise and columnwise means. Beland et al. (2016) called this method "Winer" and attributed it to Winer (1971). "Two-way" imputes the sum of the rowwise and columnwise means minus the total mean. This method was suggested by D. B. Rubin to Bernaards & Sijtsma (2000).
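To make the five options concrete, the following base R sketch computes the value each type would impute for a single missing cell. The matrix is made up, and the sketch assumes that all means are taken over the originally observed values only; it does not call the package itself.

```r
# Made-up 3 x 3 matrix with one missing cell in row 2, column 3
m <- matrix(c(1, 2, 3,
              4, 5, NA,
              7, 8, 9), nrow = 3, byrow = TRUE)

row_mean   <- mean(m[2, ], na.rm = TRUE)  # (4 + 5) / 2 = 4.5
col_mean   <- mean(m[, 3], na.rm = TRUE)  # (3 + 9) / 2 = 6
total_mean <- mean(m, na.rm = TRUE)       # 39 / 8 = 4.875

# Value imputed for the missing cell under each type
candidates <- c(
  columnwise = col_mean,
  rowwise    = row_mean,
  total      = total_mean,
  Winer      = (row_mean + col_mean) / 2,
  "Two-way"  = row_mean + col_mean - total_mean
)
candidates
```

For this cell, "Winer" gives 5.25 and "Two-way" gives 5.625, both lying between the rowwise mean (4.5) and the columnwise mean (6).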

References

Beland, S., Pichette, F., & Jolani, S. (2016). Impact on Cronbach's \(\alpha\) of simple treatment methods for missing data. The Quantitative Methods for Psychology, 12(1), 57-73.

Bernaards, C. A., & Sijtsma, K. (2000). Influence of imputation and EM methods on factor analysis when item nonresponse in questionnaire data is nonignorable. Multivariate Behavioral Research, 35(3), 321-364.

Winer, B. J. (1971). Statistical principles in experimental design (2nd ed.). New York: McGraw-Hill.

See Also

apply_imputation, the workhorse for this function.

Other location parameter imputation functions: impute_median(), impute_mode()

Examples

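library(missMethods)

# complete example data; delete 20% of the values completely at random
ds <- data.frame(X = 1:20, Y = 101:120)
ds_mis <- delete_MCAR(ds, 0.2)
# impute the columnwise mean (the default type)
ds_imp <- impute_mean(ds_mis)
# completely observed columns can be of any type:
ds_mis_char <- cbind(ds_mis, letters[1:20])
ds_imp_char <- impute_mean(ds_mis_char)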
