impute_LS_adaptive: LSimpute_adaptive

Description

Perform LSimpute_adaptive as described by Bo et al. (2004).

Usage

impute_LS_adaptive(
  ds,
  k = 10,
  eps = 1e-06,
  min_common_obs = 5,
  r_max_min = 100,
  p_mis_sim = 0.05,
  warn_r_max = FALSE,
  verbose_gene = FALSE,
  verbose_array = FALSE,
  verbose_gene_p = FALSE,
  verbose_array_p = FALSE
)

Value

An object of the same class as ds with imputed missing values.

Arguments

ds: A data frame or matrix with missing values.
k: Directly passed to impute_LS_gene().
eps: Directly passed to impute_LS_gene().
min_common_obs: Directly passed to impute_LS_gene().
r_max_min: Minimum number of nearest genes used for imputation. The default value (100) corresponds to the choice of Bo et al. (2004).
p_mis_sim: Proportion of observed values that are set to NA to estimate the mixing coefficient p. The default value (0.05, i.e. 5%) corresponds to the choice of Bo et al. (2004).
warn_r_max: Should a warning be given if r_max_min is set too high?
verbose_gene: Should impute_LS_gene() be verbose?
verbose_array: Should impute_LS_array() be verbose?
verbose_gene_p: Should impute_LS_gene() be verbose while estimating p?
verbose_array_p: Should impute_LS_array() be verbose while estimating p?

Details

This function performs LSimpute_adaptive as described by Bo et al. (2004). The function assumes that the genes are the rows of ds.
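If a dataset stores the genes in columns instead, one option is to transpose it before imputation and transpose the result back. The following is a minimal sketch under assumptions: the data are made up, the object names are hypothetical, and the call is only executed if impute_LS_adaptive() is available in the session. Note that t() returns a matrix, so a data frame would lose its class on the way.

```r
set.seed(42)

# Made-up data: 10 arrays in the rows, 100 genes in the columns
arrays_in_rows <- matrix(rnorm(1000), nrow = 10, ncol = 100)
arrays_in_rows[sample(length(arrays_in_rows), 50)] <- NA

# Transpose so that the genes are the rows, as the function assumes
genes_in_rows <- t(arrays_in_rows)

# Only run the imputation if the function is available in this session
if (exists("impute_LS_adaptive", mode = "function")) {
  imputed <- impute_LS_adaptive(genes_in_rows)
  # Transpose back to the original orientation
  imputed_arrays_in_rows <- t(imputed)
}
```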

LSimpute_adaptive combines imputation values from impute_LS_gene() and impute_LS_array() using a local (adaptive) approach for the mixing coefficient p.
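The idea of a mixing coefficient can be illustrated with a small base R sketch. It assumes that p weights the gene-based estimate and 1 - p the array-based estimate, and it chooses a single global p by least squares on values whose truth is known. Both helper functions and all numbers are hypothetical; this is not the package's internal code, and the adaptive method uses a local p rather than one global value.

```r
# Hypothetical helpers for illustration; they are not package functions.

# Blend gene-based and array-based estimates with weight p in [0, 1].
mix_estimates <- function(est_gene, est_array, p) {
  stopifnot(all(p >= 0 & p <= 1))
  p * est_gene + (1 - p) * est_array
}

# Least-squares choice of one global p from values whose truth is known
# (for example, observed values that were temporarily set to NA).
estimate_global_p <- function(truth, est_gene, est_array) {
  d <- est_gene - est_array
  if (all(d == 0)) return(0.5)  # both estimates agree; any p gives the same mix
  p <- sum((truth - est_array) * d) / sum(d^2)
  min(max(p, 0), 1)             # clip to the valid range
}

truth     <- c(1.0, 2.0, 3.0)  # made-up held-out values
est_gene  <- c(1.2, 2.2, 2.6)  # made-up gene-based estimates
est_array <- c(0.4, 1.4, 3.8)  # made-up array-based estimates

p_hat <- estimate_global_p(truth, est_gene, est_array)
mixed <- mix_estimates(est_gene, est_array, p_hat)
```

Passing a vector for p to mix_estimates() gives each value its own weight, which is the intuition behind a local (adaptive) coefficient.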

If the dataset is too small or has too many missing values, several fallbacks apply. First, if ncol(ds) <= min_common_obs (normally, this should not be the case!), values are imputed through impute_LS_array(). Second, r_max_min is automatically adjusted if it is too high. In this case, a warning is given if warn_r_max = TRUE. Third, if a row does not have enough observed values (fewer than min_common_obs), the mixing coefficient cannot be calculated and the missing values of that row are imputed with the values from impute_LS_array().
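To see in advance whether the first or third condition described above applies to a dataset, the checks can be reproduced in base R. This is a hypothetical pre-check on made-up data, not code from the package; it only mirrors the conditions stated in the text.

```r
set.seed(1)

# Made-up data: 6 genes (rows), 10 arrays (columns)
ds <- matrix(rnorm(60), nrow = 6, ncol = 10)
ds[1, 1:7] <- NA  # row 1 keeps only 3 observed values
ds[2, 1:2] <- NA  # row 2 keeps 8 observed values

min_common_obs <- 5  # default value from the usage section

# First condition: too few columns overall
too_few_columns <- ncol(ds) <= min_common_obs

# Third condition: rows with fewer than min_common_obs observed values
n_observed <- rowSums(!is.na(ds))
rows_array_only <- which(n_observed < min_common_obs)
```

Rows listed in rows_array_only would, according to the description above, receive their imputed values from impute_LS_array() alone.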

The amount of feedback given from impute_LS_gene() and impute_LS_array() is controlled via verbose_gene, verbose_array, verbose_gene_p and verbose_array_p. The first two control the feedback during the estimation of the values that are mixed with p; the last two control the feedback while p is estimated. Internally, the imputed dataset from impute_LS_gene() is passed on to impute_LS_array(). Therefore, all messages from impute_LS_gene() truly come from impute_LS_gene() and not from impute_LS_array(), which never calls impute_LS_gene() in this case. Furthermore, all messages from impute_expected_values() belong to impute_LS_array().

References

Bo, T. H., Dysvik, B., & Jonassen, I. (2004). LSimpute: accurate estimation of missing values in microarray data with least squares methods. Nucleic Acids Research, 32(3), e34.

Examples

set.seed(123)
ds_mis <- delete_MCAR(mvtnorm::rmvnorm(100, rep(0, 10)), 0.1)
ds_imp <- impute_LS_adaptive(ds_mis)
