mlr (version 2.17.0)

imputations: Built-in imputation methods.

Description

The built-ins are:

  • imputeConstant(const) for imputation using a constant value,

  • imputeMedian() for imputation using the median,

  • imputeMean() for imputation using the mean,

  • imputeMode() for imputation using the mode,

  • imputeMin(multiplier) for imputing constant values shifted below the minimum using min(x) - multiplier * diff(range(x)),

  • imputeMax(multiplier) for imputing constant values shifted above the maximum using max(x) + multiplier * diff(range(x)),

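  • imputeUniform(min, max) for imputation using uniformly distributed random values. Lower and upper bounds are estimated from the data if not provided.

  • imputeNormal(mu, sd) for imputation using normally distributed random values. Mean and standard deviation are estimated from the data if not provided.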

  • imputeHist(breaks, use.mids) for imputation using random values with probabilities calculated using table or hist.

  • imputeLearner(learner, features = NULL) for imputations using the response of a classification or regression learner.
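The shift formulas given for imputeMin and imputeMax can be checked by hand. The following base-R sketch is not mlr's internal code: the vector x is made up, and it assumes that missing values are dropped (na.rm = TRUE) before the range is computed.

```r
# Hypothetical numeric feature with one missing value.
x <- c(2, 4, NA, 10)
multiplier <- 1

rng  <- range(x, na.rm = TRUE)   # c(2, 10)
span <- diff(rng)                # 10 - 2 = 8

# Constant shifted below the minimum: min(x) - multiplier * diff(range(x))
below_min <- rng[1] - multiplier * span   # 2 - 1 * 8 = -6

# Constant shifted above the maximum: max(x) + multiplier * diff(range(x))
above_max <- rng[2] + multiplier * span   # 10 + 1 * 8 = 18

# Fill the missing entries with the respective constant.
x_min <- replace(x, is.na(x), below_min)
x_max <- replace(x, is.na(x), above_max)
```

Because the imputed constant lies outside the observed range, the filled-in entries stay distinguishable from genuinely observed values.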

Usage

imputeConstant(const)

imputeMedian()

imputeMean()

imputeMode()

imputeMin(multiplier = 1)

imputeMax(multiplier = 1)

imputeUniform(min = NA_real_, max = NA_real_)

imputeNormal(mu = NA_real_, sd = NA_real_)

imputeHist(breaks, use.mids = TRUE)

imputeLearner(learner, features = NULL)
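These constructors are not applied to a vector directly. They are passed to impute() (listed under See Also) through its classes or cols arguments. A minimal sketch, assuming mlr is installed; the data frame df and its column names are hypothetical:

```r
library(mlr)

# Hypothetical toy data: gaps in one numeric and one factor feature.
df <- data.frame(
  x = c(1, 2, NA, 4, 100),
  f = factor(c("a", "b", "a", NA, "a")),
  y = c(0.5, 1.2, 0.7, 1.9, 2.4)
)

# Choose an imputation method per column class; the target "y" is left untouched.
imp <- impute(df, target = "y",
  classes = list(numeric = imputeMedian(), factor = imputeMode()))

imp$data   # data.frame with missing values filled in
imp$desc   # imputation description, reusable on new data via reimpute()
```

Here the missing x is filled with the median of 1, 2, 4 and 100, and the missing f with the most frequent level.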

Arguments

const

(any) Constant value used for imputation.

multiplier

(numeric(1)) Factor by which the range of the data, diff(range(x)), is multiplied before it is subtracted from the stored minimum (imputeMin) or added to the stored maximum (imputeMax).

min

(numeric(1)) Lower bound for uniform distribution. If NA (default), it will be estimated from the data.

max

(numeric(1)) Upper bound for uniform distribution. If NA (default), it will be estimated from the data.

mu

(numeric(1)) Mean of the normal distribution. If NA (default), it will be estimated from the data.

sd

(numeric(1)) Standard deviation of the normal distribution. If NA (default), it will be estimated from the data.

breaks

(numeric(1)) Number of breaks to use in graphics::hist. If missing, defaults to auto-detection via “Sturges”.

use.mids

(logical(1)) Only relevant if x is numeric and a histogram is used. If TRUE (default), impute with bin midpoints; if FALSE, draw uniformly distributed samples within the range of each bin.

learner

(Learner | character(1)) Supervised learner. Its predictions will be used for imputations. If you pass a string, the learner will be created via makeLearner. Note that the target column is not available for this operation.

features

(character) Features to use in learner for prediction. Default is NULL which uses all available features except the target column of the original task.
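The arguments above can be combined per column through the cols argument of impute(). The sketch below assumes mlr and rpart are installed and uses the airquality data set that ships with R. The artificial gaps in Wind, the choice of "regr.rpart" and the selected feature names are illustrative assumptions, not recommendations.

```r
library(mlr)

set.seed(1)                        # imputeHist draws random values
aq <- airquality
aq$Wind[c(5, 50, 120)] <- NA       # introduce a few artificial gaps in Wind

imp <- impute(aq, target = "Ozone", cols = list(
  # breaks omitted: the number of bins is chosen automatically;
  # use.mids = FALSE draws uniformly within the selected bin
  Solar.R = imputeHist(use.mids = FALSE),
  # predict Wind from two fully observed features using a regression tree
  Wind = imputeLearner("regr.rpart", features = c("Temp", "Month"))
))

colSums(is.na(imp$data[c("Solar.R", "Wind")]))
```

Restricting features to fully observed columns avoids relying on the learner's own handling of missing values in its inputs.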

See Also

Other impute: impute(), makeImputeMethod(), makeImputeWrapper(), reimpute()