
dprep (version 3.0.2)

ce.impute: Imputation in supervised classification

Description

This function imputes missing values in datasets for supervised classification, using the mean, median, or knn imputation method. When an attribute is nominal, the mode is used.
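The idea behind mean, median, and mode imputation can be illustrated with a minimal base R sketch. This is not the dprep implementation: `mode_value` and `impute_columns` are hypothetical helpers written for this illustration, the toy data frame is made up, and the sketch ignores the class labels that a supervised setting would provide.

```r
# Hypothetical helper: most frequent non-missing value, keeping the input type.
mode_value <- function(x) {
  x <- x[!is.na(x)]
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Hypothetical helper (not part of dprep): fill NAs one column at a time.
# Nominal columns get the mode; all other columns get the mean or median.
impute_columns <- function(df, method = c("mean", "median"),
                           nominal = character(0)) {
  method <- match.arg(method)
  center <- if (method == "mean") mean else median
  for (col in names(df)) {
    missing <- is.na(df[[col]])
    if (!any(missing)) next
    if (col %in% nominal) {
      df[[col]][missing] <- mode_value(df[[col]])
    } else {
      df[[col]][missing] <- center(df[[col]], na.rm = TRUE)
    }
  }
  df
}

# Made-up toy data with one missing numeric and one missing nominal value.
toy <- data.frame(
  age = c(30, NA, 50, 40),
  sex = c("f", "m", NA, "m"),
  stringsAsFactors = FALSE
)
filled <- impute_columns(toy, method = "median", nominal = "sex")
```

Here the missing `age` is replaced by the median of the observed ages and the missing `sex` by the most frequent observed level.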

Usage

ce.impute(data, method = c("mean", "median", "knn"), atr, nomatr = rep(0, 0), k1 = 10)

Arguments

data
the name of the dataset
method
the name of the method to be used: "mean", "median", or "knn"
atr
a vector identifying the attributes where imputations will be performed
nomatr
a vector identifying the nominal attributes
k1
the number of neighbors to be used for the knn imputation (default 10)
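To show what a neighbor count such as k1 controls, here is a rough base R sketch of knn imputation on a numeric matrix. It is an illustration under stated assumptions, not the code behind ce.impute: `knn_impute` is a hypothetical helper, it uses unscaled Euclidean distance over the observed columns, it takes only complete rows as donors, and it assumes at least one complete row exists.

```r
# Hypothetical helper (not part of dprep): knn imputation for a numeric matrix.
knn_impute <- function(x, k = 10) {
  donors <- x[complete.cases(x), , drop = FALSE]  # rows with no missing values
  k <- min(k, nrow(donors))                       # cannot use more donors than exist
  for (i in which(!complete.cases(x))) {
    observed <- !is.na(x[i, ])
    # Euclidean distance from row i to each donor, over row i's observed columns
    diffs <- sweep(donors[, observed, drop = FALSE], 2, x[i, observed])
    dists <- sqrt(rowSums(diffs^2))
    nearest <- order(dists)[seq_len(k)]
    # Fill each missing entry with the mean of that column among the k neighbors
    x[i, !observed] <- colMeans(donors[nearest, !observed, drop = FALSE])
  }
  x
}

# Made-up toy matrix: the last row is missing its third value.
toy <- rbind(
  c(1.0, 2.0, 3.0),
  c(1.1, 2.1, 3.2),
  c(9.0, 9.0, 9.0),
  c(1.0, 2.0, NA)
)
filled <- knn_impute(toy, k = 2)
```

With k = 2 the incomplete row borrows from the two rows closest to it in the first two columns, so the distant third row does not influence the imputed value. A larger k averages over more donors and smooths the result.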

Value

Returns a matrix without missing values.

References

Acuna, E. and Rodriguez, C. (2004). The treatment of missing values and its effect in the classifier accuracy. In D. Banks, L. House, F.R. McMorris, P. Arabie, W. Gaul (Eds). Classification, Clustering and Data Mining Applications. Springer-Verlag Berlin-Heidelberg, 639-648.

See Also

clean

Examples

library(dprep)
data(hepatitis)
#--------Median Imputation-----------
#ce.impute(hepatitis, "median", 1:19)
#--------knn Imputation--------------
hepa.imputed <- ce.impute(hepatitis, "knn", k1 = 10)
