maanova (version 1.42.0)

fill.missing: Fill in missing data

Description

Imputes missing data in an object of class madata.

Usage

fill.missing(data, method="knn", k=20, dist.method="euclidean")

Arguments

data
An object of class madata, typically the result of read.madata.
method
The imputation method. Currently only "knn" (K nearest neighbours) is implemented.
k
Number of neighbours used in imputation. Default is 20.
dist.method
The distance measure to be used. See dist for details.

Value

An object of class madata with missing data filled in.

Details

This function takes an object of class madata and fills in the missing data. Currently only the KNN (K nearest neighbours) algorithm is implemented. Memory usage is quadratic in the number of genes.
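
To make the idea concrete, here is a minimal base R sketch of KNN imputation on a plain numeric matrix (rows as genes, columns as arrays). It is not the maanova implementation: the function name knn_impute_sketch is hypothetical, and it assumes an unweighted mean over the k nearest genes, which is a simplification of the method in the reference below. It does show where the quadratic memory cost comes from, namely the full gene-by-gene distance matrix.

```r
# Simplified KNN imputation sketch; NOT the maanova implementation.
# knn_impute_sketch is a hypothetical name used only for illustration.
# x: numeric matrix, rows = genes, columns = arrays.
knn_impute_sketch <- function(x, k = 20) {
  stopifnot(is.matrix(x), is.numeric(x))
  # dist() drops NA coordinates pairwise and rescales the sum, so genes with
  # missing values still get distances. The result is a genes-by-genes matrix,
  # which is why memory grows quadratically with the number of genes.
  d <- as.matrix(dist(x, method = "euclidean"))
  diag(d) <- Inf  # a gene is never its own neighbour
  filled <- x
  for (i in which(rowSums(is.na(x)) > 0)) {
    for (j in which(is.na(x[i, ]))) {
      # Candidate neighbours must have an observed value in column j
      # and a usable (finite, non-NA) distance to gene i.
      candidates <- which(!is.na(x[, j]) & is.finite(d[i, ]))
      if (length(candidates) == 0) next  # nothing to borrow from; leave NA
      n_use <- min(k, length(candidates))
      nearest <- candidates[order(d[i, candidates])][seq_len(n_use)]
      # Assumption: plain average of the neighbours' values in column j.
      filled[i, j] <- mean(x[nearest, j])
    }
  }
  filled
}

# Usage on simulated data: blank out one value in each of the first 10 genes.
set.seed(1)
m <- matrix(rnorm(200), nrow = 50, ncol = 4)
m_missing <- m
m_missing[cbind(1:10, rep(1:4, length.out = 10))] <- NA
m_filled <- knn_impute_sketch(m_missing, k = 5)
```

Values that were observed are left untouched; only the NA cells are replaced. A gene with no usable neighbours for a given column keeps its NA in this sketch.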

References

O. Troyanskaya, M. Cantor, G. Sherlock, P. Brown, T. Hastie, R. Tibshirani, D. Botstein, & R. B. Altman. Missing value estimation methods for DNA microarrays. Bioinformatics 17(6):520-525, 2001.

Examples

library(maanova)
data(abf1)
# Randomly set 5% of the data to missing
rawdata <- abf1
ndata <- length(abf1$data)
pct.missing <- 0.05 # 5% missing
idx.missing <- sample(ndata, floor(ndata*pct.missing))
rawdata$data[idx.missing] <- NA
# Impute the missing values with the default KNN settings (k = 20)
rawdata <- fill.missing(rawdata)
# Plot imputed data versus original data; points near the line y = x were imputed well
plot(rawdata$data[idx.missing], abf1$data[idx.missing])
abline(0,1)