unbalanced (version 2.0)

ubNCL: Neighborhood Cleaning Rule

Description

Neighborhood Cleaning Rule (NCL) modifies the Edited Nearest Neighbor method by increasing the role of data cleaning. First, NCL removes negative examples that are misclassified by their 3 nearest neighbors. Second, it finds the neighbors of each positive example and removes those that belong to the majority class.
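
The two steps described above can be sketched in base R. This is only an illustration on toy data, not the package's implementation: the function name `ncl_sketch`, the use of Euclidean distance via `dist`, and the simple majority vote are assumptions made for the example.

```r
# Illustrative sketch of the two NCL steps (hypothetical helper, not ubNCL itself).
# Assumes Y is coded "0" for the majority class and "1" for the minority class.
ncl_sketch <- function(X, Y, k = 3) {
  X <- as.matrix(X)
  Y <- as.character(Y)
  D <- as.matrix(dist(X))   # pairwise Euclidean distances (an assumption)
  diag(D) <- Inf            # a point is never its own neighbour

  # Row indices of the k nearest neighbours of example i
  knn_idx <- function(i) order(D[i, ])[seq_len(k)]

  neg <- which(Y == "0")
  pos <- which(Y == "1")

  # Step 1: negatives whose neighbours mostly belong to the positive class
  is_miscl <- vapply(neg, function(i) mean(Y[knn_idx(i)] == "1") > 0.5, logical(1))
  miscl <- neg[is_miscl]

  # Step 2: majority-class examples among the neighbours of each positive
  nb <- unique(unlist(lapply(pos, knn_idx)))
  noisy <- nb[Y[nb] == "0"]

  removed <- union(miscl, noisy)
  keep <- setdiff(seq_len(nrow(X)), removed)
  list(X = X[keep, , drop = FALSE],
       Y = factor(Y[keep], levels = c("0", "1")),
       removed = removed)
}

# Toy data: five negatives near 0, three positives near 5,
# and one negative (row 9) sitting inside the positive cluster.
toy_X <- data.frame(x = c(0, 0.1, 0.2, 0.3, 0.4, 5, 5.1, 5.2, 5.15))
toy_Y <- factor(c(0, 0, 0, 0, 0, 1, 1, 1, 0))

res <- ncl_sketch(toy_X, toy_Y, k = 3)
print(res$removed)
print(table(res$Y))
```

In this toy setup only the negative example placed inside the positive cluster is flagged, by both steps; the well-separated negatives are left untouched.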

Usage

ubNCL(X, Y, k = 3, verbose = TRUE)

Arguments

X
the input variables of the unbalanced dataset.
Y
the response variable of the unbalanced dataset. It must be a binary factor where the majority class is coded as 0 and the minority as 1.
k
the number of nearest neighbours to use (default 3).
verbose
print extra information (TRUE/FALSE)

Value

The function returns a list:
X
the input variables after cleaning.
Y
the response variable after cleaning.

Details

In order to compute nearest neighbors, only numeric features are allowed.
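
Because of this restriction, it can help to check the column types before calling the function. The snippet below is a minimal sketch in base R; the helper name `numeric_only` and the toy data frame are hypothetical and are not part of the package.

```r
# Hypothetical helper: keep only the numeric columns of a data frame,
# reporting any columns that are dropped.
numeric_only <- function(X) {
  is_num <- vapply(X, is.numeric, logical(1))
  if (!all(is_num)) {
    message("Dropping non-numeric columns: ",
            paste(names(X)[!is_num], collapse = ", "))
  }
  X[, is_num, drop = FALSE]
}

# Toy data frame with one factor column that cannot be used for distances
df <- data.frame(a = c(1.5, 2.0, 3.2),
                 b = factor(c("u", "v", "u")),
                 c = 1:3)
num_df <- numeric_only(df)
str(num_df)
```

Dropping columns is only one option; depending on the data, encoding categorical features numerically beforehand may be more appropriate.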

References

J. Laurikkala. Improving identification of difficult small classes by balancing class distribution. Artificial Intelligence in Medicine, pages 63-66, 2001.

See Also

ubBalance

Examples

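library(unbalanced)
data(ubIonosphere)

# The class label is the last column; the remaining columns are the inputs
n <- ncol(ubIonosphere)
output <- ubIonosphere$Class
input <- ubIonosphere[, -n]

# Apply NCL, then recombine the cleaned inputs and labels
data <- ubNCL(X = input, Y = output)
newData <- cbind(data$X, data$Y)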