One-Sided Selection (OSS) is an undersampling method that applies Tomek links and then Condensed Nearest Neighbor.
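To illustrate the Tomek-link step, the following minimal base-R sketch finds pairs of opposite-class points that are each other's nearest neighbor and drops the majority-class member of each pair. The helper `tomek_majority_idx` and the toy data are hypothetical, and Euclidean distance is an assumption; this is not the package's internal code.

```r
# Hypothetical helper: indices of majority-class (0) points that take part
# in a Tomek link, i.e. a pair of opposite-class mutual nearest neighbors.
tomek_majority_idx <- function(X, Y) {
  D <- as.matrix(dist(X))        # pairwise Euclidean distances (assumption)
  diag(D) <- Inf                 # a point is not its own neighbor
  nn <- unname(apply(D, 1, which.min))  # nearest neighbor of every point
  y <- as.character(Y)
  idx <- seq_along(nn)
  is_link <- nn[nn] == idx & y[nn] != y # mutual neighbors, different class
  which(is_link & y == "0")
}

# Toy data: rows 3 and 4 are opposite-class mutual nearest neighbors.
X <- data.frame(x1 = c(0, 0.5, 2.0, 2.2, 5))
Y <- factor(c(0, 0, 0, 1, 1), levels = c(0, 1))

drop <- tomek_majority_idx(X, Y)
keep <- setdiff(seq_len(nrow(X)), drop)  # safe even when nothing is dropped
X_clean <- X[keep, , drop = FALSE]
Y_clean <- Y[keep]
```

Only majority-class points are removed here, so every minority observation is retained.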
Usage
ubOSS(X, Y, verbose = TRUE)
Arguments
X
the input variables of the unbalanced dataset.
Y
the response variable of the unbalanced dataset.
It must be a binary factor where the majority class is coded as 0 and the minority as 1.
verbose
whether to print extra information (TRUE/FALSE).
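As a rough sketch of the coding that Y requires, the snippet below recodes a label vector so that the majority class becomes 0 and the minority class becomes 1. The `labels` vector and its values are hypothetical.

```r
# Hypothetical raw labels; "fraud" is the rarer class here.
labels <- c("ok", "ok", "fraud", "ok", "ok", "fraud", "ok")

counts <- table(labels)
minority <- names(counts)[which.min(counts)]

# Majority class -> 0, minority class -> 1, stored as a binary factor.
Y <- factor(as.integer(labels == minority), levels = c(0, 1))
table(Y)
```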
Value
The function returns a list containing the undersampled dataset:
X
the input variables.
Y
the response variable.
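A possible call, using the signature shown under Usage, might look like the sketch below. The synthetic data are hypothetical, and the call is guarded because it assumes the unbalanced package is installed.

```r
# Synthetic numeric data with a 90/10 class split (illustrative only).
set.seed(1)
X <- data.frame(a = rnorm(100), b = rnorm(100))
Y <- factor(c(rep(0, 90), rep(1, 10)), levels = c(0, 1))

res <- NULL
if (requireNamespace("unbalanced", quietly = TRUE)) {
  res <- unbalanced::ubOSS(X, Y, verbose = FALSE)
  # res$X and res$Y hold the undersampled inputs and response.
  print(table(res$Y))
}
```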
Details
In order to compute nearest neighbors, only numeric features are allowed.
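One way to satisfy this requirement before calling the function is to check column types and keep only the numeric ones, as in this minimal sketch. The data frame `df` and its column names are hypothetical; encoding the non-numeric columns instead of dropping them is an equally valid choice.

```r
# Hypothetical mixed-type data frame.
df <- data.frame(
  amount  = c(10.5, 3.2, 99.0),
  n_items = c(1L, 4L, 2L),
  channel = factor(c("web", "shop", "web"))
)

is_num <- vapply(df, is.numeric, logical(1))
names(df)[!is_num]               # columns that need encoding or removal

X <- df[, is_num, drop = FALSE]  # keep numeric features only
```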
References
M. Kubat, S. Matwin, et al. Addressing the curse of imbalanced training sets: one-sided selection. In Machine Learning: International Workshop then Conference, pages 179-186. Morgan Kaufmann Publishers, Inc., 1997.