relief: RELIEF Feature Selection

Description

This function implements the RELIEF feature selection algorithm.

Usage

relief(data, nosample, threshold,repet=1)

Arguments

data

The dataset for which feature selection will be carried out

nosample

The number of instances drawn from the original dataset

threshold

The cutoff point to select the features

repet

The number of repetitions. It is recommended to use at most 10 repetitions

Value

relevant: A table that gives the ratio between the frequency with which the feature was selected as relevant and the total number of trials performed in one column, and the average weight of the feature in another.
a plot: A plot of the weights of the features

Details

The general idea of this method is to choose the features that can be most distinguished between classes. These are known as the relevant features. At each step of an iterative process, an instance x is chosen at random from the dataset and the weight for each feature is updated according to the distance of x to its Nearmiss and NearHit. The dataset must have complete cases therefore imputation must be performed in advance.

References

KIRA, K. and RENDEL, L. (1992). The Feature Selection Problem : Traditional Methods and a new algorithm. Proc. Tenth National Conference on Artificial Intelligence, MIT Press, 129-134.

KONONENKO, I., SIMEC, E., and ROBNIK-SIKONJA, M. (1997). Overcoming the myopia of induction learning algorithms with RELIEFF. Applied Intelligence Vol7, 1, 39-55.

Examples

Run this code

##---- Feature Selection ---
data(iris)
relief(iris,150,0.01,repet=1)

Run the code above in your browser using DataLab