dprep (version 3.0.2)

sfs: Sequential Forward Selection

Description

Applies the Sequential Forward Selection algorithm for Feature Selection.

Usage

sfs(data, method = c("lda", "knn", "rpart"), kvec = 5, repet = 10)

Arguments

data
Dataset to be used for feature selection
method
Classifier to be used. Currently only the lda, knn, and rpart classifiers are supported.
kvec
Number of neighbors to use for the knn classification
repet
Number of times to repeat the selection.

Value

bestsubset
Subset of features determined to be relevant.

Details

The best subset of features, T, is initialized as the empty set. At each step, the feature that gives the highest correct classification rate when combined with the features already in T is added to T. The selection is repeated repet times, and the final "best subset" is built from the frequency with which each attribute is selected across those repetitions. Because of the time complexity of the algorithm, its use is not recommended for datasets with a large number of attributes (say, more than 1000).
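
The greedy step described above can be sketched in a few lines of R. This is a minimal illustration, not the dprep implementation: the helper names loo_accuracy and forward_select are hypothetical, the classifier is fixed to lda from the MASS package, the classification rate is estimated by leave-one-out cross-validation, and the stopping rule (stop when no candidate improves the rate) is an assumption. Because leave-one-out is deterministic, the sketch shows a single forward pass only and omits the repetition and frequency-counting step.

```r
library(MASS)  # provides lda()

# Hypothetical helper: leave-one-out accuracy of LDA restricted to the
# given feature columns (CV = TRUE makes lda() return LOO predictions).
loo_accuracy <- function(data, features, class_col) {
  fit <- lda(x = data[, features, drop = FALSE],
             grouping = data[[class_col]], CV = TRUE)
  mean(fit$class == data[[class_col]])
}

# Hypothetical helper: one greedy forward pass. The class label is assumed
# to be in the last column unless class_col says otherwise.
forward_select <- function(data, class_col = ncol(data)) {
  candidates <- setdiff(seq_len(ncol(data)), class_col)
  selected <- integer(0)   # T starts as the empty set
  best_rate <- 0
  while (length(candidates) > 0) {
    # Score every remaining feature together with those already selected.
    rates <- sapply(candidates, function(f) {
      loo_accuracy(data, c(selected, f), class_col)
    })
    # Assumed stopping rule: quit when no candidate improves the rate.
    if (max(rates) <= best_rate) break
    best_rate <- max(rates)
    winner <- candidates[which.max(rates)]
    selected <- c(selected, winner)
    candidates <- setdiff(candidates, winner)
  }
  list(features = names(data)[selected], rate = best_rate)
}

result <- forward_select(iris)
print(result$features)
```

Each pass through the loop fits one model per remaining candidate, so a full run can require on the order of p^2 model fits for p attributes, which is why the cost grows quickly with the number of attributes.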

References

Acuna, E. (2003). A comparison of filters and wrappers for feature selection in supervised classification. Proceedings of the Interface 2003 Computing Science and Statistics, Vol. 34.

Examples

#---- Sequential forward selection using the lda classifier----
data(iris)
sfs(iris,method="lda",repet=3)