mfe (version 0.1.5)

relative: Relative Landmarking Meta-features

Description

Relative Landmarking measures are landmarking measures transformed by a ranking strategy: instead of reporting the raw performance of each landmarker, the landmarkers are ranked against each other, so the meta-features capture their relative performance.
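
A minimal sketch of the ranking idea in base R (illustrative values only; not the mfe implementation):

# Suppose three landmarkers produced these accuracies on the same fold:
acc <- c(bestNode = 0.93, randomNode = 0.71, worstNode = 0.55)

# Relative landmarking replaces the raw scores with their ranks, so the
# meta-features capture the relative order of the learners:
rank(-acc)  # rank 1 = best performing landmarker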

Usage

relative(...)

# S3 method for default
relative(
  x,
  y,
  features = "all",
  summary = c("mean", "sd"),
  size = 1,
  folds = 10,
  score = "accuracy",
  ...
)

# S3 method for formula
relative(
  formula,
  data,
  features = "all",
  summary = c("mean", "sd"),
  size = 1,
  folds = 10,
  score = "accuracy",
  ...
)

Arguments

...

Further arguments passed to the summarization functions.

x

A data.frame containing only the input attributes.

y

A factor response vector with one label for each row/component of x.

features

A list of feature names, or "all" to include all of them.

summary

A list of summarization functions, or empty for all values. See the post.processing method for more information. (Default: c("mean", "sd"))

size

The proportion of examples to be subsampled. Values different from 1 generate the subsampling-based relative landmarking meta-features; see the sketch after this argument list. (Default: 1.0)

folds

The number of equal-sized subsamples (folds) used in k-fold cross-validation. (Default: 10)

score

The evaluation measure used to score the classification performance. One of c("accuracy", "balanced.accuracy", "kappa"); balanced accuracy is sketched after this argument list. (Default: "accuracy")

formula

A formula to define the class column.

data

A data.frame containing the input attributes and the class column. The details section describes the valid values for this argument.
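
Two brief sketches of what the size and score arguments mean, in plain R (assumed behavior for illustration, not mfe internals):

## size: values below 1 fit the landmarkers on a random subsample of
## the examples, e.g. 70% of the rows:
idx <- sample(nrow(iris), 0.7 * nrow(iris))
sub <- iris[idx, ]

## score = "balanced.accuracy": taken here to be the mean of the
## per-class recalls, sketched on a small confusion table:
cm <- table(truth = c("a", "a", "b", "b", "b"),
            pred  = c("a", "b", "b", "b", "a"))
mean(diag(cm) / rowSums(cm))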

Value

A list named by the requested meta-features.
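
A quick way to inspect the returned structure (a sketch; the element names follow the requested features and summary functions, here the defaults "mean" and "sd"):

res <- relative(Species ~ ., iris, features = c("bestNode", "worstNode"))
names(res)  # one element per requested meta-feature
str(res)    # each element holds the summarized ranks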

Details

The following features are allowed for this method (a sketch of a single-node landmarker follows the list):

"bestNode"

Construct a single decision tree node model induced by the most informative attribute to establish the linear separability (multi-valued).

"eliteNN"

Elite nearest neighbor uses the most informative attribute in the dataset to induce a 1-nearest neighbor model. With this subset of informative attributes, the model is expected to be noise tolerant (multi-valued).

"linearDiscr"

Apply the Linear Discriminant classifier to construct a linear split (not axis-parallel) in the data to establish the linear separability (multi-valued).

"naiveBayes"

Evaluate the performance of the Naive Bayes classifier. It assumes that the attributes are independent and assigns each example to the most probable class using Bayes' theorem (multi-valued).

"oneNN"

Evaluate the performance of the 1-nearest neighbor classifier. It uses the Euclidean distance to the nearest neighbor to determine how noisy the data is (multi-valued).

"randomNode"

Construct a single decision tree node model induced by a random attribute. Combined with the "bestNode" measure, it can establish the linear separability (multi-valued).

"worstNode"

Construct a single decision tree node model induced by the least informative attribute. Combined with the "bestNode" measure, it can establish the linear separability (multi-valued).
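
A rough sketch of a single-node ("decision stump") landmarker in the spirit of "bestNode" and "worstNode", using rpart (an assumed construction; mfe's internal code may differ):

library(rpart)

# Grow a tree restricted to a single split (one decision node):
stump <- rpart(Species ~ ., data = iris,
               control = rpart.control(maxdepth = 1, minsplit = 2))

# Training accuracy of the stump; relative landmarking would rank this
# score against the scores of the other landmarkers:
mean(predict(stump, iris, type = "class") == iris$Species)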

References

Johannes Fürnkranz, Johann Petrak, Pavel Brazdil, and Carlos Soares. On the use of Fast Subsampling Estimates for Algorithm Recommendation. Technical Report, pages 1-9, 2002.

See Also

Other meta-features: clustering(), complexity(), concept(), general(), infotheo(), itemset(), landmarking(), model.based(), statistical()

Examples

## Extract all meta-features using formula
relative(Species ~ ., iris)

## Extract some meta-features
relative(iris[1:4], iris[5], c("bestNode", "randomNode", "worstNode"))

## Use another summarization function
relative(Species ~ ., iris, summary=c("min", "median", "max"))

## Use 2 folds and balanced accuracy
relative(Species ~ ., iris, folds=2, score="balanced.accuracy")

## Extract the subsampling-based relative landmarking
relative(Species ~ ., iris, size=0.7)