FS.DAAR.heuristic.RST: The DAAR heuristic for computation of decision reducts

Description

This function implements the Dynamically Adjusted Approximate Reducts (DAAR) heuristic for feature selection based on rough set theory (RST). The algorithm modifies the greedy approach to attribute selection by introducing an additional stop condition: it stops when a random probe (permutation) test fails to reject the hypothesis that the selected attribute introduces an illusory dependency in the data (in the context of previously selected attributes).
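The random probe idea can be illustrated in base R without the package. The sketch below is an assumption-laden illustration, not the RoughSets implementation: the helpers gini_quality and probe_relevance are hypothetical names, the quality measure is a simple negative weighted Gini impurity, and the comparison of the estimate against allowedRandomness is left out.

```r
set.seed(1)

# Hypothetical helper: negative weighted Gini impurity of the decision
# within the groups induced by the selected attributes (higher is better).
gini_quality <- function(groups, decision) {
  tab <- table(groups, decision)
  sizes <- rowSums(tab)
  impurity <- 1 - rowSums((tab / sizes)^2)
  -sum(sizes / sum(sizes) * impurity)
}

# Hypothetical helper: fraction of random probes (permuted copies of the
# candidate attribute) whose quality gain is strictly smaller than the
# gain of the real candidate, given the already selected attributes.
probe_relevance <- function(data, selected, candidate, decision, n_probes = 100) {
  group_of <- function(d, cols) do.call(paste, c(d[cols], sep = "|"))
  base <- if (length(selected) > 0) {
    gini_quality(group_of(data, selected), data[[decision]])
  } else {
    gini_quality(rep("all", nrow(data)), data[[decision]])
  }
  gain <- gini_quality(group_of(data, c(selected, candidate)), data[[decision]]) - base
  probe_gains <- replicate(n_probes, {
    probe <- data
    probe[[candidate]] <- sample(probe[[candidate]])  # break any real dependency
    gini_quality(group_of(probe, c(selected, candidate)), probe[[decision]]) - base
  })
  mean(probe_gains < gain)
}

# Toy data: the decision d is fully determined by a; noise is irrelevant.
n <- 200
toy <- data.frame(
  a = sample(c("x", "y"), n, replace = TRUE),
  noise = sample(c("p", "q", "r"), n, replace = TRUE)
)
toy$d <- ifelse(toy$a == "x", "yes", "no")

rel_a <- probe_relevance(toy, character(0), "a", "d")
rel_noise <- probe_relevance(toy, "a", "noise", "d")
```

In this toy setting the informative attribute beats all of its probes, while the noise attribute, added after a already explains the decision, beats none of them. A heuristic in the spirit of DAAR would stop before adding it.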

Usage

FS.DAAR.heuristic.RST(
  decision.table,
  attrDescriptions = attr(decision.table, "desc.attrs"),
  decisionIdx = attr(decision.table, "decision.attr"),
  qualityF = X.gini,
  nAttrs = NULL,
  allowedRandomness = 1/ncol(decision.table),
  nOfProbes = max(ncol(decision.table), 100),
  permsWithinINDclasses = FALSE,
  semigreedy = FALSE,
  inconsistentDecisionTable = NULL
)

Value

A class "FeatureSubset" that contains the following components:

reduct: a list representing a single reduct. In this case, it could be a superreduct or just a subset of features.
type.method: a string representing the type of method which is "greedy.heuristic".
type.task: a string showing the type of task which is "feature selection".
model: a string representing the type of model. In this case, it is "RST" which means rough set theory.
relevanceProbabilities: an integer vector with estimated relevances of selected attributes.
epsilon: a value between 0 and 1 representing the estimated approximation threshold.

Arguments

decision.table

an object of a "DecisionTable" class representing a decision table. See SF.asDecisionTable.

attrDescriptions

a list containing possible values of attributes (columns) in decision.table. It usually corresponds to attr(decision.table, "desc.attrs").

decisionIdx

an integer value representing an index of the decision attribute.

qualityF

a function used for computation of the quality of attribute subsets. Currently, the following functions are included:

X.entropy: See X.entropy.
X.gini: See X.gini.
X.nOfConflicts: See X.nOfConflicts.

nAttrs

an integer between 1 and the number of conditional attributes. It indicates the attribute sample size for the Monte Carlo selection of candidate attributes. If set to NULL (the default), all attributes are used and the algorithm changes to a standard greedy method for computation of decision reducts.
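As a rough illustration of what a Monte Carlo selection of candidates could look like, the base R sketch below draws nAttrs attributes at random in one iteration and falls back to all remaining attributes when nAttrs is NULL. The helper draw_candidates and the attribute names are hypothetical. The package may sample differently.

```r
set.seed(42)

# Hypothetical helper: choose the candidates examined in one greedy iteration.
draw_candidates <- function(remaining, nAttrs = NULL) {
  if (is.null(nAttrs)) {
    return(remaining)  # plain greedy search: consider every remaining attribute
  }
  sample(remaining, min(nAttrs, length(remaining)))
}

remaining <- paste0("attr", 1:10)
all_candidates <- draw_candidates(remaining)
sampled_candidates <- draw_candidates(remaining, nAttrs = 3)
```

Sampling a small subset trades some selection quality for speed on wide tables, since only nAttrs attributes need to be scored in each iteration.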

allowedRandomness

a threshold for attribute relevance. Computations will be terminated when the relevance of a selected attribute falls below this threshold.

nOfProbes

a number of random probes used for estimating the attribute relevance (see the references).
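The defaults of allowedRandomness and nOfProbes in the Usage section both depend on the number of columns of the decision table. The sketch below only re-evaluates those two default expressions on placeholder tables of different widths. The helper default_args is a hypothetical name used for illustration.

```r
# Placeholder tables: only the number of columns matters here.
narrow <- as.data.frame(matrix(0L, nrow = 4, ncol = 5))
wide <- as.data.frame(matrix(0L, nrow = 4, ncol = 250))

# Hypothetical helper mirroring the default expressions from the Usage section.
default_args <- function(decision.table) {
  list(
    allowedRandomness = 1 / ncol(decision.table),
    nOfProbes = max(ncol(decision.table), 100)
  )
}

narrow_defaults <- default_args(narrow)
wide_defaults <- default_args(wide)
```

For the 5-column table this gives a threshold of 0.2 with 100 probes. For the 250-column table it gives a threshold of 0.004 with 250 probes, so wider tables get a stricter threshold and more probes by default.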

permsWithinINDclasses

a logical value indicating whether the permutation test should be conducted within indiscernibility classes.

semigreedy

a logical value indicating whether the semigreedy heuristic should be used for selecting the best attribute in each iteration of the algorithm.

inconsistentDecisionTable

a logical value indicating whether the decision table is suspected to be inconsistent, or NULL (the default), which indicates that a test should be made to determine the data consistency.

Author

Andrzej Janusz

Details

As in the case of FS.greedy.heuristic.reduct.RST, the implementation can use different attribute subset quality functions (parameter qualityF) and Monte Carlo generation of candidate attributes (parameter nAttrs).

References

A. Janusz and D. Ślęzak, "Random Probes in Computation and Assessment of Approximate Reducts", Proceedings of RSEISP 2014, Springer, LNCS vol. 8537, pp. 53-64 (2014).

A. Janusz and D. Ślęzak, "Computation of Approximate Reducts with Dynamically Adjusted Approximation Threshold", Proceedings of ISMIS 2015, Springer, LNCS vol. 9384, pp. 19-28 (2015).

A. Janusz and S. Stawicki, "Applications of Approximate Reducts to the Feature Selection Problem", Proceedings of International Conference on Rough Sets and Knowledge Technology (RSKT), vol. 6954, pp. 45-50 (2011).

Examples

###################################################
## Example 1: Evaluate reduct and generate
##            new decision table
###################################################
data(RoughSetData)
decision.table <- RoughSetData$hiring.dt

## evaluate a single reduct
res.1 <- FS.DAAR.heuristic.RST(decision.table)

## generate a new decision table corresponding to the reduct
new.decTable <- SF.applyDecTable(decision.table, res.1)
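
A further call with non-default arguments might look like the sketch below. It assumes the RoughSets package is installed and uses only arguments listed in the Usage section and components listed in the Value section. The chosen values (X.entropy, 200 probes) are arbitrary illustrations, not recommendations.

```r
library(RoughSets)
data(RoughSetData)
decision.table <- RoughSetData$hiring.dt

## use entropy as the quality function and more random probes
## (argument values chosen only for illustration)
res.2 <- FS.DAAR.heuristic.RST(decision.table,
                               qualityF = X.entropy,
                               nOfProbes = 200)

## inspect the components described in the Value section
res.2$reduct
res.2$relevanceProbabilities
res.2$epsilon
```

Because the probes are random permutations, results may vary between runs unless a seed is set beforehand with set.seed.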
