This function implements a greedy heuristic algorithm for computing decision reducts (or approximate decision reducts) based on rough set theory (RST).
FS.greedy.heuristic.reduct.RST(
decision.table,
attrDescriptions = attr(decision.table, "desc.attrs"),
decisionIdx = attr(decision.table, "decision.attr"),
qualityF = X.gini,
nAttrs = NULL,
epsilon = 0,
inconsistentDecisionTable = NULL
)
An object of class "FeatureSubset" that contains the following components:
reduct
: a list representing a single reduct. In this case, it could be a superreduct or just a subset of features.
type.method
: a string representing the type of method, which is "greedy.heuristic".
type.task
: a string showing the type of task, which is "feature selection".
model
: a string representing the type of model. In this case, it is "RST", which means rough set theory.
epsilon
: the approximation threshold.
decision.table
: an object of class "DecisionTable" representing a decision table. See SF.asDecisionTable.
attrDescriptions
: a list containing possible values of attributes (columns) in decision.table. It usually corresponds to attr(decision.table, "desc.attrs").
decisionIdx
: an integer value representing an index of the decision attribute.
qualityF
: a function used for computation of the quality of attribute subsets. Currently, the following functions are included:
X.entropy
: See X.entropy.
X.gini
: See X.gini.
X.nOfConflicts
: See X.nOfConflicts.
nAttrs
: an integer between 1 and the number of conditional attributes. It indicates the attribute sample size for the Monte Carlo selection of candidate attributes. If set to NULL (the default), all attributes are used and the algorithm reduces to a standard greedy method for computing decision reducts.
epsilon
: a numeric value in [0, 1) representing an approximation threshold. It indicates whether to compute approximate reducts or not. If it equals 0 (the default), a standard decision reduct is computed.
inconsistentDecisionTable
: a logical value indicating whether the decision table is suspected to be inconsistent, or NULL (the default), which indicates that a test should be made to determine the data consistency.
Andrzej Janusz
In this implementation, we provide some attribute subset quality measures that can be
passed to the algorithm through the parameter qualityF. These measures
guide the computations in the search for a decision or approximate reduct. They are used to
assess the amount of information gained after the addition of an attribute. For example,
X.entropy corresponds to the information gain measure.
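To make the idea of information gained by adding an attribute concrete, the following is a minimal base R sketch that computes information gain on a toy table. The helper names (entropy_of_counts, conditional_entropy) and the toy data are hypothetical and chosen for illustration only; this is not the code behind X.entropy.

```r
# Hypothetical helpers for illustration; not the package's X.entropy code.

# Shannon entropy (in bits) of a vector of class counts.
entropy_of_counts <- function(counts) {
  p <- as.numeric(counts[counts > 0]) / sum(counts)
  -sum(p * log2(p))
}

# Entropy of the decision conditioned on the groups of objects that are
# indiscernible with respect to the attributes in `attrs`.
conditional_entropy <- function(data, attrs, decision) {
  if (length(attrs) == 0) {
    return(entropy_of_counts(table(data[[decision]])))
  }
  groups <- split(data[[decision]], interaction(data[attrs], drop = TRUE))
  sum(vapply(groups, function(d) {
    length(d) / nrow(data) * entropy_of_counts(table(d))
  }, numeric(1)))
}

# Toy decision table: `play` is the decision attribute.
toy <- data.frame(
  outlook = c("sunny", "sunny", "rain", "rain"),
  windy   = c("yes", "no", "yes", "no"),
  play    = c("no", "no", "yes", "yes")
)

# Information gain = entropy before adding the attribute minus entropy after.
base_entropy <- conditional_entropy(toy, character(0), "play")
gain_outlook <- base_entropy - conditional_entropy(toy, "outlook", "play")
gain_windy   <- base_entropy - conditional_entropy(toy, "windy", "play")
```

In this toy table outlook determines the decision completely while windy carries no information about it, so a greedy search guided by this measure would pick outlook first.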
Additionally, this function can use the value of the epsilon parameter in order to compute
\(\epsilon\)-approximate reducts. An \(\epsilon\)-approximate reduct can be defined as an
irreducible subset of attributes \(B\), such that:
\(Quality_{\mathcal{A}}(B) \ge (1 - \epsilon)Quality_{\mathcal{A}}(A)\),
where \(Quality_{\mathcal{A}}(B)\) is the value of a quality measure (see possible values
of the parameter qualityF) for an attribute subset \(B\) in decision table \(\mathcal{A}\)
and \(\epsilon\) is a numeric value between 0 and 1 expressing the approximation threshold.
Several publications provide comprehensive explanations of this topic, for example
(Janusz and Stawicki, 2011; Ślęzak, 2002; Wróblewski, 2001), which are used as the references for this function.
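As a purely illustrative reading of the inequality above (the numbers are hypothetical and not taken from any dataset), suppose the full attribute set reaches \(Quality_{\mathcal{A}}(A) = 0.8\) and \(\epsilon = 0.1\). Then the condition becomes
\[ Quality_{\mathcal{A}}(B) \ge (1 - 0.1) \cdot 0.8 = 0.72, \]
so any irreducible subset \(B\) with a quality of at least 0.72 qualifies as a 0.1-approximate reduct. For \(\epsilon = 0\) the right-hand side is 0.8, so \(B\) must reach the full quality of \(A\).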
Finally, this implementation makes it possible to limit the computational complexity of the greedy
search for decision reducts by setting the value of the parameter nAttrs. If this
parameter is set to a positive integer, the Monte Carlo method of selecting candidate
attributes will be used in each iteration of the algorithm.
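The interplay of the greedy search, the Monte Carlo candidate sampling and the epsilon stopping rule can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's implementation: the names greedy_subset, subset_quality and n_attrs are hypothetical, information gain stands in for the quality measure, and the final elimination step that would make the result irreducible is omitted.

```r
# Hypothetical sketch; not the RoughSets implementation.

# Shannon entropy (in bits) of a vector of class counts.
entropy_of_counts <- function(counts) {
  p <- as.numeric(counts[counts > 0]) / sum(counts)
  -sum(p * log2(p))
}

# Quality of an attribute subset: information gain about the decision.
subset_quality <- function(data, attrs, decision) {
  if (length(attrs) == 0) return(0)
  h_decision <- entropy_of_counts(table(data[[decision]]))
  groups <- split(data[[decision]], interaction(data[attrs], drop = TRUE))
  h_cond <- sum(vapply(groups, function(d) {
    length(d) / nrow(data) * entropy_of_counts(table(d))
  }, numeric(1)))
  h_decision - h_cond
}

greedy_subset <- function(data, decision, n_attrs = NULL, epsilon = 0) {
  conditional <- setdiff(names(data), decision)
  # Stop once the subset reaches (1 - epsilon) of the full-set quality.
  target <- (1 - epsilon) * subset_quality(data, conditional, decision)
  selected <- character(0)
  while (subset_quality(data, selected, decision) < target - 1e-12 &&
         length(selected) < length(conditional)) {
    remaining <- setdiff(conditional, selected)
    # Monte Carlo step: score only a random sample of remaining attributes.
    if (!is.null(n_attrs) && n_attrs < length(remaining)) {
      candidates <- sample(remaining, n_attrs)
    } else {
      candidates <- remaining
    }
    scores <- vapply(candidates, function(a) {
      subset_quality(data, c(selected, a), decision)
    }, numeric(1))
    selected <- c(selected, candidates[which.max(scores)])
  }
  selected
}

toy <- data.frame(
  outlook = c("sunny", "sunny", "rain", "rain"),
  windy   = c("yes", "no", "yes", "no"),
  play    = c("no", "no", "yes", "yes")
)

full_search <- greedy_subset(toy, "play")
set.seed(1)
sampled_search <- greedy_subset(toy, "play", n_attrs = 1)
```

With n_attrs left at NULL every remaining attribute is scored in each iteration. With a small n_attrs each iteration is cheaper, but the search may add an uninformative attribute before finding a useful one, which is why an elimination step matters in a full implementation.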
A. Janusz and D. Ślęzak, "Rough Set Methods for Attribute Clustering and Selection", Applied Artificial Intelligence, vol. 28, no. 3, pp. 220–242 (2014).
A. Janusz and S. Stawicki, "Applications of Approximate Reducts to the Feature Selection Problem", Proceedings of International Conference on Rough Sets and Knowledge Technology (RSKT), vol. 6954, pp. 45–50 (2011).
D. Ślęzak, "Approximate Entropy Reducts", Fundamenta Informaticae, vol. 53, no. 3–4, pp. 365–390 (2002).
J. Wróblewski, "Ensembles of Classifiers Based on Approximate Reducts", Fundamenta Informaticae, vol. 47, no. 3–4, pp. 351–360 (2001).
FS.DAAR.heuristic.RST and FS.reduct.computation.
###################################################
## Example 1: Evaluate reduct and generate
## new decision table
###################################################
library(RoughSets)
data(RoughSetData)
decision.table <- RoughSetData$hiring.dt
## compute a single reduct
res.1 <- FS.greedy.heuristic.reduct.RST(decision.table, qualityF = X.entropy,
                                        epsilon = 0.0)
## generate a new decision table corresponding to the reduct
new.decTable <- SF.applyDecTable(decision.table, res.1)