Various parameters that control aspects of the C5.0 fit.
C5.0Control(
  subset = TRUE,
  bands = 0,
  winnow = FALSE,
  noGlobalPruning = FALSE,
  CF = 0.25,
  minCases = 2,
  fuzzyThreshold = FALSE,
  sample = 0,
  seed = sample.int(4096, size = 1) - 1L,
  earlyStopping = TRUE,
  label = "outcome"
)
subset: A logical: should the model evaluate groups of discrete predictors for splits? Note: the C5.0 command line version defaults this parameter to FALSE, meaning no attempted groupings will be evaluated during the tree growing stage.
bands: An integer between 2 and 1000. If a value in this range is supplied, the model orders the rules by their effect on the error rate and groups the rules into the specified number of bands. This modifies the output so that the effect on the error rate can be seen for the groups of rules within a band. If this option is selected and rules = FALSE, a warning is issued and rules is changed to TRUE. The default of 0 leaves banding off.
winnow: A logical: should predictor winnowing (i.e., feature selection) be used?
noGlobalPruning: A logical to toggle whether the final, global pruning step that simplifies the tree is turned off.
CF: A number in (0, 1) for the confidence factor.
minCases: An integer for the smallest number of samples that must be put in at least two of the splits.
fuzzyThreshold: A logical toggle to evaluate possible advanced splits of the data. See Quinlan (1993) for details and examples.
sample: A value between 0 and .999 that specifies the random proportion of the data that should be used to train the model. By default (sample = 0), all the samples are used for model training. Samples not used for training are used to evaluate the accuracy of the model in the printed output.
seed: An integer for the random number seed within the C code.
earlyStopping: A logical to toggle whether the internal method for stopping boosting should be used.
label: A character label for the outcome used in the output.
Returns a list of options.
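As a quick check of the return value, the sketch below builds a control object and inspects it. It assumes the C50 package is installed and that the elements of the returned list are named after the arguments, which this page does not state explicitly; the values 0.1, 5, and 42 are arbitrary.

```r
library(C50)

# Build a control object with a few non-default settings.
ctrl <- C5.0Control(CF = 0.1, minCases = 5, seed = 42)

# The result is a plain list of options.
is.list(ctrl)

# Assumption: list elements carry the same names as the arguments.
ctrl$CF
ctrl$minCases
```

The resulting object is meant to be passed to the control argument of C5.0() rather than used on its own.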
Original GPL C code by Ross Quinlan, R code and modifications to C by Max Kuhn, Steve Weston and Nathan Coulter
Quinlan R (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, http://www.rulequest.com/see5-unix.html
C5.0(), predict.C5.0(), summary.C5.0(), C5imp()
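library(C50)
library(modeldata)
data(mlc_churn)

# Fit a single tree on the first 3333 rows, with predictor winnowing turned on.
# Column 20 is the outcome (churn), so it is dropped from the predictors.
treeModel <- C5.0(x = mlc_churn[1:3333, -20],
                  y = mlc_churn$churn[1:3333],
                  control = C5.0Control(winnow = TRUE))
summary(treeModel)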
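A further sketch on the same churn data, combining rule banding with a held-out sample. It assumes the C50 and modeldata packages are installed; rules = TRUE is passed to C5.0() because bands applies to rule-based models, and the values 5, 0.8, and 1 are arbitrary choices for illustration.

```r
library(C50)
library(modeldata)
data(mlc_churn)

# bands = 5 asks for the rules to be grouped into 5 bands.
# sample = 0.8 trains on a random 80% of the rows; the remaining rows are
# used for the accuracy reported in the printed output.
# seed fixes the random number seed used within the C code.
ctrl <- C5.0Control(bands = 5, sample = 0.8, seed = 1)

ruleModel <- C5.0(x = mlc_churn[1:3333, -20],
                  y = mlc_churn$churn[1:3333],
                  rules = TRUE,
                  control = ctrl)
summary(ruleModel)
```

Setting rules = TRUE explicitly avoids the warning described under bands, where rules is switched on automatically.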