MachineShop (version 3.8.0)

C50Model: C5.0 Decision Trees and Rule-Based Model

Description

Fit classification tree models or rule-based models using Quinlan's C5.0 algorithm.

Usage

C50Model(
  trials = 1,
  rules = FALSE,
  subset = TRUE,
  bands = 0,
  winnow = FALSE,
  noGlobalPruning = FALSE,
  CF = 0.25,
  minCases = 2,
  fuzzyThreshold = FALSE,
  sample = 0,
  earlyStopping = TRUE
)

Value

MLModel class object.

Arguments

trials

integer number of boosting iterations.

rules

logical indicating whether to decompose the tree into a rule-based model.

subset

logical indicating whether the model should evaluate groups of discrete predictors for splits.

bands

integer between 2 and 1000 specifying the number of bands into which to group rules ordered by their effect on the error rate.

winnow

logical indicating use of predictor winnowing (i.e. feature selection).

noGlobalPruning

logical indicating whether to skip the final, global pruning step that simplifies the tree.

CF

number in (0, 1) for the confidence factor.

minCases

integer for the smallest number of samples that must be put in at least two of the splits.

fuzzyThreshold

logical indicating whether to evaluate possible advanced splits of the data.

sample

number between 0 and 0.999 specifying the random proportion of data to use in training the model; the default of 0 uses all samples.

earlyStopping

logical indicating whether the internal method for stopping boosting should be used.
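
As a minimal sketch of how these arguments combine, the following builds a boosted, rule-based model and fits it to the iris data. The argument values are arbitrary choices for illustration, not recommended settings, and the suggested package C50 is assumed to be installed.

```r
library(MachineShop)  # the suggested package C50 must also be installed

# Boosted (10 trials) rule-based model with a stricter confidence factor
# and a larger minimum node size; values are illustrative only
model <- C50Model(trials = 10, rules = TRUE, CF = 0.1, minCases = 5)

set.seed(123)
model_fit <- fit(Species ~ ., data = iris, model = model)

# Predicted classes for the first few training rows
pred <- predict(model_fit, newdata = head(iris))
pred
```

Because the response must be a factor, predictions are returned as factor levels by default.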

Details

Response types:

factor

Automatic tuning of grid parameters:

trials, rules, winnow
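
A sketch of tuning over some of these parameters with TunedModel might look like the following. The candidate values and the 5-fold cross-validation control are assumptions made for illustration; if grid is omitted, the package's default grid settings are used instead.

```r
library(MachineShop)  # the suggested package C50 must also be installed

# Explicit grid over two of the tunable parameters; candidate values
# are arbitrary and chosen only to keep the example small
tuned_model <- TunedModel(
  C50Model,
  grid = expand_params(
    trials = c(1, 5, 10),
    rules = c(FALSE, TRUE)
  ),
  control = CVControl(folds = 5)
)

set.seed(123)
tuned_fit <- fit(Species ~ ., data = iris, model = tuned_model)

# The fitted object is used like any other model fit
tuned_pred <- predict(tuned_fit, newdata = iris)
```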

Arguments subset through earlyStopping are passed to C5.0Control. Further model details can be found in the source link below.
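
To make the split between the two sets of arguments concrete, here is a rough equivalent written directly against the C50 package. It is a sketch of the argument layout only, with illustrative values; the call that MachineShop constructs internally may differ in detail.

```r
library(C50)

# trials and rules are arguments of C5.0() itself, whereas the remaining
# options are collected in a C5.0Control() object
ctrl <- C5.0Control(winnow = TRUE, CF = 0.1, minCases = 5)

direct_fit <- C5.0(
  Species ~ ., data = iris,
  trials = 10, rules = FALSE,
  control = ctrl
)
summary(direct_fit)
```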

In calls to varimp for C50Model, argument type may be specified as "usage" (default) for the percentage of training set samples that fall into all terminal nodes after the split of each predictor or as "splits" for the percentage of splits associated with each predictor. Variable importance is automatically scaled to range from 0 to 100. To obtain unscaled importance values, set scale = FALSE. See example below.

See Also

C5.0, fit, resample

Examples

# \donttest{
## Requires prior installation of suggested package C50 to run
library(MachineShop)

model_fit <- fit(Species ~ ., data = iris, model = C50Model)
varimp(model_fit, method = "model", type = "splits", scale = FALSE)
# }
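
Since resample is listed under See Also, a short sketch of estimating predictive performance for the same model by cross-validation may be useful. The choice of 5 folds and the seed are assumptions for illustration.

```r
library(MachineShop)  # the suggested package C50 must also be installed

set.seed(123)

# 5-fold cross-validation of the default C5.0 model on the iris data
res <- resample(
  Species ~ ., data = iris,
  model = C50Model,
  control = CVControl(folds = 5)
)

# Summary statistics of the resampled performance metrics
summary(res)
```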