A wrapper method of backward feature selection in which a given model is fit to nested subsets of the most important predictor variables in order to select the subset whose resampled predictive performance is optimal.
rfe(...)
# S3 method for default
rfe(
object,
model = NULL,
control = MachineShop::settings("control"),
props = 4,
sizes = integer(),
random = FALSE,
recompute = TRUE,
optimize = c("global", "local"),
samples = c(rfe = 1, varimp = 1),
metrics = NULL,
stat = c(resample = MachineShop::settings("stat.Resample"), permute =
MachineShop::settings("stat.TrainingParams")),
progress = FALSE,
...
)
# S3 method for formula
rfe(formula, data, model, ...)
# S3 method for matrix
rfe(x, y, model, ...)
# S3 method for ModelFrame
rfe(input, model = NULL, ...)
# S3 method for recipe
rfe(input, model = NULL, ...)
# S3 method for ModelSpecification
rfe(object, ...)
# S3 method for MLModel
rfe(model, ...)
# S3 method for MLModelFunction
rfe(model, ...)
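The method signatures above can be exercised as in the following minimal sketch. It assumes the MachineShop package is installed and uses its ICHomes data set (as in the example further down); the choice of LMModel is only an illustrative assumption made to avoid extra package dependencies.

```r
library(MachineShop)

# Formula method: predictors and response come from a formula and data frame
res1 <- rfe(sale_amount ~ ., data = ICHomes, model = LMModel)

# ModelFrame method: the same variables bundled into a single input object
mf <- ModelFrame(sale_amount ~ ., data = ICHomes)
res2 <- rfe(mf, model = LMModel)

# Model-first form: the model is given first, followed by the input
res3 <- rfe(LMModel, mf)
```

In each call the first argument is given positionally, and the remaining arguments are passed on to the default method.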
arguments passed to the default method from the others. The
first argument of each rfe
method is positional and, as such, must
be given first in calls to them.
model input or specification.
model function, function name, or object; or another object that can be coerced to a model. A model can be given first followed by any of the variable specifications, and the argument can be omitted altogether in the case of modeled inputs.
control function, function name, or object defining the resampling method to be employed.
numeric vector of the proportions of most important predictor variables to retain in fitted models, or an integer number of equally spaced proportions to generate automatically; ignored if sizes are given.
integer vector of the set sizes of most important predictor variables to retain.
logical indicating whether to eliminate variables at random with probabilities proportional to their importance.
logical indicating whether to recompute variable importance after eliminating each set of variables.
character string specifying a search through all props to identify the globally optimal model ("global") or a search that stops after identifying the first locally optimal model ("local").
numeric vector or list giving the number of permutation samples for each of the rfe and varimp algorithms. One or both of the values may be specified as named arguments or in the order in which their defaults appear. Larger numbers of samples decrease variability in estimated model performances and variable importances at the expense of increased computation time. Samples are more expensive computationally for rfe than for varimp.
metric function, function name, or vector of these with which to calculate performance. If not specified, default metrics defined in the performance functions are used.
functions or character strings naming functions to compute summary statistics on resampled metric values and permuted samples. One or both of the values may be specified as named arguments or in the order in which their defaults appear.
logical indicating whether to display iterative progress during elimination.
formula defining the model predictor and response variables and a data frame containing them.
matrix and object containing predictor and response variables.
input object defining and containing the model predictor and response variables.
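As a hedged illustration of how several of these arguments combine, the sketch below sets an explicit resampling control, subset sizes, search strategy, and permutation sample counts. The particular values, and the use of LMModel and CVControl(folds = 5), are assumptions chosen for illustration rather than recommended settings.

```r
library(MachineShop)

# Sketch only: the model and argument values are illustrative choices
res <- rfe(
  sale_amount ~ ., data = ICHomes, model = LMModel,
  control = CVControl(folds = 5),    # 5-fold cross-validation for resampling
  sizes = c(4, 8, 12),               # explicit subset sizes; props is then ignored
  optimize = "local",                # stop at the first locally optimal subset
  samples = c(rfe = 1, varimp = 5),  # more permutation samples for importance
  progress = TRUE                    # show progress during elimination
)
summary(res)
```

Raising the varimp count rather than the rfe count follows the note above that samples are computationally cheaper for varimp than for rfe.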
TrainingStep class object containing a summary of the numbers of predictor variables retained (size), their names (terms), logical indicators for the optimal model selected (selected), and associated performance metrics (metrics).
# NOT RUN {
## Requires prior installation of suggested package gbm to run
(res <- rfe(sale_amount ~ ., data = ICHomes, model = GBMModel))
summary(res)
summary(performance(res))
plot(res, type = "line")
# }
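To make the general idea of backward elimination over nested subsets concrete, here is a self-contained base R sketch. It is not MachineShop's implementation: the helpers cv_rmse and rank_vars are hypothetical, absolute t-statistics from lm stand in for variable importance, and the mtcars data set and subset sizes are arbitrary choices for illustration.

```r
set.seed(123)

# Hypothetical helper: k-fold cross-validated RMSE of a linear model
cv_rmse <- function(vars, data, response, folds = 5) {
  fold_id <- sample(rep(seq_len(folds), length.out = nrow(data)))
  errs <- vapply(seq_len(folds), function(k) {
    train <- data[fold_id != k, , drop = FALSE]
    test <- data[fold_id == k, , drop = FALSE]
    fit <- lm(reformulate(vars, response = response), data = train)
    sqrt(mean((test[[response]] - predict(fit, newdata = test))^2))
  }, numeric(1))
  mean(errs)
}

# Hypothetical helper: order predictors by absolute t-statistic,
# used here as a simple stand-in for a variable importance measure
rank_vars <- function(vars, data, response) {
  fit <- lm(reformulate(vars, response = response), data = data)
  tvals <- abs(coef(summary(fit))[vars, "t value"])
  names(sort(tvals, decreasing = TRUE))
}

response <- "mpg"
vars <- setdiff(names(mtcars), response)
sizes <- c(10, 7, 5, 2)  # nested subset sizes, largest first

results <- data.frame(size = integer(), rmse = numeric())
kept <- list()
for (size in sizes) {
  # Re-rank the current variables, then keep only the top `size` of them
  vars <- head(rank_vars(vars, mtcars, response), size)
  results <- rbind(
    results,
    data.frame(size = size, rmse = cv_rmse(vars, mtcars, response))
  )
  kept[[as.character(size)]] <- vars
}

# Flag the subset with the best (lowest) cross-validated RMSE
best <- which.min(results$rmse)
results$selected <- seq_len(nrow(results)) == best
results
kept[[best]]
```

Re-ranking inside the loop loosely mirrors recompute = TRUE, and scanning every size before choosing the minimum loosely mirrors optimize = "global".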