varimp: Variable Importance

Description

Calculate measures of the relative importance of predictors in a model.

Usage

varimp(object, method = c("permute", "model"), scale = TRUE, ...)

Value

VariableImportance class object.

Arguments

object

model fit result.

method

character string specifying the calculation of variable importance as permutation-based ("permute") or model-specific ("model"). If model-specific importance is specified but not defined, the permutation-based method will be used instead with its default values (below). Permutation-based variable importance is defined as the relative change in model predictive performance between datasets with and without permuted values for the associated variable (Fisher et al. 2019).
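As a rough illustration of the permutation-based definition, the following base R sketch computes the change in performance by hand for a linear model fit to the built-in mtcars data. It is not the MachineShop implementation: the helper names (rmse, perm_importance), the choice of RMSE as the metric, and evaluation on the training data are all assumptions made for the example.

```r
# Minimal sketch of permutation-based importance in base R.
# Assumptions (not from MachineShop): lm() as the model, RMSE as the metric,
# and evaluation on the training data.
set.seed(123)

fit_lm <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Root mean squared error of predictions (lower is better)
rmse <- function(observed, predicted) sqrt(mean((observed - predicted)^2))

# Performance on the unpermuted data
base_perf <- rmse(mtcars$mpg, predict(fit_lm, newdata = mtcars))

# Importance of one variable: change in RMSE after shuffling its values
perm_importance <- function(var) {
  permuted <- mtcars
  permuted[[var]] <- sample(permuted[[var]])
  rmse(mtcars$mpg, predict(fit_lm, newdata = permuted)) - base_perf
}

vars <- c("wt", "hp", "qsec")
vi_raw <- vapply(vars, perm_importance, numeric(1))
print(vi_raw)
```

Because RMSE decreases as performance improves, a larger positive difference here suggests a more important variable; a metric where higher is better would need the sign reversed.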

scale

logical indicating whether importance values should be scaled to a maximum of 100.
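One plausible reading of scaling to a maximum of 100 is to divide each value by the largest value and multiply by 100. The sketch below uses made-up raw values and is only an assumption about the arithmetic; the package's internal handling (for example, of negative values) may differ.

```r
# Hypothetical raw importance values, for illustration only
vi_raw <- c(wt = 1.8, hp = 0.9, qsec = 0.3)

# Rescale so that the largest importance equals 100
vi_scaled <- vi_raw / max(vi_raw) * 100
print(vi_scaled)
```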

...

arguments passed to model-specific or permutation-based variable importance functions. These include the following arguments and default values for method = "permute".

select = NULL

expression indicating predictor variables for which to compute variable importance (see subset for syntax) [default: all].

samples = 1

number of times to permute the values of each variable. Larger numbers of samples decrease variability in the estimates at the expense of increased computation time.

prop = numeric()

proportion of observations to sample without replacement at each round of variable permutations [default: all]. Subsampling of observations can decrease computation time.

size = integer()

number of observations to sample at each round of permutations [default: all].

times = numeric()

numeric vector of follow-up times at which to predict survival probabilities or NULL for predicted survival means.

metric = NULL

metric function or function name with which to calculate performance. If not specified, the first applicable default metric from the performance functions is used.

compare = c("-", "/")

character specifying the relative change to compute in comparing model predictive performances between datasets with and without permuted values. The choices are difference ("-") and ratio ("/").

stats = MachineShop::settings("stats.VariableImportance")

function, function name, or vector of these with which to compute summary statistics on the set of variable importance values from the permuted datasets.

na.rm = TRUE

logical indicating whether to exclude missing variable importance values from the calculation of summary statistics.

progress = TRUE

logical indicating whether to display iterative progress during computation.
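To make the roles of samples, compare, stats, and na.rm more concrete, the base R sketch below repeats the permutation several times, compares performance by difference or ratio, and summarizes the resulting values. The function perm_vi and its internals are hypothetical and only mirror the documented argument names; they are not MachineShop code, and RMSE on the training data is an assumed metric.

```r
# Sketch of repeated permutations with a configurable comparison and
# summary statistics. All names (perm_vi, rmse, fit_lm) are hypothetical.
set.seed(42)

fit_lm <- lm(mpg ~ wt + hp, data = mtcars)
rmse <- function(observed, predicted) sqrt(mean((observed - predicted)^2))

perm_vi <- function(var, samples = 1, compare = c("-", "/"),
                    stats = list(Mean = mean), na.rm = TRUE) {
  compare <- match.arg(compare)
  base_perf <- rmse(mtcars$mpg, predict(fit_lm, newdata = mtcars))
  # One importance value per permutation of the variable
  values <- replicate(samples, {
    permuted <- mtcars
    permuted[[var]] <- sample(permuted[[var]])
    perm_perf <- rmse(mtcars$mpg, predict(fit_lm, newdata = permuted))
    if (compare == "-") perm_perf - base_perf else perm_perf / base_perf
  })
  # Summarize across permutations, optionally dropping missing values
  vapply(stats, function(f) f(values, na.rm = na.rm), numeric(1))
}

vi_diff  <- perm_vi("wt", samples = 25, compare = "-")
vi_ratio <- perm_vi("wt", samples = 25, compare = "/",
                    stats = list(Mean = mean, SD = sd))
print(vi_diff)
print(vi_ratio)
```

Increasing samples in this sketch reduces the variability of the summarized values at the cost of more predictions, which matches the trade-off described for the samples argument.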

References

Fisher, A., Rudin, C., & Dominici, F. (2019). All models are wrong, but many are useful: Learning a variable's importance by studying an entire class of prediction models simultaneously. Journal of Machine Learning Research, 20, 1-81.

Examples

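## Requires prior installation of suggested package gbm to run

library(MachineShop)

## Survival response example
library(survival)  # provides Surv() and the veteran dataset

## Fit a gradient boosted model and compute its variable importance
gbm_fit <- fit(Surv(time, status) ~ ., data = veteran, model = GBMModel)
(vi <- varimp(gbm_fit))  # outer parentheses print the result
plot(vi)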