
iml (version 0.6.0)

FeatureImp: Feature importance

Description

FeatureImp computes feature importances for prediction models. The importance is measured as the factor by which the model's prediction error increases when the feature is shuffled.
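The underlying idea can be sketched in a few lines of base R. This is a conceptual illustration of permutation importance for a single feature, not the iml implementation; the simulated data and the linear model are invented for the example.

```r
# Conceptual sketch (base R only): permutation importance for one feature.
set.seed(42)
n <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 3 * x1 + 0.1 * x2 + rnorm(n, sd = 0.5)
dat <- data.frame(y, x1, x2)
mod <- lm(y ~ x1 + x2, data = dat)

mse <- function(actual, predicted) mean((actual - predicted)^2)
original.error <- mse(dat$y, predict(mod, dat))

# Shuffle x1 and measure how much the prediction error increases
shuffled <- dat
shuffled$x1 <- sample(shuffled$x1)
permuted.error <- mse(dat$y, predict(mod, shuffled))

# Importance as the factor by which the error grows; > 1 means x1 matters
importance <- permuted.error / original.error
```

Since x1 drives most of the outcome here, shuffling it inflates the error by a large factor, while shuffling x2 would barely change it.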

Format

R6Class object.

Usage

imp = FeatureImp$new(predictor, loss, method = "shuffle", n.repetitions = 3, run = TRUE)

plot(imp)
imp$results
print(imp)

Arguments

For FeatureImp$new():

predictor:

(Predictor) The object (created with Predictor$new()) holding the machine learning model and the data.

loss:

(`character(1)` | function) The loss function. Either the name of a loss (e.g. "ce" for classification or "mse") or a loss function. See Details for allowed losses.

method:

(`character(1)`) Either "shuffle" or "cartesian". See Details.

n.repetitions:

(`numeric(1)`) How often should the shuffling of the feature be repeated? Ignored if method is set to "cartesian". The higher the number of repetitions, the more stable the results become.

parallel:

(`logical(1)`) Should the method be executed in parallel? If TRUE, a cluster must be registered beforehand; see ?foreach::foreach.

run:

(`logical(1)`) Should the interpretation method be run?

Fields

original.error:

(`numeric(1)`) The loss of the model before perturbing features.

predictor:

(Predictor) The prediction model that was analysed.

results:

(data.frame) data.frame with the results of the feature importance computation.

Methods

loss(actual, predicted)

The loss function. Can also be applied to data: object$loss(actual, predicted)

plot()

method to plot the feature importances. See plot.FeatureImp

run()

[internal] method to run the interpretability method. Use obj$run(force = TRUE) to force a rerun.

clone()

[internal] method to clone the R6 object.

initialize()

[internal] method to initialize the R6 object.

Details

Read the Interpretable Machine Learning book to learn in detail about feature importance: https://christophm.github.io/interpretable-ml-book/feature-importance.html

Two permutation schemes are implemented:

  • shuffle: A simple shuffling of the feature values, yielding n perturbed instances per feature (fast)

  • cartesian: Matching every instance with the feature value of all other instances, yielding n x (n-1) perturbed instances per feature (very slow)
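The instance counts of the two schemes can be illustrated on a toy feature vector. This sketch only demonstrates how many perturbed values each scheme produces; the vector and variable names are invented for the example.

```r
# Toy illustration of the two permutation schemes for n = 4 feature values.
x <- c(10, 20, 30, 40)
n <- length(x)

# "shuffle": one random permutation yields n perturbed values
set.seed(1)
shuffled <- sample(x)

# "cartesian": each instance receives every OTHER instance's feature value,
# yielding n * (n - 1) perturbed values
cartesian <- unlist(lapply(seq_len(n), function(i) x[-i]))

length(shuffled)   # n perturbed values
length(cartesian)  # n * (n - 1) perturbed values
```

For the 506 rows of the Boston data, "cartesian" would therefore evaluate 506 x 505 perturbed instances per feature, which explains why it is marked as very slow.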

The loss function can be specified either as a string or by passing a function to FeatureImp$new(). A custom loss function must have the signature function(actual, predicted) and return a single performance value, not a vector. Using a string is a shortcut to the loss functions from the Metrics package. Allowed losses are: "ce", "f1", "logLoss", "mae", "mse", "rmse", "mape", "mdae", "msle", "percent_bias", "rae", "rmsle", "rse", "rrse", "smape". See library(help = "Metrics") for a list of functions.
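A custom loss that satisfies this signature can be defined as follows. The function name median.abs.error is invented for this illustration; the commented FeatureImp$new() call assumes a Predictor object like the one built in the Examples below.

```r
# A custom loss must take (actual, predicted) and return a single number.
median.abs.error <- function(actual, predicted) {
  median(abs(actual - predicted))
}

# Returns one value, not a vector, as required:
median.abs.error(c(1, 2, 3), c(1.5, 2, 4))  # 0.5

# It can then be handed to FeatureImp instead of a string:
# imp <- FeatureImp$new(predictor, loss = median.abs.error)
```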

References

Fisher, A., Rudin, C., and Dominici, F. (2018). Model Class Reliance: Variable Importance Measures for any Machine Learning Model Class, from the "Rashomon" Perspective. Retrieved from http://arxiv.org/abs/1801.01489

Examples

if (require("rpart")) {
# We train a tree on the Boston dataset:
data("Boston", package = "MASS")
tree = rpart(medv ~ ., data = Boston)
y = Boston$medv
X = Boston[-which(names(Boston) == "medv")]
mod = Predictor$new(tree, data = X, y = y)


# Compute feature importances as the performance drop in mean absolute error
imp = FeatureImp$new(mod, loss = "mae")

# Plot the results directly
plot(imp)


# Since the result is a ggplot object, you can extend it: 
if (require("ggplot2")) {
  plot(imp) + theme_bw()
  # If you want to do your own thing, just extract the data: 
  imp.dat = imp$results
  head(imp.dat)
  ggplot(imp.dat, aes(x = feature, y = importance)) + geom_point() + 
  theme_bw()
}

# FeatureImp also works with multiclass classification.
# In this case, the importance measure takes all classes into account.
tree = rpart(Species ~ ., data = iris)
X = iris[-which(names(iris) == "Species")]
y = iris$Species
mod = Predictor$new(tree, data = X, y = y, type = "prob") 

# For some models we have to specify additional arguments for the predict function
imp = FeatureImp$new(mod, loss = "ce")
plot(imp)

# For multiclass classification models, you can choose to only compute performance for one class. 
# Make sure to adapt y
mod = Predictor$new(tree, data = X, y = y == "virginica", 
 type = "prob", class = "virginica") 
imp = FeatureImp$new(mod, loss = "ce")
plot(imp)
}
