xgb.shap.data: Prepare data for SHAP plots. To be used in xgb.plot.shap, xgb.plot.shap.summary, etc. Internal utility function.

Description

Prepares the observation data and the matching SHAP contribution values for SHAP plots. It is used by xgb.plot.shap, xgb.plot.shap.summary, and related plotting functions. This is an internal utility function.

Usage

xgb.shap.data(
  data,
  shap_contrib = NULL,
  features = NULL,
  top_n = 1,
  model = NULL,
  trees = NULL,
  target_class = NULL,
  approxcontrib = FALSE,
  subsample = NULL,
  max_observations = 1e+05
)

Value

A list containing: 'data', a matrix containing sample observations and their feature values; 'shap_contrib', a matrix containing the SHAP contribution values for these observations.
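
To make the shape of the return value concrete, here is a minimal base R sketch that builds a list with the same two components from made-up matrices. It is not the package's implementation: fake_data, fake_shap, and keep are hypothetical names, and the assumption that both matrices are restricted to the same selected feature columns is inferred from the description above.

```r
# Hypothetical stand-ins for real inputs: 6 observations, 3 features.
set.seed(1)
feature_names <- c("f1", "f2", "f3")
fake_data <- matrix(rnorm(18), nrow = 6, dimnames = list(NULL, feature_names))
# Assumed layout: one SHAP column per feature, rows in the same order as the data.
fake_shap <- matrix(rnorm(18), nrow = 6, dimnames = list(NULL, feature_names))

# Keep the same features in both matrices so the two components stay aligned.
keep <- c("f1", "f3")
res <- list(
  data = fake_data[, keep, drop = FALSE],
  shap_contrib = fake_shap[, keep, drop = FALSE]
)

str(res)
```

Row i of res$data and row i of res$shap_contrib describe the same observation, which is what lets a plotting function draw feature value against SHAP contribution.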

Arguments

data: data as a matrix or dgCMatrix.
shap_contrib: a matrix of SHAP contributions that was computed earlier for the above data. When it is NULL, it is computed internally using model and data.
features: a vector of either column indices or feature names to plot. When it is NULL, feature importance is calculated and the top_n highest-ranked features are taken.
top_n: the number of most important features to take when features is NULL. Must be in the range [1, 100].
model: an xgb.Booster model. It has to be provided when either shap_contrib or features is NULL.
trees: passed to xgb.importance when features = NULL.
target_class: only relevant for multiclass models. When it is set to a 0-based class index, only SHAP contributions for that specific class are used. If it is not set, SHAP importances are averaged over all classes.
approxcontrib: passed to predict.xgb.Booster when shap_contrib = NULL.
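subsample: a random fraction of data points to use for plotting. When it is NULL, it is set so that at most max_observations data points (100K by default) are used.
max_observations: the maximum number of data points to use when subsample is NULL. Defaults to 1e+05.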
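
The subsample and target_class rules described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package source: sample_rows is a hypothetical helper, the truncation to an integer row count is an assumption, and the per-class SHAP values are assumed to arrive as a list with one matrix per class.

```r
# Hypothetical helper, not part of xgboost: picks the row indices to keep.
sample_rows <- function(n_rows, subsample = NULL, max_observations = 1e5) {
  # Assumed rule: with no fraction given, cap the sample at max_observations.
  n_keep <- if (is.null(subsample)) {
    min(n_rows, max_observations)
  } else {
    as.integer(subsample * n_rows)
  }
  sample(seq_len(n_rows), n_keep)
}

set.seed(42)
idx_small <- sample_rows(500)                    # fewer rows than the cap: all kept
idx_large <- sample_rows(250000)                 # more rows than the cap: capped
idx_frac  <- sample_rows(1000, subsample = 0.5)  # explicit fraction: half the rows

# Hypothetical per-class SHAP matrices for a 3-class model (2 rows, 2 features).
shap_by_class <- list(matrix(1, 2, 2), matrix(2, 2, 2), matrix(3, 2, 2))

# target_class is 0-based, while R lists are 1-based, hence the + 1.
target_class <- 1
shap_one <- shap_by_class[[target_class + 1]]
```

With these inputs, idx_small has 500 entries, idx_large has 100000, and idx_frac has 500. shap_one is the second matrix in the list, because class index 1 refers to the second class.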