Prepare data for SHAP plots. To be used in xgb.plot.shap, xgb.plot.shap.summary, etc. Internal utility function.
xgb.shap.data(
data,
shap_contrib = NULL,
features = NULL,
top_n = 1,
model = NULL,
trees = NULL,
target_class = NULL,
approxcontrib = FALSE,
subsample = NULL,
max_observations = 1e+05
)
A list containing: 'data', a matrix containing sample observations and their feature values; 'shap_contrib', a matrix containing the SHAP contribution values for these observations.
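A minimal usage sketch follows. It assumes the xgboost package is installed and that the function, being internal, is reached through the `:::` operator. The synthetic data, the column names f1 to f4, and the training settings are illustrative assumptions, not taken from the package documentation.

```r
# Sketch only: guarded so it is skipped when xgboost is not installed.
if (requireNamespace("xgboost", quietly = TRUE)) {
  set.seed(42)
  # Hypothetical toy data: 200 rows, 4 named features.
  x <- matrix(rnorm(200 * 4), ncol = 4,
              dimnames = list(NULL, c("f1", "f2", "f3", "f4")))
  y <- as.numeric(x[, "f1"] + 0.5 * x[, "f2"] > 0)

  dtrain <- xgboost::xgb.DMatrix(data = x, label = y)
  bst <- xgboost::xgb.train(
    params = list(objective = "binary:logistic", max_depth = 2),
    data = dtrain,
    nrounds = 5
  )

  # shap_contrib is left as NULL, so the model must be supplied.
  # features is given explicitly, so no importance ranking is needed.
  res <- xgboost:::xgb.shap.data(data = x, model = bst,
                                 features = c("f1", "f2"))
  str(res)  # a list with elements 'data' and 'shap_contrib'
}
```

Because the function is not exported, its signature may change between releases; xgb.plot.shap and xgb.plot.shap.summary are the supported entry points.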
data: the data, as a matrix or dgCMatrix.
shap_contrib: a matrix of SHAP contributions that was computed earlier for the above data. When it is NULL, it is computed internally using model and data.
features: a vector of either column indices or feature names to plot. When it is NULL, feature importance is calculated and the top_n highest-ranked features are taken.
top_n: when features is NULL, the top_n most important features in the model are taken. The value must lie in the range [1, 100].
model: an xgb.Booster model. It has to be provided when either shap_contrib or features is missing.
trees: passed to xgb.importance when features = NULL.
target_class: only relevant for multiclass models. When it is set to a 0-based class index, only SHAP contributions for that specific class are used. If it is not set, SHAP importances are averaged over all classes.
approxcontrib: passed to predict.xgb.Booster when shap_contrib = NULL.
subsample: a random fraction of data points to use for plotting. When it is NULL, it is set so that up to 100K data points are used.
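The row-sampling rule described for subsample can be sketched in base R. This illustrates the documented behaviour and is not the package's actual code: the helper name pick_rows is hypothetical, and treating max_observations as the cap applied when subsample is NULL is an assumption based on its default of 1e+05.

```r
# Hypothetical helper (not part of xgboost): choose row indices for plotting.
pick_rows <- function(n_rows, subsample = NULL, max_observations = 1e5) {
  n_keep <- if (is.null(subsample)) {
    # No fraction given: keep every row, up to the cap.
    min(n_rows, max_observations)
  } else {
    # Fraction given: keep that share of the rows.
    as.integer(subsample * n_rows)
  }
  # Draw n_keep distinct row indices at random.
  sample.int(n_rows, n_keep)
}

set.seed(1)
length(pick_rows(250000))                 # 100000 (capped)
length(pick_rows(500))                    # 500 (all rows kept)
length(pick_rows(500, subsample = 0.5))   # 250
```

Sampling without replacement keeps each observation at most once, so the returned data and SHAP matrices would stay row-aligned when indexed with the same vector.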