Compare SHAP contributions of different features.
xgb.ggplot.shap.summary(
data,
shap_contrib = NULL,
features = NULL,
top_n = 10,
model = NULL,
trees = NULL,
target_class = NULL,
approxcontrib = FALSE,
subsample = NULL
)

xgb.plot.shap.summary(
data,
shap_contrib = NULL,
features = NULL,
top_n = 10,
model = NULL,
trees = NULL,
target_class = NULL,
approxcontrib = FALSE,
subsample = NULL
)
data
data as a matrix or dgCMatrix.

shap_contrib
a matrix of SHAP contributions that was computed earlier for the above data. When it is NULL, it is computed internally using model and data.

features
a vector of either column indices or of feature names to plot. When it is NULL, feature importance is calculated, and the top_n highest-ranked features are taken.

top_n
when features is NULL, the top_n most important features in the model are taken. The value must lie in [1, 100].

model
an xgb.Booster model. It has to be provided when either shap_contrib or features is missing.

trees
passed to xgb.importance when features = NULL.

target_class
is only relevant for multiclass models. When it is set to a 0-based class index, only SHAP contributions for that specific class are used. If it is not set, SHAP importances are averaged over all classes.

approxcontrib
passed to predict.xgb.Booster when shap_contrib = NULL.

subsample
a random fraction of data points to use for plotting. When it is NULL, it is set so that up to 100K data points are used.
A ggplot2 object.
A point plot (each point representing one sample from data) is produced for each feature, with the points plotted on the SHAP value axis. Each point (observation) is coloured based on its feature value. The plot hence shows which features contribute negatively or positively to the model prediction, and whether the contribution differs for larger or smaller values of the feature. It effectively replicates the summary_plot function from https://github.com/slundberg/shap.
xgb.plot.shap, xgb.ggplot.shap.summary, https://github.com/slundberg/shap
# See the examples for xgb.plot.shap.
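As a complement, here is a hedged sketch of a multiclass call that uses target_class and an explicit features vector. It is not taken from the package examples: the iris data, the hyperparameters and the variable names (X_iris, y_iris, bst_multi, p_class) are assumptions made for illustration, and it again requires ggplot2 and data.table to be installed.

```r
library(xgboost)

# Illustrative three-class problem built from the iris data set.
X_iris <- as.matrix(iris[, 1:4])
y_iris <- as.integer(iris$Species) - 1  # class labels must be 0-based

bst_multi <- xgb.train(
  params = list(objective = "multi:softprob", num_class = 3, max_depth = 2),
  data = xgb.DMatrix(X_iris, label = y_iris),
  nrounds = 10
)

# Plot SHAP contributions for the third class only (0-based index 2),
# restricted to two features named explicitly instead of relying on top_n.
p_class <- xgb.ggplot.shap.summary(
  X_iris,
  model = bst_multi,
  features = c("Petal.Length", "Petal.Width"),
  target_class = 2
)
print(p_class)
```

Omitting target_class would instead average the SHAP importances over all three classes, as described for that argument.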