MachineShop (version 2.8.0)

XGBModel: Extreme Gradient Boosting Models


Fits models within an efficient implementation of the gradient boosting framework from Chen & Guestrin.


XGBModel(params = list(), nrounds = 1, verbose = 0, print_every_n = 1)

XGBDARTModel(
  objective = NULL, aft_loss_distribution = "normal",
  aft_loss_distribution_scale = 1, base_score = 0.5, eta = 0.3, gamma = 0,
  max_depth = 6, min_child_weight = 1,
  max_delta_step = .(0.7 * is(y, "PoissonVariate")), subsample = 1,
  colsample_bytree = 1, colsample_bylevel = 1, colsample_bynode = 1,
  lambda = 1, alpha = 0, tree_method = "auto", sketch_eps = 0.03,
  scale_pos_weight = 1, refresh_leaf = 1, process_type = "default",
  grow_policy = "depthwise", max_leaves = 0, max_bin = 256,
  num_parallel_tree = 1, sample_type = "uniform", normalize_type = "tree",
  rate_drop = 0, one_drop = 0, skip_drop = 0, ...
)

XGBLinearModel(
  objective = NULL, aft_loss_distribution = "normal",
  aft_loss_distribution_scale = 1, base_score = 0.5, lambda = 0, alpha = 0,
  updater = "shotgun", feature_selector = "cyclic", top_k = 0, ...
)

XGBTreeModel(
  objective = NULL, aft_loss_distribution = "normal",
  aft_loss_distribution_scale = 1, base_score = 0.5, eta = 0.3, gamma = 0,
  max_depth = 6, min_child_weight = 1,
  max_delta_step = .(0.7 * is(y, "PoissonVariate")), subsample = 1,
  colsample_bytree = 1, colsample_bylevel = 1, colsample_bynode = 1,
  lambda = 1, alpha = 0, tree_method = "auto", sketch_eps = 0.03,
  scale_pos_weight = 1, refresh_leaf = 1, process_type = "default",
  grow_policy = "depthwise", max_leaves = 0, max_bin = 256,
  num_parallel_tree = 1, ...
)



params: list of model parameters as described in the XGBoost documentation.


nrounds: maximum number of boosting iterations.


verbose: numeric value controlling the amount of output printed during model fitting, such that 0 = none, 1 = performance information, and 2 = additional information.


print_every_n: numeric value designating the fitting iterations at which to print output when verbose > 0.


objective: character string specifying the learning task and objective. Possible values for supported response variable types are as follows.


factor: "multi:softprob", "binary:logistic" (2 levels only)


numeric: "reg:squarederror", "reg:logistic", "reg:gamma", "reg:tweedie", "rank:pairwise", "rank:ndcg", "rank:map"


PoissonVariate: "count:poisson"


Surv: "survival:cox", "survival:aft"

The first values listed are the defaults for the corresponding response types.
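As a rough illustration of overriding the default objective, the sketch below fits a two-level factor response with "binary:logistic" instead of the factor default. It assumes the xgboost package is installed; the data subset and the nrounds value are arbitrary choices made for the example.

```r
library(MachineShop)

# Two-level factor response: keep two species and drop the unused level
iris2 <- droplevels(subset(iris, Species != "setosa"))

# Override the default factor objective ("multi:softprob");
# nrounds = 20 is an arbitrary illustrative value
model <- XGBTreeModel(objective = "binary:logistic", nrounds = 20)
model_fit <- fit(Species ~ ., data = iris2, model = model)

# Predicted class probabilities for the training data
prob <- predict(model_fit, newdata = iris2, type = "prob")
summary(prob)
```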


aft_loss_distribution: character string specifying the distribution for the accelerated failure time objective ("survival:aft") as "normal", "logistic", or "extreme".


aft_loss_distribution_scale: numeric scaling parameter for the accelerated failure time distribution.


base_score: initial numeric prediction score of all instances, global bias.

eta, gamma, max_depth, min_child_weight, max_delta_step, subsample, colsample_bytree, colsample_bylevel, colsample_bynode, lambda, alpha, tree_method, sketch_eps, scale_pos_weight, refresh_leaf, process_type, grow_policy, max_leaves, max_bin, num_parallel

see the params reference in the XGBoost documentation.


...: arguments passed to XGBModel.
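A minimal sketch of how the ... arguments reach XGBModel: booster parameters are set by name in XGBTreeModel, while nrounds and verbose are passed through. The parameter values are arbitrary illustrations rather than recommendations, and the xgboost package is assumed to be installed.

```r
library(MachineShop)

# eta and max_depth are XGBTreeModel arguments; nrounds and verbose are
# not, so they are passed through ... to XGBModel
# (all values chosen only for illustration)
model <- XGBTreeModel(eta = 0.1, max_depth = 3, nrounds = 50, verbose = 0)

model_fit <- fit(Species ~ ., data = iris, model = model)

# Predicted classes for the training data, cross-tabulated with the truth
pred <- predict(model_fit, newdata = iris)
table(observed = iris$Species, predicted = pred)
```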


MLModel class object.


Response Types:

factor, numeric, PoissonVariate, Surv

Automatic Tuning of Grid Parameters

  • XGBDARTModel: nrounds, max_depth, eta, gamma*, min_child_weight*, subsample, colsample_bytree, rate_drop, skip_drop

  • XGBLinearModel: nrounds, lambda, alpha

  • XGBTreeModel: nrounds, max_depth, eta, gamma*, min_child_weight*, subsample, colsample_bytree

* included only in randomly sampled grid points
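The grid parameters listed above can be tuned by wrapping a model in TunedModel. The following is a sketch only: it assumes that TunedModel accepts a single integer grid size and a CVControl resampling specification, as described in the MachineShop documentation, and that the xgboost package is installed. The grid size and fold count are kept small purely to limit run time.

```r
library(MachineShop)

set.seed(123)

# grid = 2 requests two values per tuning parameter in a regular grid,
# so gamma and min_child_weight (random grids only) are not tuned here
tuned_model <- TunedModel(
  XGBTreeModel,
  grid = 2,
  control = CVControl(folds = 3)
)

# Fitting selects the best grid point by cross-validation and
# refits the model with it on the full data
tuned_fit <- fit(Species ~ ., data = iris, model = tuned_model)

pred <- predict(tuned_fit, newdata = iris)
```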

Default values for the NULL arguments and further model details can be found in the package source code.

In calls to varimp for XGBTreeModel, argument metric may be specified as "Gain" (default) for the fractional contribution of each predictor to the total gain of its splits, as "Cover" for the number of observations related to each predictor, or as "Frequency" for the percentage of times each predictor is used in the trees. Variable importance is automatically scaled to range from 0 to 100. To obtain unscaled importance values, set scale = FALSE. See example below.

See Also

xgboost, fit, resample


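## Requires prior installation of suggested package xgboost to run
library(MachineShop)

# Fit a boosted tree model to the iris data
model_fit <- fit(Species ~ ., data = iris, model = XGBTreeModel)
# Unscaled variable importance based on how often each predictor is used
varimp(model_fit, metric = "Frequency", scale = FALSE)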
