Fits models with an efficient implementation of the gradient boosting framework from Chen & Guestrin.
XGBModel(
nrounds = 100,
...,
objective = character(),
aft_loss_distribution = "normal",
aft_loss_distribution_scale = 1,
base_score = 0.5,
verbose = 0,
print_every_n = 1
)
XGBDARTModel(
eta = 0.3,
gamma = 0,
max_depth = 6,
min_child_weight = 1,
max_delta_step = .(0.7 * is(y, "PoissonVariate")),
subsample = 1,
colsample_bytree = 1,
colsample_bylevel = 1,
colsample_bynode = 1,
alpha = 0,
lambda = 1,
tree_method = "auto",
sketch_eps = 0.03,
scale_pos_weight = 1,
refresh_leaf = 1,
process_type = "default",
grow_policy = "depthwise",
max_leaves = 0,
max_bin = 256,
num_parallel_tree = 1,
sample_type = "uniform",
normalize_type = "tree",
rate_drop = 0,
one_drop = 0,
skip_drop = 0,
...
)
XGBLinearModel(
alpha = 0,
lambda = 0,
updater = "shotgun",
feature_selector = "cyclic",
top_k = 0,
...
)
XGBTreeModel(
eta = 0.3,
gamma = 0,
max_depth = 6,
min_child_weight = 1,
max_delta_step = .(0.7 * is(y, "PoissonVariate")),
subsample = 1,
colsample_bytree = 1,
colsample_bylevel = 1,
colsample_bynode = 1,
alpha = 0,
lambda = 1,
tree_method = "auto",
sketch_eps = 0.03,
scale_pos_weight = 1,
refresh_leaf = 1,
process_type = "default",
grow_policy = "depthwise",
max_leaves = 0,
max_bin = 256,
num_parallel_tree = 1,
...
)
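As a rough illustration of the usage above, each constructor can be called with a subset of arguments to override the defaults. The parameter values below are arbitrary choices for demonstration, and the block assumes the MachineShop package is installed; it is skipped otherwise.

```r
# Sketch only: skipped unless MachineShop is installed.
if (requireNamespace("MachineShop", quietly = TRUE)) {
  library(MachineShop)

  # Tree booster with a smaller learning rate and shallower trees
  tree_model <- XGBTreeModel(nrounds = 200, eta = 0.1, max_depth = 4)

  # DART booster: tree arguments plus dropout settings
  dart_model <- XGBDARTModel(nrounds = 200, rate_drop = 0.1, skip_drop = 0.5)

  # Linear booster with L1 (alpha) and L2 (lambda) regularization
  linear_model <- XGBLinearModel(nrounds = 200, alpha = 0.5, lambda = 1)
}
```

Here nrounds is not a named argument of the booster-specific constructors; it reaches XGBModel through `...`.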
nrounds: number of boosting iterations.
...: model parameters as described below and in the XGBoost documentation, and arguments passed to XGBModel from the other constructors.
objective: optional character string defining the learning task and objective. If not specified, it is set automatically according to the type of the response variable. The following values are available for the supported response types.
factor: "multi:softprob", "binary:logistic" (2 levels only)
numeric: "reg:squarederror", "reg:logistic", "reg:gamma", "reg:tweedie", "rank:pairwise", "rank:ndcg", "rank:map"
PoissonVariate: "count:poisson"
Surv: "survival:aft", "survival:cox"
The first value listed for each response type is its default.
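The default selection described above can be sketched in base R. The helper default_objective below is hypothetical and not part of MachineShop; it only mirrors the first-listed objective for each response type, and the package's internal logic may differ.

```r
# Hypothetical helper (not a MachineShop function): returns the first-listed
# objective for a response variable, following the list above.
default_objective <- function(y) {
  # Class-based checks come first because such objects may also be numeric
  if (inherits(y, "PoissonVariate")) {
    "count:poisson"
  } else if (inherits(y, "Surv")) {
    "survival:aft"
  } else if (is.factor(y)) {
    "multi:softprob"
  } else if (is.numeric(y)) {
    "reg:squarederror"
  } else {
    stop("unsupported response type: ", class(y)[1])
  }
}

default_objective(iris$Species)       # factor response
default_objective(iris$Sepal.Length)  # numeric response
```

Any other listed value, such as "binary:logistic" for a two-level factor, would have to be requested explicitly through the objective argument.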
aft_loss_distribution: character string specifying a distribution for the accelerated failure time objective ("survival:aft") as "extreme", "logistic", or "normal".
aft_loss_distribution_scale: numeric scaling parameter for the accelerated failure time distribution.
base_score: initial prediction score of all observations, global bias.
verbose: numeric value controlling the amount of output printed during model fitting, such that 0 = none, 1 = performance information, and 2 = additional information.
print_every_n: numeric value designating the fitting iterations at which to print output when verbose > 0.
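The general arguments above might be combined as in the following sketch of an accelerated failure time model. It assumes the MachineShop, xgboost, and survival packages are installed and uses the veteran dataset from survival purely as example data; the argument values are illustrative rather than recommended.

```r
# Sketch only: skipped unless all three packages are installed.
if (requireNamespace("MachineShop", quietly = TRUE) &&
    requireNamespace("xgboost", quietly = TRUE) &&
    requireNamespace("survival", quietly = TRUE)) {
  library(MachineShop)
  library(survival)

  aft_model <- XGBTreeModel(
    nrounds = 50,
    objective = "survival:aft",
    aft_loss_distribution = "logistic",
    aft_loss_distribution_scale = 1,
    verbose = 1,        # request performance information during fitting
    print_every_n = 10  # at every 10th iteration rather than every one
  )

  # veteran: lung cancer trial data shipped with the survival package
  aft_fit <- fit(Surv(time, status) ~ ., data = veteran, model = aft_model)
}
```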
eta: shrinkage of variable weights at each iteration to prevent overfitting.
gamma: minimum loss reduction required to split a tree node.
max_depth: maximum tree depth.
min_child_weight: minimum sum of observation weights required of nodes.
max_delta_step, tree_method, sketch_eps, scale_pos_weight, refresh_leaf, process_type, grow_policy, max_leaves, max_bin, num_parallel_tree: other tree booster parameters.
subsample: subsample ratio of the training observations.
colsample_bytree, colsample_bylevel, colsample_bynode: subsample ratio of variables for each tree, level, or split.
alpha, lambda: L1 and L2 regularization terms for variable weights.
sample_type, normalize_type: type of sampling and normalization algorithms.
rate_drop: rate at which to drop trees during the dropout procedure.
one_drop: integer indicating whether to drop at least one tree during the dropout procedure.
skip_drop: probability of skipping the dropout procedure during a boosting iteration.
feature_selector, top_k: character string specifying the feature selection and ordering method, and number of top variables to select in the "greedy" and "thrifty" feature selectors.
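To give some intuition for how nrounds, eta, and base_score interact, the following base R sketch boosts one-split trees (stumps) on a toy regression problem. It is a simplified illustration under squared-error loss and is not the XGBoost algorithm, which also uses second-order gradient information and regularization; the functions fit_stump, predict_stump, boost, and mse are hypothetical names introduced here.

```r
# Toy data: noisy sine curve
set.seed(1)
x <- runif(200)
y <- sin(2 * pi * x) + rnorm(200, sd = 0.2)

# Fit a stump to residuals r by choosing the cut with the smallest
# residual sum of squares among the deciles of x
fit_stump <- function(x, r) {
  cuts <- quantile(x, probs = seq(0.1, 0.9, by = 0.1))
  sse <- sapply(cuts, function(cut) {
    left <- x <= cut
    sum((r[left] - mean(r[left]))^2) + sum((r[!left] - mean(r[!left]))^2)
  })
  cut <- cuts[[which.min(sse)]]
  list(cut = cut, left = mean(r[x <= cut]), right = mean(r[x > cut]))
}

predict_stump <- function(stump, x) {
  ifelse(x <= stump$cut, stump$left, stump$right)
}

# Start from base_score and add nrounds stumps, each shrunk by eta
boost <- function(x, y, nrounds, eta, base_score = 0.5) {
  pred <- rep(base_score, length(y))
  for (i in seq_len(nrounds)) {
    stump <- fit_stump(x, y - pred)               # fit to current residuals
    pred <- pred + eta * predict_stump(stump, x)  # shrink each update by eta
  }
  pred
}

# Training mean squared error for a vector of predictions
mse <- function(pred) mean((y - pred)^2)

mse(boost(x, y, nrounds = 10, eta = 0.3))
mse(boost(x, y, nrounds = 100, eta = 0.3))
```

A smaller eta makes each iteration contribute less, so more iterations are typically needed to reach the same training error; that trade-off is what the two arguments control together.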
MLModel class object.
Response types: factor, numeric, PoissonVariate, Surv
XGBDARTModel: nrounds, eta*, gamma*, max_depth, min_child_weight*, subsample*, colsample_bytree*, rate_drop*, skip_drop*
XGBLinearModel: nrounds, alpha, lambda
XGBTreeModel: nrounds, eta*, gamma*, max_depth, min_child_weight*, subsample*, colsample_bytree*
* excluded from grids by default
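The grid parameters listed above come into play when a model is wrapped for tuning. The following is a minimal sketch, assuming the MachineShop functions TunedModel and expand_params behave as in recent package versions and that xgboost is installed; the grid values are arbitrary, and the final fit can be slow because every grid point is resampled.

```r
# Sketch only: skipped unless MachineShop and xgboost are installed.
if (requireNamespace("MachineShop", quietly = TRUE) &&
    requireNamespace("xgboost", quietly = TRUE)) {
  library(MachineShop)

  # Default grid: only parameters not marked with * are varied
  tuned_default <- TunedModel(XGBTreeModel, grid = 3)

  # Explicit grid that also varies eta, which is excluded by default
  tuned_eta <- TunedModel(
    XGBTreeModel,
    grid = expand_params(nrounds = c(50, 100), eta = c(0.1, 0.3))
  )

  # Fitting selects the best grid point by resampling, then refits
  tuned_fit <- fit(Species ~ ., data = iris, model = tuned_eta)
}
```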
Default values and further model details can be found in the source link below.
In calls to varimp for XGBTreeModel, argument type may be specified as "Gain" (default) for the fractional contribution of each predictor to the total gain of its splits, as "Cover" for the number of observations related to each predictor, or as "Frequency" for the percentage of times each predictor is used in the trees. Variable importance is automatically scaled to range from 0 to 100. To obtain unscaled importance values, set scale = FALSE. See example below.
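# NOT RUN {
## Requires prior installation of suggested package xgboost to run
library(MachineShop)

## Fit a tree booster to a factor response
model_fit <- fit(Species ~ ., data = iris, model = XGBTreeModel)

## Unscaled, model-specific importance based on split frequency
varimp(model_fit, method = "model", type = "Frequency", scale = FALSE)
# }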