SuperLearner (version 2.0-22)

SL.xgboost: XGBoost SuperLearner wrapper

Description

Wrapper for the Extreme Gradient Boosting (XGBoost) package, a variant of gradient boosted machines (GBM), for use within SuperLearner.

Usage

SL.xgboost(Y, X, newX, family, obsWeights, id, ntrees = 1000, max_depth = 4,
  shrinkage = 0.1, minobspernode = 10, params = list(), nthread = 1,
  verbose = 0, save_period = NULL, ...)

Arguments

Y

Outcome variable

X

Covariate data frame

newX

Optional data frame for which to predict the outcome

family

"gaussian" for regression, "binomial" for binary classification, "multinomial" for multiple classification (not yet supported).

obsWeights

Optional observation-level weights (supported but not tested)

id

Optional id to group observations from the same unit (not used currently).

ntrees

How many trees to fit. Too few trees may underfit and too many may overfit; the appropriate number also depends on the shrinkage (learning rate).

max_depth

Maximum depth of each tree. A depth of 1 yields decision stumps, which allow no interactions between covariates.

shrinkage

Learning rate: how much each tree's contribution is shrunk, in order to reduce overfitting.

minobspernode

Minimum number of observations in a tree node; nodes smaller than this will not be split further.

params

Many other xgboost parameters can be customized via this list. See https://github.com/dmlc/xgboost/blob/master/doc/parameter.md

nthread

How many threads (cores) xgboost should use. Generally this should be kept at 1 so that XGBoost does not compete with SuperLearner's own parallelization.

verbose

Verbosity of XGBoost fitting.

save_period

How often (in tree iterations) to save the current model to disk during training. If NULL, the model is not saved during training; if 0, it is saved once at the end.

...

Any remaining arguments (not currently supported).
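
To see these arguments in context, here is a minimal, self-contained example that uses SL.xgboost as one learner in a SuperLearner library. The simulated data and library choice are illustrative, not part of the original documentation:

library(SuperLearner)

# Simulated data for illustration only.
set.seed(1)
n <- 200
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
Y <- rbinom(n, 1, plogis(X$x1 - 0.5 * X$x2))

# SL.xgboost is invoked internally on each cross-validation fold;
# nthread stays at its default of 1 so XGBoost does not compete with
# SuperLearner's own parallelization.
sl <- SuperLearner(Y = Y, X = X, family = binomial(),
                   SL.library = c("SL.mean", "SL.xgboost"))
sl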

Details

The performance of XGBoost, like that of GBM, is sensitive to its configuration settings. It is therefore best to create multiple configurations using create.SL.xgboost and allow SuperLearner to choose the best weights based on cross-validated performance.
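
A minimal sketch of that workflow, reusing Y and X from the example above. The grid values are illustrative, and this assumes create.SL.xgboost invisibly returns the generated learner names in its $names element, which matches recent versions of the package:

# Illustrative tuning grid: 2 * 2 * 2 * 1 = 8 learner configurations.
tune <- list(ntrees = c(100, 500),
             max_depth = c(2, 4),
             shrinkage = c(0.1, 0.01),
             minobspernode = 10)
learners <- create.SL.xgboost(tune = tune)

# Each grid row becomes a wrapper function (created in the global
# environment by default); pass their names as the library.
sl <- SuperLearner(Y = Y, X = X, family = binomial(),
                   SL.library = learners$names)
sl$coef  # cross-validated weights across the configurations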

If you run into errors, please first try installing the latest version of XGBoost from drat, as described here: https://github.com/dmlc/xgboost/blob/master/doc/build.md#r-package-installation
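
A sketch of that drat-based installation, following the historical instructions at the link above (the repository URL and recommended route may have changed since this page was written):

install.packages("drat", repos = "https://cran.rstudio.com")
drat:::addRepo("dmlc")
install.packages("xgboost", repos = "http://dmlc.ml/drat/", type = "source")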