creditmodel (version 1.0)

xgb_filter: Select Features using XGB

Description

xgb_filter selects important features using xgboost.

Usage

xgb_filter(dat_train, dat_test = NULL, target = "flag",
  pos_flag = NULL, x_list = NULL, occur_time = NULL,
  ex_cols = NULL, xgb_params = list(nrounds = 1000, max.depth = 6, eta
  = 0.1, min_child_weight = 1, subsample = 1, colsample_bytree = 1, gamma =
  0, max_delta_step = 0, early_stopping_rounds = 100, eval_metric = "auc",
  objective = "binary:logistic"), cv_folds = 3, cp = NULL, seed = 46,
  vars_name = TRUE, note = FALSE, save_data = FALSE,
  file_name = NULL, dir_path = tempdir(), ...)

Arguments

dat_train

A data.frame with independent variables and target variable.

dat_test

A data.frame of test data. Default is NULL.

target

The name of target variable.

pos_flag

The value of positive class of target variable, default: "1".

x_list

Names of independent variables.

occur_time

The name of the variable that represents the time at which each observation takes place.

ex_cols

A list of excluded variables. Regular expressions can also be used to match variable names. Default is NULL.
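
As a rough illustration of how a regular expression in ex_cols can match variable names, the base R sketch below applies the pattern from the Examples section to a hypothetical set of column names. How xgb_filter applies the pattern internally is not documented here, so the use of grepl is an assumption.

```r
# Hypothetical column names, loosely modelled on the UCICreditCard data
col_names <- c("ID", "apply_date", "LIMIT_BAL", "SEX", "PAY_0",
               "default.payment.next.month")

# Pattern taken from the Examples section of this page
ex_cols <- "ID$|date$|default.payment.next.month$"

# Assumption: exclusion behaves like a regex match against each column name
excluded <- col_names[grepl(ex_cols, col_names)]
kept <- setdiff(col_names, excluded)

print(excluded)
print(kept)
```

Each alternative in the pattern ends with `$`, so it only matches names that end with that text.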

xgb_params

Parameters of xgboost. The complete list of parameters is available at: http://xgboost.readthedocs.io/en/latest/parameter.html.
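
One way to change a few parameters while keeping the rest is to start from the defaults shown in the Usage section and override individual entries. The sketch below uses utils::modifyList from base R. The override values are hypothetical, and passing the complete list is a cautious choice, since this page does not say whether xgb_filter fills in entries missing from a partial list.

```r
# Defaults copied from the Usage section of this page
default_params <- list(
  nrounds = 1000, max.depth = 6, eta = 0.1, min_child_weight = 1,
  subsample = 1, colsample_bytree = 1, gamma = 0, max_delta_step = 0,
  early_stopping_rounds = 100, eval_metric = "auc",
  objective = "binary:logistic"
)

# Hypothetical overrides: fewer boosting rounds and a smaller learning rate
my_params <- utils::modifyList(default_params, list(nrounds = 200, eta = 0.05))

str(my_params)
```

The resulting list could then be supplied as xgb_params = my_params.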

cv_folds

Number of cross-validation folds. Default is 3.

cp

Threshold of XGB feature's Gain. Default is 1/number of independent variables.
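
To make the default threshold concrete, the sketch below applies the 1/number-of-variables rule to a hypothetical importance table. The Gain values are invented for illustration, and keeping features whose Gain is at least cp (rather than strictly greater) is an assumption about the comparison.

```r
# Hypothetical Gain values for four features (not real model output)
importance <- data.frame(
  Feature = c("LIMIT_BAL", "PAY_0", "AGE", "SEX"),
  Gain = c(0.30, 0.55, 0.10, 0.05),
  stringsAsFactors = FALSE
)

# Default threshold described above: 1 / number of independent variables
cp <- 1 / nrow(importance)

# Assumption: a feature is kept when its Gain is at least cp
selected <- importance$Feature[importance$Gain >= cp]

print(cp)
print(selected)
```

With four variables the threshold is 0.25, so only features carrying more than an equal share of the total Gain survive.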

seed

Random number seed. Default is 46.

vars_name

Logical. If TRUE, output a list of the filtered variables; if FALSE, output a table with the detailed IV and PSI values of each variable. Default is TRUE.

note

Logical. Whether to output info messages. Default is FALSE.

save_data

Logical. Whether to save the results in a locally specified folder. Default is FALSE.

file_name

The file name for the saved results. The signature default is NULL; the documented default name is "Featrue_importance_XGB".

dir_path

The path of the folder in which the results files are saved. Default is tempdir().

...

Other parameters to pass to xgb_params.

Value

Selected variables.

See Also

psi_iv_filter, gbm_filter, feature_select_wrapper

Examples

# NOT RUN {
# Load the package that provides xgb_filter and the UCICreditCard data
library(creditmodel)
xgb_filter(dat_train = UCICreditCard[1:1000, c(2, 4, 8:9, 26)], dat_test = NULL,
  target = "default.payment.next.month", occur_time = "apply_date", cv_folds = 1,
  ex_cols = "ID$|date$|default.payment.next.month$", vars_name = FALSE)
# }