psi_iv_filter selects important and stable features using Information Value (IV) and the Population Stability Index (PSI).
psi_iv_filter
psi_iv_filter(dat, dat_test = NULL, target, x_list = NULL, breaks_list = NULL, pos_flag = NULL, ex_cols = NULL, occur_time = NULL, oot_pct = 0.7, psi_i = 0.1, iv_i = 0.01, vars_name = FALSE, note = FALSE, parallel = FALSE, save_data = FALSE, file_name = NULL, dir_path = tempdir(), ...)
dat A data.frame with independent variables and target variable.
dat_test A data.frame of test data. Default is NULL.
target The name of the target variable.
x_list Names of independent variables.
breaks_list A table containing a list of splitting points for each independent variable. Default is NULL.
pos_flag The value of the positive class of the target variable, default: "1".
ex_cols A list of excluded variables. Regular expressions can also be used to match variable names. Default is NULL.
occur_time The name of the variable that represents the time at which each observation takes place.
oot_pct Proportion of observations retained for the out-of-time (OOT) test, used mainly to calculate PSI. Default is 0.7.
psi_i The maximum threshold of PSI. 0 <= psi_i <= 1; values from 0.05 to 0.2 usually work well. Default: 0.1
iv_i The minimum threshold of IV. 0 < iv_i; values from 0.01 to 0.1 usually work well. Default: 0.01
vars_name Logical. If TRUE, output only the list of filtered variables; if FALSE, output a table with the detailed IV and PSI value of each variable. Default is FALSE.
note Logical, outputs info. Default is FALSE.
parallel Logical, parallel computing. Default is FALSE.
save_data Logical, save results in a locally specified folder. Default is FALSE.
file_name The name for periodically saved results files. Default is "Featrue_importance_IV_PSI".
dir_path The path for periodically saved results files. Default is tempdir().
... Other parameters.
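The thresholds psi_i and iv_i are easier to choose with the underlying quantities in view. The base R sketch below uses the common textbook definitions of IV and PSI on a binned variable. It is not taken from the package source, and the package may bin and compute differently. The helper names bin_share, calc_iv and calc_psi are hypothetical, the data are simulated, and the split assumes that the earliest oot_pct share of observations forms the reference sample.

```r
# Illustrative helpers (hypothetical names, not part of the package API).

# Share of observations per bin, floored at eps to avoid log(0).
bin_share <- function(bins, levels, eps = 1e-6) {
  counts <- table(factor(bins, levels = levels))
  pmax(as.numeric(counts) / sum(counts), eps)
}

# IV: sum over bins of (pos_share - neg_share) * log(pos_share / neg_share).
calc_iv <- function(bins, y, pos_flag = 1) {
  lv <- sort(unique(bins))
  pos <- bin_share(bins[y == pos_flag], lv)
  neg <- bin_share(bins[y != pos_flag], lv)
  sum((pos - neg) * log(pos / neg))
}

# PSI: sum over bins of (actual - expected) * log(actual / expected).
calc_psi <- function(bins_expected, bins_actual) {
  lv <- sort(unique(c(bins_expected, bins_actual)))
  e <- bin_share(bins_expected, lv)
  a <- bin_share(bins_actual, lv)
  sum((a - e) * log(a / e))
}

# Simulated data: one predictive feature x, a binary target y, a date column.
set.seed(1)
n <- 2000
dat <- data.frame(
  apply_date = as.Date("2020-01-01") + sort(sample(0:364, n, replace = TRUE)),
  x = rnorm(n)
)
dat$y <- rbinom(n, 1, plogis(-1.5 + 0.8 * dat$x))

# Time-ordered split (assumption: earliest oot_pct share is the reference).
oot_pct <- 0.7
dat <- dat[order(dat$apply_date), ]
cut_row <- floor(oot_pct * n)
train <- dat[seq_len(cut_row), ]
oot <- dat[(cut_row + 1):n, ]

# Bin x into quintiles learned on the reference sample only.
inner <- quantile(train$x, probs = c(0.2, 0.4, 0.6, 0.8))
train_bins <- findInterval(train$x, inner)
oot_bins <- findInterval(oot$x, inner)

iv <- calc_iv(train_bins, train$y)
psi <- calc_psi(train_bins, oot_bins)

# Keep the feature if it is informative enough and stable enough.
psi_i <- 0.1
iv_i <- 0.01
keep <- iv >= iv_i && psi <= psi_i
```

Under these definitions a feature passes only when both conditions hold, so raising iv_i or lowering psi_i makes the filter stricter.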
A list with the following elements:
Feature Selected variables.
IV IV of variables.
PSI PSI of variables.
xgb_filter, gbm_filter, feature_select_wrapper
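# NOT RUN {
# UCICreditCard is the example data set shipped with the creditmodel package.
library(creditmodel)
psi_iv_filter(dat = UCICreditCard[1:1000, c(2, 4, 8:9, 26)],
              target = "default.payment.next.month",
              occur_time = "apply_date", parallel = FALSE)
# }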