Laurae (version 0.0.0.9001)

lgbm.fi: LightGBM Feature Importance

Description

This function computes the feature importance of a LightGBM model. The model file must be in "workingdir", where "workingdir" is the folder and input_model is the model file name.

Usage

lgbm.fi(model, workingdir = ifelse(is.list(model), model[["Path"]], getwd()),
  input_model = ifelse(is.list(model), model[["Name"]], "lgbm_model.txt"),
  feature_names = NA, ntreelimit = 0, data.table = TRUE)

Arguments

model
Type: list. The model file. If a character vector is provided, it is treated as the model, which is then saved as input_model. If a list is provided, it is used to set up and fetch the correct variables, which you can override by setting the arguments manually. If a single value is provided (like NA), it is ignored and the other arguments are used to fetch the model locally.
workingdir
Type: character. The working directory of the model file. Defaults to ifelse(is.list(model), model[["Path"]], getwd()), which means "take the model working directory if the model list is provided, else take the default working directory".
input_model
Type: character. The file name of the model. Defaults to ifelse(is.list(model), model[["Name"]], "lgbm_model.txt"), which means "take the input model name if the model list is provided, else take 'lgbm_model.txt'".
feature_names
Type: vector of characters. The names of the features, in the order they were fed to LightGBM. Returns column numbers if left as NA. Defaults to NA.
ntreelimit
Type: integer. The number of trees to select, starting from the first tree. Defaults to 0, which applies no tree limit.
data.table
Type: boolean. Whether to return a data.table (TRUE) or a data.frame (FALSE). Defaults to TRUE.
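
The way the workingdir and input_model defaults resolve can be sketched in base R without loading Laurae. This is only an illustration of the two ifelse() expressions shown in Usage: resolve_model_location() is a hypothetical helper, and trained_stub is a made-up list carrying only the "Path" and "Name" fields that those defaults read.

```r
# Hypothetical helper mirroring the documented default expressions.
# It is not part of Laurae; it only shows which values the defaults pick.
resolve_model_location <- function(model) {
  list(
    workingdir  = ifelse(is.list(model), model[["Path"]], getwd()),
    input_model = ifelse(is.list(model), model[["Name"]], "lgbm_model.txt")
  )
}

# Made-up stand-in for a trained model list (assumed fields: Path, Name).
trained_stub <- list(Path = "my_models", Name = "run_01.txt")

# A list supplies its own folder and file name.
from_list <- resolve_model_location(trained_stub)

# A single value such as NA falls back to getwd() and "lgbm_model.txt".
from_na <- resolve_model_location(NA)

print(from_list)
print(from_na)
```

Passing workingdir or input_model explicitly to lgbm.fi overrides these defaults, as described for the model argument.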

Value

A data.table (or data.frame) with 10 columns: c("Feature", "Gain", "Gain_Rel_Ratio", "Gain_Abs_Ratio", "Gain_Std", "Gain_Std_Rel_Ratio", "Gain_Std_Abs_Ratio", "Freq", "Freq_Rel_Ratio", "Freq_Abs_Ratio")
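
A minimal sketch of how one might check that a result carries the column names listed above before using it downstream. The helper missing_fi_columns() and the zero-row placeholder fi_stub are hypothetical and contain no real importance values; with a real model, fi_stub would be replaced by the output of lgbm.fi.

```r
# Column names as listed in the Value section.
expected_cols <- c("Feature", "Gain", "Gain_Rel_Ratio", "Gain_Abs_Ratio",
                   "Gain_Std", "Gain_Std_Rel_Ratio", "Gain_Std_Abs_Ratio",
                   "Freq", "Freq_Rel_Ratio", "Freq_Abs_Ratio")

# Hypothetical helper: returns the expected columns absent from `fi`.
missing_fi_columns <- function(fi, expected = expected_cols) {
  setdiff(expected, colnames(fi))
}

# Zero-row placeholder with the documented column names (no real data).
fi_stub <- data.frame(matrix(numeric(0), nrow = 0, ncol = length(expected_cols)))
colnames(fi_stub) <- expected_cols

# Nothing is missing from the full placeholder.
print(missing_fi_columns(fi_stub))

# Dropping columns makes the helper report the ones that are gone.
print(missing_fi_columns(fi_stub[, c("Feature", "Gain")]))
```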

Examples

## Not run: ------------------------------------
# # Feature importance on a single model without any tree limit.
# lgbm.fi(model = trained, feature_names = colnames(data), ntreelimit = 0)
# 
# # Feature importance on the first model from a cross-validation without any tree limit.
# lgbm.fi(model = trained.cv[["Models"]][[1]], feature_names = colnames(data))
## ---------------------------------------------