
lightgbm (version 4.5.0)

lgb.configure_fast_predict: Configure Fast Single-Row Predictions

Description

Pre-configures a LightGBM model object to produce fast single-row predictions for a given input data type, prediction type, and parameters.

Usage

lgb.configure_fast_predict(
  model,
  csr = FALSE,
  start_iteration = NULL,
  num_iteration = NULL,
  type = "response",
  params = list()
)

Value

The same model that was passed as input, invisibly, with the desired configuration stored inside it and available to be used in future calls to predict.lgb.Booster.

Arguments

model

LightGBM model object (class lgb.Booster).

The object will be modified in-place.

csr

Whether the prediction function is going to be called on sparse CSR inputs. If FALSE, it is assumed that predictions will be made on single-row regular R matrices.

start_iteration

Integer or NULL (default: NULL). Start index of the iteration to predict. If NULL or <= 0, starts from the first iteration.

num_iteration

Integer or NULL (default: NULL). Limits the number of iterations used in the prediction. If NULL, the best iteration is used when it exists and start_iteration is NULL or <= 0; otherwise, all iterations from start_iteration are used. If <= 0, all iterations from start_iteration are used (no limit).

type

Type of prediction to output. Allowed types are:

  • "response": will output the predicted score according to the objective function being optimized (depending on the link function that the objective uses), after applying any necessary transformations - for example, for objective="binary", it will output class probabilities.

  • "class": for classification objectives, will output the class with the highest predicted probability. For other objectives, will output the same as "response". Note that "class" is not a supported type for this function (see Details for a workaround).

  • "raw": will output the non-transformed numbers (sum of predictions from boosting iterations' results) from which the "response" number is produced for a given objective function - for example, for objective="binary", this corresponds to log-odds. For many objectives such as "regression", since no transformation is applied, the output will be the same as for "response".

  • "leaf": will output the index of the terminal node / leaf at which each observation falls in each tree in the model, as integers, with one column per tree.

  • "contrib": will return the per-feature contributions for each prediction, including an intercept (each feature will produce one column).

Note that, if using custom objectives, types "class" and "response" will not be available and will default towards using "raw" instead.

If the model was fit through function lightgbm and it was passed a factor as labels, passing the prediction type through params instead of through this argument might result in factor levels for classification objectives not being applied correctly to the resulting output.

New in version 4.0.0

params

A list of additional named parameters. See the "Predict Parameters" section of the documentation for a list of parameters and valid values. Where these conflict with the values of keyword arguments to this function, the values in params take precedence.

Details

Calling this function multiple times with different parameters might not override the previous configuration and might trigger undefined behavior.

Any saved configuration for fast predictions might be lost after making a single-row prediction of a different type than what was configured (except for types "response" and "class", which can be switched between each other at any time without losing the configuration).

In some situations, setting a fast prediction configuration for one type of prediction might cause the prediction function to keep using that configuration for single-row predictions even if the requested type of prediction is different from what was configured.

Note that this function will not accept argument type="class" - for such cases, one can pass type="response" to this function and then type="class" to the predict function - the fast configuration will not be lost or altered if the switch is between "response" and "class".
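The "response"/"class" workaround described above can be sketched as follows. This is a minimal illustration, not part of the official examples: the choice of mtcars with the am column as a binary label, and the parameter values, are assumptions made only to keep the example small.

```r
library(lightgbm)

# Assumed toy setup: predict transmission type (am, coded 0/1) from the
# remaining mtcars columns with a binary objective.
data(mtcars)
X <- as.matrix(mtcars[, setdiff(colnames(mtcars), "am")])
y <- mtcars$am
dtrain <- lgb.Dataset(X, label = y, params = list(max_bin = 5L))
model <- lgb.train(
  params = list(objective = "binary", min_data_in_leaf = 2L, num_threads = 1L)
  , data = dtrain
  , nrounds = 5L
  , verbose = -1L
)

# type = "class" is not accepted here, so configure for "response" instead.
lgb.configure_fast_predict(model, type = "response")

x_single <- X[1L, , drop = FALSE]

# Both calls are single-row predictions; switching between "response" and
# "class" is documented as not losing the fast configuration.
prob <- predict(model, x_single, type = "response")
cls <- predict(model, x_single, type = "class")
```

Here prob is a predicted probability and cls is the corresponding class label derived from it.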

The configuration does not survive de-serializations, so it has to be generated anew in every R process that is going to use it (e.g. if loading a model object through readRDS, whatever configuration was there previously will be lost).
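As a rough sketch of what this means in practice, the snippet below saves a configured model with saveRDS, loads it back, and configures the loaded copy again before predicting. The temporary file and the explicit call to lgb.restore_handle are choices made for this illustration (the handle is normally restored automatically on predict); the data and parameters mirror the Examples section.

```r
library(lightgbm)

data(mtcars)
X <- as.matrix(mtcars[, -1L])
y <- mtcars[, 1L]
dtrain <- lgb.Dataset(X, label = y, params = list(max_bin = 5L))
model <- lgb.train(
  params = list(objective = "regression", min_data_in_leaf = 2L, num_threads = 1L)
  , data = dtrain
  , nrounds = 5L
  , verbose = -1L
)

# Configure and predict in the "original" session.
lgb.configure_fast_predict(model)
x_single <- X[11L, , drop = FALSE]
pred_before <- predict(model, x_single)

# Round-trip through an RDS file (temporary file used for illustration).
model_file <- tempfile(fileext = ".rds")
saveRDS(model, model_file)
model_loaded <- readRDS(model_file)

# The fast-predict configuration is not stored in the file, so it is
# generated anew on the loaded object.
lgb.restore_handle(model_loaded)
lgb.configure_fast_predict(model_loaded)
pred_after <- predict(model_loaded, x_single)
```

The predictions themselves are unaffected by the round trip; only the speed-up has to be set up again.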

Requesting a different prediction type or passing parameters to predict.lgb.Booster will cause it to ignore the fast-predict configuration and take the slow route instead. Be aware, however, that an existing configuration might not always be overridden by supplying different parameters or a different prediction type, so check that the output is what was expected whenever a single-row prediction is requested for something other than what was configured.

Note that, if configuring a non-default prediction type (such as leaf indices), then that type must also be passed in the call to predict.lgb.Booster in order for it to use the configuration. This also applies for start_iteration and num_iteration, but the params list must be empty in the call to predict.
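A minimal sketch of configuring a non-default type, assuming the same mtcars regression setup as in the Examples section: the model is configured for leaf indices, and type = "leaf" is then repeated in the call to predict. The multi-row prediction is computed beforehand only to have a reference value to compare against.

```r
library(lightgbm)

data(mtcars)
X <- as.matrix(mtcars[, -1L])
y <- mtcars[, 1L]
dtrain <- lgb.Dataset(X, label = y, params = list(max_bin = 5L))
model <- lgb.train(
  params = list(objective = "regression", min_data_in_leaf = 2L, num_threads = 1L)
  , data = dtrain
  , nrounds = 5L
  , verbose = -1L
)

# Reference: leaf indices for all rows through the regular (multi-row) path.
leaf_all <- predict(model, X, type = "leaf")

# Configure the fast path for leaf indices ...
lgb.configure_fast_predict(model, type = "leaf")

# ... and pass the same type again when predicting on a single row.
x_single <- X[11L, , drop = FALSE]
leaf_single <- predict(model, x_single, type = "leaf")
```

The single-row result should match row 11 of the multi-row result, with one leaf index per tree.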

Predictions about feature contributions do not allow a fast route for CSR inputs, and as such, this function will produce an error if passing csr=TRUE and type = "contrib" together.
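The following sketch illustrates both points under the assumption that the Matrix package is available and that the single row is supplied as an RsparseMatrix: the csr = TRUE and type = "contrib" combination is attempted first and its error is captured, then the model is configured for CSR inputs and a sparse single-row prediction is compared against the dense one.

```r
library(lightgbm)
library(Matrix)

data(mtcars)
X <- as.matrix(mtcars[, -1L])
y <- mtcars[, 1L]
dtrain <- lgb.Dataset(X, label = y, params = list(max_bin = 5L))
model <- lgb.train(
  params = list(objective = "regression", min_data_in_leaf = 2L, num_threads = 1L)
  , data = dtrain
  , nrounds = 5L
  , verbose = -1L
)

# One row in dense form and the same row in CSR form.
x_dense <- X[11L, , drop = FALSE]
x_csr <- as(x_dense, "RsparseMatrix")

# Reference prediction on the dense row, before any configuration.
pred_dense <- predict(model, x_dense)

# Feature contributions have no fast route for CSR inputs: capture the error.
contrib_result <- tryCatch(
  lgb.configure_fast_predict(model, csr = TRUE, type = "contrib")
  , error = function(e) e
)

# Configure the fast path for CSR inputs with the default "response" type.
lgb.configure_fast_predict(model, csr = TRUE)
pred_csr <- predict(model, x_csr)
```

If feature contributions are needed for sparse data, one option is to leave the model unconfigured for that case and accept the regular prediction path.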

Examples

# \donttest{
# Attach the package first so that setLGBMthreads() is available
library(lightgbm)
setLGBMthreads(2L)
data.table::setDTthreads(1L)
data(mtcars)
X <- as.matrix(mtcars[, -1L])
y <- mtcars[, 1L]
dtrain <- lgb.Dataset(X, label = y, params = list(max_bin = 5L))
params <- list(
  min_data_in_leaf = 2L
  , num_threads = 2L
)
model <- lgb.train(
  params = params
 , data = dtrain
 , obj = "regression"
 , nrounds = 5L
 , verbose = -1L
)
lgb.configure_fast_predict(model)

x_single <- X[11L, , drop = FALSE]
predict(model, x_single)

# Will not use it if the prediction to be made
# is different from what was configured
predict(model, x_single, type = "leaf")
# }
