Implements the L-BFGS algorithm, heavily inspired by minFunc.
optim_lbfgs(
  params,
  lr = 1,
  max_iter = 20,
  max_eval = NULL,
  tolerance_grad = 1e-07,
  tolerance_change = 1e-09,
  history_size = 100,
  line_search_fn = NULL
)
params (iterable): iterable of parameters to optimize or dicts defining parameter groups
lr (float): learning rate (default: 1)
max_iter (int): maximum number of iterations per optimization step (default: 20)
max_eval (int): maximum number of function evaluations per optimization step (default: max_iter * 1.25)
tolerance_grad (float): termination tolerance on first-order optimality (default: 1e-7)
tolerance_change (float): termination tolerance on function value/parameter changes (default: 1e-9)
history_size (int): update history size (default: 100)
line_search_fn (str): either 'strong_wolfe' or NULL (default: NULL)
This optimizer doesn't support per-parameter options and parameter groups (there can be only one).
Right now all parameters have to be on a single device. This will be improved in the future.
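Because L-BFGS may evaluate the objective several times within one optimization step (hence `max_eval`), the step is driven by a closure that recomputes the loss and its gradients. The following is a minimal sketch of that pattern, assuming the torch package is installed. The Rosenbrock-style objective, the constants `a` and `b`, the starting point, and the number of steps are illustrative choices and not part of this documentation.

```r
library(torch)

# Illustrative objective (assumption): a Rosenbrock-style function of two variables.
a <- 1
b <- 5
rosenbrock <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  (a - x1)^2 + b * (x2 - x1^2)^2
}

# The parameter to optimize; the starting point is arbitrary.
x <- torch_tensor(c(-1, 1), requires_grad = TRUE)
initial_loss <- rosenbrock(x)$item()

# A single parameter group, here with the strong Wolfe line search enabled.
optimizer <- optim_lbfgs(list(x), line_search_fn = "strong_wolfe")

# Closure passed to step(): clears old gradients, recomputes the loss,
# backpropagates, and returns the loss tensor.
calc_loss <- function() {
  optimizer$zero_grad()
  value <- rosenbrock(x)
  value$backward()
  value
}

# Each call to step() runs up to max_iter inner L-BFGS iterations.
for (i in 1:2) {
  optimizer$step(calc_loss)
}

final_loss <- rosenbrock(x)$item()
```

Comparing `final_loss` with `initial_loss` gives a quick check that the optimizer made progress. With `line_search_fn = NULL`, the fixed `lr` is used as the step size instead of a line search.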