The original optimizer that will be used to compute and apply the gradients.
sync_period
An integer. The synchronization period of Lookahead. Enable the lookahead mechanism
by setting this to a positive value.
slow_step_size
A floating point value. The ratio for updating the slow weights.
name
Optional name for the operations created when applying gradients. Defaults to "Lookahead".
clipnorm
Clips gradients by norm.
clipvalue
Clips gradients by value.
decay
Included for backward compatibility to allow inverse time decay of the learning rate.
lr
Included for backward compatibility; use learning_rate instead.
Value
Optimizer for use with `keras::compile()`
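For illustration, a minimal usage sketch is shown below. It assumes the wrapper is exported as `optimizer_lookahead()` (the exact constructor name and defaults depend on the package providing this page) and that an inner Keras optimizer is wrapped before being passed to `keras::compile()`:

```r
library(keras)

# Assumed constructor name; adjust to the package actually in use.
# Wraps an inner ("fast") optimizer; slow weights are synchronized every 6 steps.
opt <- optimizer_lookahead(
  optimizer      = optimizer_adam(learning_rate = 1e-3),  # inner optimizer
  sync_period    = 6,    # synchronize slow and fast weights every 6 steps
  slow_step_size = 0.5   # interpolation ratio for the slow-weight update
)

model <- keras_model_sequential() %>%
  layer_dense(units = 32, activation = "relu", input_shape = c(10)) %>%
  layer_dense(units = 1)

model %>% compile(
  optimizer = opt,
  loss = "mse"
)
```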
Details
The mechanism was proposed by Michael R. Zhang et al. in the paper
[Lookahead Optimizer: k steps forward, 1 step back](https://arxiv.org/abs/1907.08610v1).
The optimizer iteratively updates two sets of weights: the search directions for the "fast weights"
are chosen by the inner optimizer, while the "slow weights" are updated every k steps based
on the directions of the fast weights, after which the two sets of weights are synchronized.
This method improves learning stability and lowers the variance of its inner optimizer.
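As a minimal sketch of this mechanism in plain R (independent of any optimizer implementation; `fast_weights`, `slow_weights`, and `inner_update()` are illustrative names only), the slow weights are pulled toward the fast weights by `slow_step_size` every `sync_period` steps, and the fast weights then restart from the synchronized values:

```r
# Illustrative names only; this is not the package's internal implementation.
sync_period    <- 6     # k: steps between synchronizations
slow_step_size <- 0.5   # ratio of the slow-weight update

slow_weights <- rnorm(5)        # initial weights
fast_weights <- slow_weights    # fast weights start at the slow weights

# Stand-in for one step of the inner (fast) optimizer.
inner_update <- function(w) w - 0.01 * rnorm(length(w))

for (step in seq_len(30)) {
  # The inner optimizer chooses the search direction at every step.
  fast_weights <- inner_update(fast_weights)

  if (step %% sync_period == 0) {
    # Every k steps, move the slow weights toward the fast weights...
    slow_weights <- slow_weights + slow_step_size * (fast_weights - slow_weights)
    # ...then synchronize: the fast weights restart from the slow weights.
    fast_weights <- slow_weights
  }
}
```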