str or tf$keras$optimizers$Optimizer that will be used to compute
and apply gradients.
sequential_update
Bool. If FALSE, the moving average is computed at the same time as the
model update, potentially allowing benign data races. If TRUE, the moving
average is updated after the gradient updates.
average_decay
float. Decay to use to maintain the moving averages of trained variables.
num_updates
Optional count of the number of updates applied to variables.
name
Optional name for the operations created when applying gradients.
Defaults to "MovingAverage".
clipnorm
Gradients will be clipped when their L2 norm exceeds this value.
clipvalue
Gradients will be clipped when their absolute value exceeds this value.
decay
Included for backward compatibility to allow time-based inverse decay of the learning rate.
lr
Included for backward compatibility; use learning_rate instead.
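A minimal construction sketch using the arguments above. The wrapper name
`optimizer_moving_average()` and the `tfaddons` package are assumptions based
on this page; adjust to the actual exported name in your installation.

```r
library(keras)
library(tfaddons)  # assumed package providing the wrapper

# Wrap a base optimizer so a moving average of the trained variables
# is maintained alongside the raw gradient updates.
opt <- optimizer_moving_average(
  optimizer = optimizer_sgd(learning_rate = 0.01),  # base optimizer
  sequential_update = TRUE,  # update the average after each gradient step
  average_decay = 0.99       # decay for the moving averages
)
```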
Value
Optimizer for use with `keras::compile()`
Details
Optimizer that computes a moving average of the variables.
Empirically, using the moving average of a deep network's trained
parameters often performs better than using the trained parameters
directly. This optimizer lets you compute that moving average and swap
in the averaged variables at save time, so that any code outside the
training loop uses the averaged values by default instead of the
original ones.
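An illustrative end-to-end sketch: compile a small model with the
moving-average optimizer from the snippet above and train it. The data and
architecture are made up for the example.

```r
library(keras)

# Dummy data for illustration only
x_train <- matrix(rnorm(800), ncol = 8)
y_train <- rnorm(100)

model <- keras_model_sequential() %>%
  layer_dense(units = 16, activation = "relu", input_shape = c(8)) %>%
  layer_dense(units = 1)

# `opt` is the moving-average optimizer constructed earlier
model %>% compile(optimizer = opt, loss = "mse")

model %>% fit(x_train, y_train, epochs = 2, verbose = 0)

# Per the Details above, the averaged variables are swapped in at save
# time, so code loading this model sees the averaged weights by default.
save_model_tf(model, "averaged_model")
```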