Adagrad optimizer, as described in Adaptive Subgradient Methods for Online Learning and Stochastic Optimization (Duchi et al., 2011). Adagrad uses parameter-specific learning rates, which are adapted relative to how frequently a parameter gets updated during training: the more updates a parameter receives, the smaller its subsequent updates.
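For intuition, below is a minimal base-R sketch of a single Adagrad step. It illustrates the update rule only and is not the library's implementation; the exact placement of epsilon may differ from the backend.

adagrad_step <- function(param, grad, accum,
                         learning_rate = 0.01, epsilon = 1e-07) {
  # Accumulate the squared gradient for each parameter.
  accum <- accum + grad^2
  # Scale the step by 1 / sqrt(accumulator): parameters that have
  # received many large gradients get proportionally smaller updates.
  param <- param - learning_rate * grad / (sqrt(accum) + epsilon)
  list(param = param, accum = accum)
}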
optimizer_adagrad(
  learning_rate = 0.01,
  epsilon = NULL,
  decay = 0,
  clipnorm = NULL,
  clipvalue = NULL,
  ...
)
learning_rate: float >= 0. Learning rate.
epsilon: float >= 0. Fuzz factor. If NULL, defaults to k_epsilon().
decay: float >= 0. Learning rate decay over each update.
clipnorm: Gradients will be clipped when their L2 norm exceeds this value.
clipvalue: Gradients will be clipped when their absolute value exceeds this value.
...: Unused, present only for backwards compatibility.
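As a small usage sketch (the values below are illustrative, not recommendations), gradient clipping is configured when the optimizer is constructed:

opt_by_norm  <- optimizer_adagrad(learning_rate = 0.01, clipnorm = 1.0)
opt_by_value <- optimizer_adagrad(learning_rate = 0.01, clipvalue = 0.5)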
Other optimizers:
optimizer_adadelta(), optimizer_adamax(), optimizer_adam(), optimizer_nadam(), optimizer_rmsprop(), optimizer_sgd()
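A minimal end-to-end sketch follows, assuming the keras R package with a configured backend; the model architecture, data shapes, and metric choice are purely illustrative.

library(keras)

# A small placeholder model.
model <- keras_model_sequential() %>%
  layer_dense(units = 32, activation = "relu", input_shape = c(10)) %>%
  layer_dense(units = 1)

# Pass the Adagrad optimizer to compile(); clipnorm or clipvalue
# could be supplied to optimizer_adagrad() here as well.
model %>% compile(
  loss = "mse",
  optimizer = optimizer_adagrad(learning_rate = 0.01),
  metrics = "mae"
)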