Adam optimizer as described in "Adam: A Method for Stochastic Optimization".
optimizer_adam(lr = 0.001, beta_1 = 0.9, beta_2 = 0.999,
epsilon = 1e-08, decay = 0, clipnorm = NULL, clipvalue = NULL)
lr: float >= 0. Learning rate.
beta_1: float, 0 < beta < 1. The exponential decay rate for the 1st moment estimates. Generally close to 1.
beta_2: float, 0 < beta < 1. The exponential decay rate for the 2nd moment estimates. Generally close to 1.
epsilon: float >= 0. Fuzz factor: a small constant added to the denominator of the update to avoid division by zero.
decay: float >= 0. Learning rate decay over each update.
clipnorm: Gradients will be clipped when their L2 norm exceeds this value.
clipvalue: Gradients will be clipped when their absolute value exceeds this value.
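To make the role of each argument concrete, here is a minimal base R sketch of a single Adam update, following the algorithm in the paper cited above. It is an illustration only, not the keras implementation: `adam_step` and its `state` list are hypothetical names introduced here, the decay schedule `lr / (1 + decay * t)` is an assumption about how `decay` is applied, and the clipping is applied to one gradient vector at a time.

```r
# Hypothetical helper, not part of keras: one Adam update in base R.
adam_step <- function(param, grad, state, lr = 0.001, beta_1 = 0.9,
                      beta_2 = 0.999, epsilon = 1e-08, decay = 0,
                      clipnorm = NULL, clipvalue = NULL) {
  # Optional gradient clipping, mirroring the clipnorm / clipvalue arguments.
  if (!is.null(clipnorm)) {
    g_norm <- sqrt(sum(grad^2))
    if (g_norm > clipnorm) grad <- grad * clipnorm / g_norm
  }
  if (!is.null(clipvalue)) {
    grad <- pmin(pmax(grad, -clipvalue), clipvalue)
  }

  # Assumed decay schedule: shrink lr by 1 / (1 + decay * completed updates).
  lr_t <- lr / (1 + decay * state$t)
  t <- state$t + 1

  # Exponential moving averages of the gradient and the squared gradient.
  m <- beta_1 * state$m + (1 - beta_1) * grad
  v <- beta_2 * state$v + (1 - beta_2) * grad^2

  # Bias correction for the zero-initialised moment estimates.
  m_hat <- m / (1 - beta_1^t)
  v_hat <- v / (1 - beta_2^t)

  list(param = param - lr_t * m_hat / (sqrt(v_hat) + epsilon),
       state = list(m = m, v = v, t = t))
}

# Minimise f(x) = x^2 (gradient 2 * x), starting from x = 1.
x <- 1
state <- list(m = 0, v = 0, t = 0)
for (i in 1:2000) {
  out <- adam_step(x, 2 * x, state, lr = 0.01)
  x <- out$param
  state <- out$state
}
```

Because the update divides the bias-corrected first moment by the square root of the second, the size of each step is roughly bounded by `lr` regardless of the gradient's scale, which is why `epsilon` only matters when the gradients are very small.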
Other optimizers: optimizer_adadelta, optimizer_adagrad, optimizer_adamax, optimizer_nadam, optimizer_rmsprop, optimizer_sgd