Rectified Adam (a.k.a. RAdam)
optimizer_radam(
  learning_rate = 0.001,
  beta_1 = 0.9,
  beta_2 = 0.999,
  epsilon = 1e-07,
  weight_decay = 0,
  amsgrad = FALSE,
  sma_threshold = 5,
  total_steps = 0,
  warmup_proportion = 0.1,
  min_lr = 0,
  name = "RectifiedAdam",
  clipnorm = NULL,
  clipvalue = NULL,
  decay = NULL,
  lr = NULL
)
learning_rate: A `Tensor`, a floating point value, or a schedule that is a `tf$keras$optimizers$schedules$LearningRateSchedule`. The learning rate.
beta_1: A float value or a constant float tensor. The exponential decay rate for the 1st moment estimates.
beta_2: A float value or a constant float tensor. The exponential decay rate for the 2nd moment estimates.
epsilon: A small constant for numerical stability.
weight_decay: A floating point value. Weight decay for each parameter.
amsgrad: Boolean. Whether to apply the AMSGrad variant of this algorithm from the paper "On the Convergence of Adam and Beyond".
sma_threshold: A float value. The threshold on the length of the approximated simple moving average (SMA) at or above which the variance rectification is applied.
total_steps: An integer. Total number of training steps. Set a positive value to enable warmup.
warmup_proportion: A floating point value. The proportion of `total_steps` over which the learning rate increases (warmup).
min_lr: A floating point value. Minimum learning rate after warmup.
name: Optional name for the operations created when applying gradients. Defaults to "RectifiedAdam".
clipnorm: Clip gradients by norm.
clipvalue: Clip gradients by value.
decay: Included for backward compatibility to allow time inverse decay of the learning rate.
lr: Included for backward compatibility; use `learning_rate` instead.
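The interplay of `total_steps`, `warmup_proportion` and `min_lr` can be sketched in base R. This is a minimal illustration, not the library's code: it assumes a linear warmup over `total_steps * warmup_proportion` steps followed by a linear decay to `min_lr`, which is how the underlying TensorFlow Addons optimizer appears to behave. The helper name `radam_lr_at_step` is hypothetical.

```r
# Hypothetical helper: learning rate in effect at a given (1-based) step.
# Assumption: linear warmup to learning_rate, then linear decay to min_lr.
radam_lr_at_step <- function(step,
                             learning_rate = 0.001,
                             total_steps = 0,
                             warmup_proportion = 0.1,
                             min_lr = 0) {
  # With total_steps = 0 (the default) no schedule is applied.
  if (total_steps <= 0) {
    return(rep(learning_rate, length(step)))
  }
  warmup_steps <- total_steps * warmup_proportion
  # Guard against a zero-length decay phase.
  decay_steps <- max(total_steps - warmup_steps, 1)
  # Per-step change during the decay phase (negative when min_lr < learning_rate).
  decay_rate <- (min_lr - learning_rate) / decay_steps
  ifelse(
    step <= warmup_steps,
    learning_rate * (step / warmup_steps),
    learning_rate + decay_rate * pmin(step - warmup_steps, decay_steps)
  )
}

# Illustrative values only: 1000 steps, 10% warmup, floor of 1e-5.
steps <- c(1, 50, 100, 550, 1000, 1500)
lrs <- radam_lr_at_step(steps, learning_rate = 0.001, total_steps = 1000,
                        warmup_proportion = 0.1, min_lr = 1e-5)
print(data.frame(step = steps, lr = lrs))
```

Under these assumptions the rate climbs to `learning_rate` by step 100, falls linearly afterwards, and stays at `min_lr` once `total_steps` is exceeded.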
Optimizer for use with `keras::compile()`
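A short usage sketch follows, assuming the `keras` and `tfaddons` R packages are installed with a working TensorFlow backend. The model architecture and all hyperparameter values are hypothetical and chosen only to show where the optimizer is passed.

```r
library(keras)
library(tfaddons)

# Hypothetical toy model: 4 input features, 1 regression output.
model <- keras_model_sequential() %>%
  layer_dense(units = 8, activation = "relu", input_shape = c(4)) %>%
  layer_dense(units = 1)

# A positive total_steps enables warmup; the values here are illustrative.
opt <- optimizer_radam(
  learning_rate = 0.001,
  total_steps = 1000,
  warmup_proportion = 0.1,
  min_lr = 1e-5
)

# Pass the optimizer object to compile() like any built-in Keras optimizer.
model %>% compile(optimizer = opt, loss = "mse")
```

Leaving `total_steps` at its default of 0 keeps the learning rate constant, so warmup only takes effect when a positive step count is supplied.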