optimizer_radam

A `Tensor`, a floating point value, or a schedule that is
a `tf$keras$optimizers$schedules$LearningRateSchedule`. The learning rate.

learning_rate

A float value or a constant float tensor. The exponential decay rate for the 1st moment estimates.

beta_1

A float value or a constant float tensor. The exponential decay rate for the 2nd moment estimates.

beta_2

A small constant for numerical stability.

epsilon

A floating point value. Weight decay for each parameter.

weight_decay

A boolean. Whether to apply the AMSGrad variant of this algorithm from the paper
"On the Convergence of Adam and Beyond".

amsgrad

A float value. The threshold for the simple moving average (SMA).

sma_threshold
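
A minimal sketch of what this threshold is compared against, assuming the SMA length follows the formula given in the RAdam paper. The helper `sma_length_sketch` and the example threshold of 5 are illustrative assumptions, not part of the package, and the optimizer's internal computation may differ in detail.

```r
# Hypothetical helper (not part of tfaddons): approximate SMA length at a
# given training step, following the formula from the RAdam paper.
sma_length_sketch <- function(step, beta_2 = 0.999) {
  sma_inf <- 2 / (1 - beta_2) - 1          # maximum SMA length
  beta_2_power <- beta_2^step              # beta_2 raised to the step count
  sma_inf - 2 * step * beta_2_power / (1 - beta_2_power)
}

steps <- 1:10
sma_t <- sma_length_sketch(steps)

# Assumed example threshold; steps below it would skip the rectified update.
example_threshold <- 5
use_rectification <- sma_t >= example_threshold
data.frame(step = steps, sma_t = sma_t, use_rectification = use_rectification)
```

Early steps have a short SMA length, so a larger threshold delays the point at which the variance rectification term is applied.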

An integer. Total number of training steps. Enable warmup by setting a positive value.

total_steps

A floating point value. The proportion of `total_steps` during which the learning rate increases (warmup).

warmup_proportion

A floating point value. Minimum learning rate after warmup.

min_lr
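
The interaction of `total_steps`, `warmup_proportion`, and `min_lr` can be sketched in base R. This assumes a linear warmup followed by a linear decay toward `min_lr`, which is one reading of the descriptions above; the helper `radam_lr_sketch` is hypothetical and is not exported by the package.

```r
# Hypothetical helper (not part of tfaddons): effective learning rate at a
# given step, assuming linear warmup and then linear decay down to min_lr.
radam_lr_sketch <- function(step, learning_rate = 0.001, total_steps = 1000,
                            warmup_proportion = 0.1, min_lr = 1e-5) {
  # Warmup is disabled unless total_steps is positive.
  if (total_steps <= 0) {
    return(rep(learning_rate, length(step)))
  }
  warmup_steps <- total_steps * warmup_proportion
  decay_steps <- max(total_steps - warmup_steps, 1)
  decay_rate <- (min_lr - learning_rate) / decay_steps
  ifelse(
    step <= warmup_steps,
    learning_rate * (step / warmup_steps),                               # ramp up
    learning_rate + decay_rate * pmin(step - warmup_steps, decay_steps)  # ramp down
  )
}

# Learning rate during warmup, at its peak, mid-decay, and after total_steps.
radam_lr_sketch(c(50, 100, 550, 1000, 2000))
```

Under these assumptions the rate peaks at `learning_rate` once warmup ends and stays at `min_lr` for any step beyond `total_steps`.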

Optional name for the operations created when applying gradients. Defaults to "RectifiedAdam".

name

clipnorm

clipvalue

Included for backward compatibility to allow time inverse decay of the learning rate.

decay
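
As a rough illustration of time inverse decay, the sketch below assumes the conventional form in which the learning rate is divided by `1 + decay * step`. The helper `inverse_time_decay_sketch` and its default values are hypothetical and shown only to make the term concrete.

```r
# Hypothetical helper (not part of tfaddons): time inverse decay of a
# learning rate, assuming the form learning_rate / (1 + decay * step).
inverse_time_decay_sketch <- function(step, learning_rate = 0.001, decay = 0.01) {
  learning_rate / (1 + decay * step)
}

# The rate is unchanged at step 0 and shrinks as the step count grows.
inverse_time_decay_sketch(c(0, 100, 1000))
```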

`lr` is included for backward compatibility; using `learning_rate` instead is recommended.

'TensorFlow SIG Addons' <https://www.tensorflow.org/addons> is a repository
of community contributions that conform to well-established API patterns,
but implement new functionality not available in core 'TensorFlow'.
'TensorFlow' natively supports a large number of operators, layers, metrics,
losses, optimizers, and more. However, in a fast-moving field like machine learning,
there are many interesting new developments that cannot be integrated into
core 'TensorFlow' (because their broad applicability is not yet clear, or
because they are mostly used by a smaller subset of the community).

optimizer_radam: Rectified Adam (a.k.a. RAdam)

Description

Usage

Arguments

Value