Proposed by G. Hinton in his course.
optim_rmsprop(
  params,
  lr = 0.01,
  alpha = 0.99,
  eps = 1e-08,
  weight_decay = 0,
  momentum = 0,
  centered = FALSE
)
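A minimal usage sketch, assuming the torch package for R is installed. The toy data (`x`, `y`) and the single weight `w` are hypothetical and serve only to show how the optimizer is constructed and stepped:

```r
library(torch)

# Hypothetical toy problem: find w so that x * w approximates y.
x <- torch_tensor(c(1, 2, 3, 4))
y <- torch_tensor(c(2, 4, 6, 8))
w <- torch_tensor(0, requires_grad = TRUE)

# Pass the parameters as a list; the remaining arguments keep their defaults.
optimizer <- optim_rmsprop(list(w), lr = 0.01, alpha = 0.99, eps = 1e-08)

for (step in 1:200) {
  optimizer$zero_grad()               # clear gradients from the previous step
  loss <- torch_mean((x * w - y)^2)   # mean squared error
  loss$backward()                     # compute d(loss)/d(w)
  optimizer$step()                    # apply the RMSprop update to w
}
```

The loop follows the usual zero_grad / backward / step pattern; the step count and learning rate here are arbitrary choices for illustration.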
params (iterable): iterable of parameters to optimize or list defining parameter groups
lr (float, optional): learning rate (default: 1e-2)
alpha (float, optional): smoothing constant (default: 0.99)
eps (float, optional): term added to the denominator to improve numerical stability (default: 1e-8)
weight_decay (float, optional): weight decay (L2 penalty) (default: 0)
momentum (float, optional): momentum factor (default: 0)
centered (bool, optional): if TRUE, compute the centered RMSProp, in which the gradient is normalized by an estimate of its variance (default: FALSE)
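To make the roles of the arguments concrete, here is a sketch of a single RMSprop update in base R. It assumes the commonly documented PyTorch formulation of the algorithm and is not the package's actual source; `rmsprop_step` and the `state` list are hypothetical names introduced for illustration:

```r
# Illustrative sketch of one RMSprop update (assumed formulation, not torch internals).
rmsprop_step <- function(param, grad, state,
                         lr = 0.01, alpha = 0.99, eps = 1e-08,
                         weight_decay = 0, momentum = 0, centered = FALSE) {
  # weight_decay adds an L2 penalty term to the gradient.
  if (weight_decay != 0) grad <- grad + weight_decay * param

  # alpha smooths the running average of squared gradients.
  state$square_avg <- alpha * state$square_avg + (1 - alpha) * grad^2

  if (centered) {
    # Centered variant: subtract the squared running mean of the gradient,
    # so the denominator estimates the gradient's standard deviation.
    state$grad_avg <- alpha * state$grad_avg + (1 - alpha) * grad
    denom <- sqrt(state$square_avg - state$grad_avg^2) + eps
  } else {
    denom <- sqrt(state$square_avg) + eps
  }

  if (momentum > 0) {
    # momentum accumulates the normalized gradient in a buffer.
    state$buf <- momentum * state$buf + grad / denom
    param <- param - lr * state$buf
  } else {
    param <- param - lr * grad / denom
  }

  list(param = param, state = state)
}

# Usage: minimize f(w) = w^2, whose gradient is 2 * w.
w <- 5
state <- list(square_avg = 0, grad_avg = 0, buf = 0)
first <- rmsprop_step(w, 2 * w, state)  # result of a single step from w = 5
for (i in 1:500) {
  out <- rmsprop_step(w, 2 * w, state)
  w <- out$param
  state <- out$state
}
```

Note that in this formulation eps is added after taking the square root, so it bounds the denominator away from zero even when the running average is exactly zero.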