
torch (version 0.3.0)

optim_asgd: Averaged Stochastic Gradient Descent optimizer

Description

Implements Averaged Stochastic Gradient Descent (ASGD), proposed by Polyak and Juditsky in "Acceleration of Stochastic Approximation by Averaging" (SIAM Journal on Control and Optimization, 1992).
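
In outline, each step shrinks the parameters by the decay term lambda, takes a gradient step whose step size eta decays polynomially at rate alpha, and, once more than t0 steps have elapsed, folds the current iterate into a running average. The sketch below shows one such update on a plain numeric vector, assuming the recurrences of the PyTorch optimizer this package mirrors; asgd_step and its variables x, ax, eta, mu, and t are illustrative names, not part of the package.

asgd_step <- function(x, ax, grad, t,
                      lr = 0.01, lambda = 1e-4, alpha = 0.75,
                      t0 = 1e6, weight_decay = 0) {
  grad <- grad + weight_decay * x              # L2 penalty folded into the gradient
  eta <- lr / (1 + lambda * lr * t)^alpha      # polynomially decaying step size
  x <- x * (1 - lambda * eta) - eta * grad     # decay toward zero, then gradient step
  mu <- 1 / max(1, t - t0)                     # averaging weight: 1 until t exceeds t0
  ax <- if (mu == 1) x else ax + mu * (x - ax) # running average of the iterates
  list(x = x, ax = ax)
}

Note that the averaged iterate ax, not the raw iterate x, is the quantity whose convergence the averaging analysis accelerates.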

Usage

optim_asgd(
  params,
  lr = 0.01,
  lambda = 1e-04,
  alpha = 0.75,
  t0 = 1e+06,
  weight_decay = 0
)

Arguments

params

(iterable): iterable of parameters to optimize or lists defining parameter groups (a parameter-group sketch follows this section)

lr

(float, optional): learning rate (default: 0.01)

lambda

(float, optional): decay term (default: 1e-4)

alpha

(float, optional): power used in the eta (step size) update (default: 0.75)

t0

(float, optional): point at which to start averaging (default: 1e6)

weight_decay

(float, optional): weight decay (L2 penalty) (default: 0)
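
As a sketch of the parameter-group form of params: each group is a list with its own params entry and may override the top-level defaults. The two-group split over a linear layer's weight and bias below is illustrative, and assumes the list-of-lists group format shared by the package's optimizers.

library(torch)

if (torch_is_installed()) {
  model <- nn_linear(4, 2)
  optimizer <- optim_asgd(
    list(
      list(params = list(model$weight), lr = 0.05), # group-specific learning rate
      list(params = list(model$bias))               # inherits the defaults below
    ),
    lr = 0.01,
    weight_decay = 1e-5
  )
}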

Examples

library(torch)

if (torch_is_installed()) {
  # Toy model, data, and loss so the example runs end to end;
  # these definitions are illustrative stand-ins.
  model <- nn_linear(10, 1)
  input <- torch_randn(16, 10)
  target <- torch_randn(16, 1)
  loss_fn <- nnf_mse_loss

  optimizer <- optim_asgd(model$parameters, lr = 0.1)

  optimizer$zero_grad()                     # clear any previously accumulated gradients
  loss_fn(model(input), target)$backward()  # compute gradients of the loss
  optimizer$step()                          # apply the ASGD update
}
