salad (version 1.2)

gradient.descent: Gradient descent

Description

A simple implementation of the gradient descent algorithm

Usage

gradient.descent(
  par,
  fn,
  ...,
  step = 0.1,
  maxit = 100,
  reltol = sqrt(.Machine$double.eps),
  trace = FALSE
)

Value

A list with components: 'par', the final value of the parameter; 'value', the value of 'fn' at 'par'; 'counts', the number of iterations performed; and 'convergence', which is '0' if the convergence criterion was met. If 'trace' is 'TRUE', an extra component 'trace' is included, which is a matrix giving the successive values of \(x_n\).

Arguments

par

Initial value

fn

A function to be minimized (or maximized if 'step' < 0)

...

Further arguments to be passed to 'fn'

step

Step size. Use a negative value to perform a gradient ascent.

maxit

Maximum number of iterations

reltol

Relative convergence tolerance

trace

If 'TRUE', keep a trace of the visited points

Details

First note that this is not an efficient optimisation method. It is included in the package as a demonstration only.

The function iterates \(x_{n+1} = x_n - \mathrm{step} \times \nabla f(x_n)\) until convergence, where \(f\) is the function 'fn'. The gradient is computed using automatic differentiation.
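
As an illustration, the first iteration can be worked by hand for the function 'f' used in the Examples, writing \(x = (a, b)\) for its two components (a notation introduced here only for this derivation), starting from \((0, 0)\) with 'step = 0.1':

\[
\begin{aligned}
f(a, b) &= (a - b)^4 + (a + 2b)^2 + a + b \\
\nabla f(a, b) &= \left( 4(a - b)^3 + 2(a + 2b) + 1,\; -4(a - b)^3 + 4(a + 2b) + 1 \right) \\
\nabla f(0, 0) &= (1, 1) \\
x_1 &= (0, 0) - 0.1 \times (1, 1) = (-0.1, -0.1) \\
f(x_1) &= 0^4 + (-0.3)^2 - 0.1 - 0.1 = -0.11
\end{aligned}
\]

The value decreases from \(f(x_0) = 0\) to \(f(x_1) = -0.11\), as expected for a descent step.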

The convergence criterion is the same as in 'optim': \( \frac{ |f(x_{n+1}) - f(x_n)| }{ |f(x_n)| + \mathrm{reltol} } < \mathrm{reltol} \).
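
The iteration and stopping rule described above can be sketched in plain R. This is only an illustrative sketch, not the package's code: 'num_grad' and 'gd_sketch' are hypothetical names, the gradient is approximated by central finite differences rather than automatic differentiation, and the non-zero 'convergence' code used when the criterion is not met is an assumption.

```r
# Central finite-difference gradient.
# ASSUMPTION: a stand-in for the automatic differentiation used by the package.
num_grad <- function(fn, x, ..., h = 1e-6) {
  vapply(seq_along(x), function(i) {
    e <- replace(numeric(length(x)), i, h)   # perturb coordinate i only
    (fn(x + e, ...) - fn(x - e, ...)) / (2 * h)
  }, numeric(1))
}

# Hypothetical re-implementation of the documented iteration and stopping rule.
gd_sketch <- function(par, fn, ..., step = 0.1, maxit = 100,
                      reltol = sqrt(.Machine$double.eps)) {
  x <- par
  fx <- fn(x, ...)
  counts <- 0L
  convergence <- 1L   # ASSUMPTION: non-zero means the criterion was not met
  for (n in seq_len(maxit)) {
    x_new <- x - step * num_grad(fn, x, ...)  # x_{n+1} = x_n - step * grad f(x_n)
    fx_new <- fn(x_new, ...)
    counts <- n
    # relative change in f, as in the criterion stated above
    converged <- abs(fx_new - fx) / (abs(fx) + reltol) < reltol
    x <- x_new
    fx <- fx_new
    if (converged) {
      convergence <- 0L
      break
    }
  }
  list(par = x, value = fx, counts = counts, convergence = convergence)
}

# Same test function as in the Examples section
f <- function(x) (x[1] - x[2])^4 + (x[1] + 2 * x[2])^2 + x[1] + x[2]
res <- gd_sketch(c(0, 0), f, step = 0.1)
```

Because the check uses the relative change in 'fn' rather than the size of the gradient, a very small 'step' can satisfy it before the iterates are near a minimum.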

Examples

# Test function of two variables
f <- function(x) (x[1] - x[2])^4 + (x[1] + 2*x[2])^2 + x[1] + x[2]

X <- seq(-1, .5, by = 0.01)
Y <- seq(-0.5, 0.5, by = 0.01)
Z <- matrix(NA_real_, nrow = length(X), ncol = length(Y))
for(i in seq_along(X)) for(j in seq_along(Y)) Z[i,j] <- f(c(X[i],Y[j]))

par(mfrow = c(2,2), mai = c(1,1,1,1)/3)
# One panel per step size; the red path shows the visited points
contour(X,Y,Z, levels = c(-0.2, 0, 0.3, 2^(0:6)), main = "step = 0.01")
gd1 <- gradient.descent(c(0,0), f, step = 0.01, trace = TRUE)
lines(t(gd1$trace), type = "o", col = "red")

contour(X,Y,Z, levels = c(-0.2, 0, 0.3, 2^(0:6)), main = "step = 0.1")
gd2 <- gradient.descent(c(0,0), f, step = 0.1, trace = TRUE)
lines(t(gd2$trace), type = "o", col = "red")

contour(X,Y,Z, levels = c(-0.2, 0, 0.3, 2^(0:6)), main = "step = 0.18")
gd3 <- gradient.descent(c(0,0), f, step = 0.18, trace = TRUE)
lines(t(gd3$trace), type = "o", col = "red")

contour(X,Y,Z, levels = c(-0.2, 0, 0.3, 2^(0:6)), main = "step = 0.2")
gd4 <- gradient.descent(c(0,0), f, step = 0.2, trace = TRUE)
lines(t(gd4$trace), type = "o", col = "red")