This function provides a visual illustration of the process of minimizing a real-valued function with the gradient descent algorithm.
grad.desc(
FUN = function(x, y) x^2 + 2 * y^2,
rg = c(-3, -3, 3, 3),
init = c(-3, 3),
gamma = 0.05,
tol = 0.001,
gr = NULL,
len = 50,
interact = FALSE,
col.contour = "red",
col.arrow = "blue",
main
)
FUN: a bivariate objective function to be minimized (the variable names do not have to be x and y); if the gradient argument gr is NULL, deriv() will be used to calculate the gradient, in which case the function body of FUN should not be wrapped in braces (e.g. the default is function(x, y) x^2 + 2 * y^2)
rg: ranges of the independent variables for plotting the contours, in the form c(x0, y0, x1, y1)
init: starting values
gamma: size of a step
tol: tolerance to stop the iterations, i.e. the minimum difference between \(F(x_i)\) and \(F(x_{i+1})\)
gr: the gradient of FUN; it should be a bivariate function that calculates the gradient (not the negative gradient!) of FUN at a point \((x, y)\) and returns it as a vector, e.g. function(x, y) c(2 * x, 4 * y) for the default FUN. If it is NULL, R will use deriv() to calculate the gradient; a sketch of supplying a custom FUN and gr follows this argument list
len: desired length of the sequences of independent variables (used to compute the z values for the contours)
interact: logical; whether to choose the starting values by clicking on the contour plot directly
col.contour, col.arrow: colors for the contour lines and the arrows, respectively (defaults are red and blue)
main: the title of the plot; if missing, it will be derived from FUN
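As a complement to the gr description above, here is a hedged sketch of calling grad.desc() with a custom objective and an explicit gradient; the particular function, starting values and step size below are illustrative only:

## illustrative only: minimize f(x, y) = x^2 + 3 * y^2 with an explicit gradient
grad.desc(
  FUN = function(x, y) x^2 + 3 * y^2,
  gr = function(x, y) c(2 * x, 6 * y),  # the gradient (2x, 6y), not its negative
  init = c(-2.5, 2.5),
  gamma = 0.05
)

When gr is supplied, deriv() is not used, so the restriction on braces around the body of FUN does not apply.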
A list containing
par: the solution for the local minimum
value: the value of the objective function corresponding to par
iter: the number of iterations; if it is equal to ani.options('nmax'), it is quite likely that the solution is not reliable because the maximum number of iterations has been reached
gradient: the gradient function of the objective function
persp: a function to make the perspective plot of the objective function; it can accept further arguments to be passed to persp() (see the examples below)
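For instance, a brief sketch of inspecting the returned list, assuming the component names listed above (the persp() arguments shown are merely illustrative):

res = grad.desc()                        # default objective x^2 + 2 * y^2
res$par                                  # location of the local minimum
res$value                                # objective value at that point
res$iter                                 # number of iterations performed
res$persp(col = "lightblue", phi = 30)   # perspective plot; arguments are passed on to persp()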
Gradient descent is an optimization algorithm. To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or the approximate gradient) of the function at the current point. If instead one takes steps proportional to the gradient, one approaches a local maximum of that function; the procedure is then known as gradient ascent.
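Concretely, each iteration applies the update \(x_{i+1} = x_i - \gamma \nabla F(x_i)\) until \(F\) barely changes. The following is a minimal standalone sketch of that rule (not the package's implementation; all names and values are illustrative):

f = function(p) p[1]^2 + 2 * p[2]^2        # objective function
g = function(p) c(2 * p[1], 4 * p[2])      # its gradient
p = c(-3, 3); gamma = 0.05                 # starting point and step size
for (i in 1:200) {
  p.new = p - gamma * g(p)                 # step against the gradient
  if (abs(f(p.new) - f(p)) < 0.001) break  # stop when F barely changes
  p = p.new
}
p  # close to the local minimum at (0, 0)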
The arrows indicate the result of the iterations and the process of minimization; they will arrive at a local minimum in the end if the maximum number of iterations ani.options('nmax') has not been reached.
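If convergence is in doubt, the iteration cap can be raised before the call; a small illustrative sketch (the value 100 is arbitrary):

library(animation)
oopt = ani.options(nmax = 100)   # allow up to 100 iterations
grad.desc()
ani.options(oopt)                # restore the previous options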
Examples at https://yihui.org/animation/example/grad-desc/