"MaxControl"
This is the structure that holds the optimization control options.
The corresponding constructors take the parameters, perform consistency
checks, and return the control structure. Alternatively, they overwrite
the supplied parameters in an existing MaxControl structure. There is also
a method to extract the control structure from estimated ‘maxim’ objects.
The default values and definition of the slots:
1e-8, stopping condition for maxNR and related optimizers. Stop if the absolute difference between successive iterations is less than tol; returns code 2.
sqrt(.Machine$double.eps), relative convergence tolerance (used by maxNR-related optimizers and optim-based optimizers). The algorithm stops if an iteration increases the value by less than a factor of reltol*(abs(val) + reltol). Returns code 2.
1e-6, stopping condition for maxNR and related optimizers. Stops if the norm of the gradient is less than gradtol; returns code 1.
1e-10, stopping/error condition for maxNR and related optimizers. If qac == "stephalving" and the quadratic approximation leads to a worse, instead of a better, value, or to NA, the step length is halved and a new attempt is made. If necessary, this procedure is repeated until step < steptol, after which code 3 is returned.
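The tol, reltol and gradtol rules described above can be sketched in plain R. This only illustrates the documented conditions and is not maxLik's internal code: the helper name check_stop, its arguments, and the order in which the checks are applied are assumptions.

```r
# Hypothetical helper, not part of maxLik.
# Returns a stopping code (1 or 2) or NA if the iterations should continue.
check_stop <- function(val_old, val_new, grad,
                       tol = 1e-8,
                       reltol = sqrt(.Machine$double.eps),
                       gradtol = 1e-6) {
  # Norm of the gradient below gradtol: code 1
  if (sqrt(sum(grad^2)) < gradtol) return(1L)
  # Absolute change between successive iterations below tol: code 2
  if (abs(val_new - val_old) < tol) return(2L)
  # Improvement smaller than reltol * (abs(val) + reltol): code 2
  if (val_new - val_old < reltol * (abs(val_old) + reltol)) return(2L)
  NA_integer_
}

check_stop(-1, -0.5, grad = c(0.3, -0.2))          # keeps iterating
check_stop(-1, -1 + 1e-10, grad = c(0.3, -0.2))    # tiny change in value
check_stop(-1, -0.5, grad = c(1e-8, 0))            # gradient close to zero
```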
1e-6, (for maxNR-related optimizers) controls whether the Hessian is treated as negative definite. If the largest eigenvalue of the Hessian is larger than -lambdatol (the Hessian is not negative definite), a suitable diagonal matrix is subtracted from the Hessian (quadratic hill-climbing) in order to enforce negative definiteness.
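A minimal base-R sketch of the lambdatol check follows. The text only says that "a suitable diagonal matrix" is subtracted; the particular shift used here (just enough to move the largest eigenvalue to about -lambdatol) and the helper name make_negative_definite are assumptions for illustration, not maxLik's actual code.

```r
# Hypothetical helper, not part of maxLik.
make_negative_definite <- function(H, lambdatol = 1e-6) {
  max_eig <- max(eigen(H, symmetric = TRUE, only.values = TRUE)$values)
  if (max_eig > -lambdatol) {
    # Subtract a multiple of the identity so that every eigenvalue
    # is shifted down by (max_eig + lambdatol)
    H <- H - (max_eig + lambdatol) * diag(nrow(H))
  }
  H
}

H <- matrix(c(-2, 0, 0, 1), 2, 2)   # indefinite: eigenvalues -2 and 1
H_fixed <- make_negative_definite(H)
eigen(H_fixed, symmetric = TRUE, only.values = TRUE)$values
```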
"stephalving", character, Quadratic Approximation Correction for maxNR-related optimizers. When the new guess is worse than the initial one, the program attempts to correct it:
"stephalving" decreases the step but keeps the direction.
"marquardt" uses the Marquardt (1963) method, decreasing the step length while also moving closer to the pure gradient direction. It may be a faster and more robust choice in areas where the quadratic approximation behaves poorly.
1e-10, QR-decomposition tolerance for Hessian inversion in maxNR-related optimizers.
0.01, a positive numeric, initial correction term for the Marquardt (1963) correction in maxNR-related optimizers.
2, how much the Marquardt (1963) correction is decreased/increased at a successful/unsuccessful step for maxNR-related optimizers.
1e12, maximum allowed correction term for maxNR-related optimizers. If exceeded, the algorithm exits with return code 3.
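The interplay of the three Marquardt settings above (initial correction 0.01, step factor 2, maximum 1e12) can be sketched as follows. The helper update_lambda is hypothetical, and treating the step factor as a multiplier/divisor is an assumption; the text only states that the correction is decreased after a successful step and increased after an unsuccessful one.

```r
# Hypothetical helper, not part of maxLik.
# Assumption: the correction term is divided by `step` after a successful
# step and multiplied by `step` after an unsuccessful one.
update_lambda <- function(lambda, success, step = 2, max_lambda = 1e12) {
  lambda <- if (success) lambda / step else lambda * step
  list(lambda = lambda,
       code = if (lambda > max_lambda) 3L else NA_integer_)
}

# Count how many consecutive unsuccessful steps it takes, starting from
# the default initial correction, before the maximum is exceeded.
lambda <- 0.01
n_fail <- 0
repeat {
  st <- update_lambda(lambda, success = FALSE)
  lambda <- st$lambda
  n_fail <- n_fail + 1
  if (!is.na(st$code)) break
}
n_fail
```

Under these assumptions, a long run of failed steps is needed before the limit triggers return code 3, which suggests the default maximum acts as a safeguard rather than a routine exit.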
1, Nelder-Mead simplex method reflection factor (see Nelder & Mead, 1965)
0.5, Nelder-Mead contraction factor
2, Nelder-Mead expansion factor
NULL or a function for the "SANN" algorithm to generate a new candidate point; if NULL, a Gaussian Markov kernel is used (see argument gr of optim).
10, starting temperature for the “SANN” cooling schedule. See optim.
10, number of function evaluations at each temperature for the “SANN” optimizer. See optim.
123, integer to seed random numbers to ensure replicability of “SANN” optimization and preserve R random numbers. Use options like SANN_randomSeed=Sys.time() or SANN_randomSeed=sample(1000,1) if you want stochastic results.
General options for stochastic gradient methods:
0.1, learning rate, numeric
NULL, batch size for Stochastic Gradient Ascent. A positive integer, or NULL for full-batch gradient ascent.
NULL, gradient clipping threshold. This is the maximum allowed squared Euclidean norm of the gradient. If the actual norm of the gradient exceeds (the square root of) this threshold, the gradient will be scaled back accordingly while preserving its direction. NULL means no clipping.
NULL, or integer. Stopping condition: if the objective function is worse than its largest value so far this many times, the algorithm stops and returns not the last parameter value but the one that gave the best results so far. This is mostly useful if the gradient is computed on training data and the objective function on validation data.
1L, integer. After how many epochs to check the patience value. 1 means to check (and hence to compute the objective function) at each epoch.
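The clipping and patience rules above can be illustrated in base R. Both helpers are hypothetical and are not maxLik's implementation. In particular, the text does not say whether "this many times" counts worse evaluations cumulatively or consecutively; the sketch assumes a cumulative count.

```r
# Hypothetical helpers, not part of maxLik.

# Scale the gradient back if its squared Euclidean norm exceeds `clip`,
# keeping its direction. NULL means no clipping.
clip_gradient <- function(grad, clip = NULL) {
  if (is.null(clip)) return(grad)
  sq_norm <- sum(grad^2)
  if (sq_norm > clip) grad <- grad * sqrt(clip / sq_norm)
  grad
}

# Patience rule on a sequence of objective values (one per check).
# Assumption: "worse" evaluations are counted cumulatively.
early_stop <- function(values, patience) {
  best <- -Inf
  best_at <- NA_integer_
  n_worse <- 0
  for (i in seq_along(values)) {
    if (values[i] > best) {
      best <- values[i]
      best_at <- i
    } else {
      n_worse <- n_worse + 1
      if (n_worse >= patience) {
        return(list(stopped_at = i, best_at = best_at))
      }
    }
  }
  list(stopped_at = length(values), best_at = best_at)
}

clip_gradient(c(3, 4), clip = 4)    # squared norm 25 is scaled back to 4
early_stop(c(1, 2, 3, 2.5, 2.8, 2.9), patience = 3)
```

The second call stops at the sixth value but reports the third as the best one, mirroring the rule that the best parameter so far is returned rather than the last.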
Options for SGA:
0, numeric momentum parameter for SGA. Must lie in the interval [0,1].
Options for Adam:
0.9, numeric in [0,1], the first moment momentum
0.999, numeric in [0,1], the second moment momentum
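To show where the two Adam momentum parameters enter, here is a sketch of the standard Adam update written for maximization (gradient ascent). It follows the textbook form of the algorithm; whether maxLik's implementation matches it in every detail is not stated here. The helper adam_step, its argument names, and the eps constant are assumptions.

```r
# Hypothetical helper, not part of maxLik: one Adam step for maximization.
adam_step <- function(theta, grad, state, iter,
                      learning_rate = 0.1,
                      momentum1 = 0.9, momentum2 = 0.999, eps = 1e-8) {
  # Exponentially weighted first and second moments of the gradient
  m <- momentum1 * state$m + (1 - momentum1) * grad
  v <- momentum2 * state$v + (1 - momentum2) * grad^2
  # Bias correction for the zero initialization
  m_hat <- m / (1 - momentum1^iter)
  v_hat <- v / (1 - momentum2^iter)
  list(theta = theta + learning_rate * m_hat / (sqrt(v_hat) + eps),
       state = list(m = m, v = v))
}

# Maximize f(x) = -(x - 3)^2, whose gradient is -2 * (x - 3)
grad_f <- function(x) -2 * (x - 3)
x <- 0
state <- list(m = 0, v = 0)
for (iter in 1:5) {
  st <- adam_step(x, grad_f(x), state, iter)
  x <- st$theta
  state <- st$state
}
x
```

In the first iteration the bias-corrected moments reduce the update to roughly learning_rate times the sign of the gradient, so the step length is governed by the learning rate and not by the gradient's magnitude.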
General options:
150, stopping condition (the default differs for different methods). Stop if more than iterlim iterations are performed. Note that ‘iteration’ may mean different things for different optimizers.
20, maximum number of matrix rows to be printed when requesting verbosity in the optimizers.
7, maximum number of columns to be printed. This also applies to vectors that are printed horizontally.
0, the level of verbosity. Larger values print more information. The result depends on the optimizer. The form print.level is also accepted by the methods for compatibility.
FALSE, whether to store and return the parameter values at each epoch. If TRUE, the stored values can be retrieved with the storedParameters method. The parameters are stored as a matrix with rows corresponding to the epochs and columns to the parameter components.
FALSE, whether to store and return the objective function values at each epoch. If TRUE, the stored values can be retrieved with the storedValues method.
maxControl(...)
creates a “MaxControl” object. The arguments must be in the form option1 = value1, option2 = value2, ... . The options should be slot names, but the method also supports selected other parameter forms for compatibility reasons, e.g. “print.level” instead of “printLevel”. If there is more than one option with a similar name, the last one overwrites the previous values. This allows the user to override default parameters in the control list. See the example in maxLik-package.
maxControl(x = "MaxControl", ...)
overwrites parameters of an existing “MaxControl” object. The ‘...’ argument must be in the form option1 = value1, option2 = value2, ... . If there is more than one option with a similar name, only the last one is taken into account. This allows the user to override default parameters in the control list. See the example in maxLik-package.
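The "last one wins" rule can be mimicked with a plain named list. The snippet below is a base-R illustration of the idea only; it does not call maxLik and does not show how maxControl resolves duplicates internally. The option names are reused from this page purely as list names.

```r
# Defaults first, then a user-supplied value for the same option
opts <- list(tol = 1e-8, iterlim = 150, tol = 1e-4)

# Keep only the last occurrence of each name
resolved <- opts[!duplicated(names(opts), fromLast = TRUE)]
str(resolved)
```

This is why a default control list can be placed first and user options appended after it: the later entry takes effect.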
maxControl(x = "maxim")
extracts the “MaxControl” structure from an estimated model
shows the parameter values
Ott Toomet
Typically, the control options are supplied in the form of a list, in which case the corresponding default values are overwritten by the user-specified ones. However, one may also create the control structure by maxControl(opt1=value1, opt2=value2, ...) and supply such a value directly to the optimizer. In this case the optimization routine takes all the values from the control object.
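The two ways of passing options can be compared side by side. This is a small sketch that assumes the maxLik package is installed; the objective function f and the chosen option values are made up for illustration.

```r
library(maxLik)

# Illustrative objective with its maximum at c(1, 2)
f <- function(x) -sum((x - c(1, 2))^2)

# 1. Options as a list: defaults are overwritten by the listed values
res_list <- maxNR(f, start = c(0, 0),
                  control = list(iterlim = 50, printLevel = 0))

# 2. Options as a MaxControl object created beforehand
co <- maxControl(iterlim = 50, printLevel = 0)
res_obj <- maxNR(f, start = c(0, 0), control = co)

# Both runs use the same settings, so the estimates should agree
all.equal(res_list$estimate, res_obj$estimate)
```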
Nelder, J. A. & Mead, R. (1965) A Simplex Method for Function Minimization. The Computer Journal 7, 308--313.
Marquardt, D. W. (1963) An Algorithm for Least-Squares Estimation of Nonlinear Parameters. Journal of the Society for Industrial and Applied Mathematics 11, 431--441.
library(maxLik)
## Create a 'maxControl' object:
maxControl(tol=1e-4, sann_tmax=7, printLevel=2)
## Maximize the negative quadratic form -t(D) %*% W %*% D with p.d. weight matrix.
## (No constraint on D is imposed in this call.)
quadForm <- function(D) {
return(-t(D) %*% W %*% D)
}
eps <- 0.1
W <- diag(3) + matrix(runif(9), 3, 3)*eps
D <- rep(1/3, 3)  # initial values
## create control object and use it for optimization
co <- maxControl(printLevel=2, qac="marquardt", marquardt_lambda0=1)
res <- maxNR(quadForm, start=D, control=co)
print(summary(res))
## Now perform the same with no trace information
co <- maxControl(co, printLevel=0)
res <- maxNR(quadForm, start=D, control=co) # no tracing information
print(summary(res)) # should be the same as above
maxControl(res) # shows the control structure