FTRL (Shalev-Shwartz 2007; Hazan 2019, Chapter 5) is the online counterpart of empirical risk minimization. It is a family of aggregation rules (including OGD) that, at each round, plays the minimizer of the empirical risk accumulated so far plus an additional regularization term. The online optimization can be performed on any bounded convex set that can be expressed with equality or inequality constraints. Note that this method is still under development and should be considered a beta version.
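In the usual formulation of the rule (a standard statement, not a quotation from the references above), the weights played at round t + 1 minimize the cumulated loss observed so far plus a regularization term, over the feasible convex set K:

\[
  w_{t+1} \in \arg\min_{w \in \mathcal{K}} \left( \sum_{s=1}^{t} \ell_s(w) + \frac{1}{\eta} R(w) \right)
\]

where \ell_s is the loss (or its linearized, gradient version) incurred at round s, R is the regularization function (argument fun_reg below), and \eta tunes the regularization strength; the exact scaling convention between \eta and R may differ in the implementation.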
FTRL(
y,
experts,
eta = NULL,
fun_reg = NULL,
fun_reg_grad = NULL,
constr_eq = NULL,
constr_eq_jac = NULL,
constr_ineq = NULL,
constr_ineq_jac = NULL,
loss.type = list(name = "square"),
loss.gradient = TRUE,
w0 = NULL,
max_iter = 50,
obj_tol = 0.01,
training = NULL,
default = FALSE,
quiet = TRUE
)
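A minimal sketch of a call following this signature, on simulated data. The two experts and the data-generating process below are made up for illustration, the package name in library() is assumed from the reference keys, and default = TRUE requests the built-in regularization and constraints so that none of the optimization arguments need to be supplied.

library(opera)  # assumed package name

set.seed(1)
n <- 100
y <- sin(seq(0, 4 * pi, length.out = n)) + rnorm(n, sd = 0.1)  # observations
experts <- cbind(
  expert1 = y + rnorm(n, sd = 0.2),  # a noisy expert
  expert2 = rep(mean(y), n)          # a constant expert
)

# default = TRUE: use the built-in fun_reg, constr_eq, constr_ineq (and their
# gradient/Jacobian); any values passed for those arguments would be ignored.
agg <- FTRL(y = y, experts = experts, default = TRUE)

# 'agg' is an object of class 'mixture' (see Value below); the usual methods
# for that class (e.g. summary, plot) should apply.
# summary(agg)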
Arguments:

y: vector. Real observations.
experts: matrix. Matrix of experts' predictions.
eta: numeric (NULL). Regularization parameter.
fun_reg: function (NULL). Regularization function to be applied during the optimization (a sketch illustrating this and the constraint arguments follows the arguments list).
fun_reg_grad: function (NULL). Gradient of the regularization function (to speed up the computations).
constr_eq: function (NULL). Equality constraints to be applied during the optimization.
constr_eq_jac: function (NULL). Jacobian of the equality constraints (to speed up the computations).
constr_ineq: function (NULL). Inequality constraints to be applied during the optimization (... > 0).
constr_ineq_jac: function (NULL). Jacobian of the inequality constraints (to speed up the computations).
loss.type: character, list or function ("square"). Either the name of the loss to be applied ('square', 'absolute', 'percentage', or 'pinball'); a list with a field name equal to the loss name and, for the pinball loss, a field tau equal to the required quantile in [0, 1]; or a custom loss given as a function of two arguments (prediction, label).
loss.gradient: boolean or function (TRUE). If TRUE, the aggregation rule is not applied directly to the loss function at hand but to a gradient version of it; the aggregation rule is then similar to a gradient descent aggregation rule. If loss.type is a function, the derivative of the loss with respect to its first argument must be provided here (it is not computed automatically).
w0: numeric (NULL). Vector of initial weights.
max_iter: integer (50). Maximum number of iterations of the optimization algorithm per round.
obj_tol: numeric (1e-2). Tolerance on the objective function between two iterations of the optimization.
training: list (NULL). List of previous parameters.
default: boolean (FALSE). Whether or not to use default parameters for fun_reg, constr_eq, constr_ineq and their gradient/Jacobian; when TRUE, any values supplied for these arguments are ignored.
quiet: boolean (TRUE). If TRUE (the default), progress bars are not displayed.
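To illustrate fun_reg, the constraint arguments and their derivatives (referenced in the descriptions above), here is a sketch of functions one might pass to restrict the weights to the probability simplex. The calling convention (each function takes the weight vector as its single argument, Jacobians are returned as matrices, equality constraints are zero at feasible points) is an assumption, the negative-entropy regularizer is only one possible choice, and eta = 1 is arbitrary.

# Negative-entropy regularization and its gradient (one possible choice)
fun_reg      <- function(w) sum(w * log(pmax(w, 1e-10)))
fun_reg_grad <- function(w) log(pmax(w, 1e-10)) + 1

# Equality constraint: weights sum to one (assumed convention: constr_eq(w) = 0)
constr_eq     <- function(w) sum(w) - 1
constr_eq_jac <- function(w) matrix(1, nrow = 1, ncol = length(w))

# Inequality constraints: weights are positive (documented convention: ... > 0)
constr_ineq     <- function(w) w
constr_ineq_jac <- function(w) diag(length(w))

# y and experts as in the earlier sketch
agg_simplex <- FTRL(
  y = y, experts = experts, eta = 1,
  fun_reg = fun_reg, fun_reg_grad = fun_reg_grad,
  constr_eq = constr_eq, constr_eq_jac = constr_eq_jac,
  constr_ineq = constr_ineq, constr_ineq_jac = constr_ineq_jac
)

This is presumably close to what default = TRUE provides out of the box, so in practice the defaults are often sufficient.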
Value: an object of class 'mixture'.