See .ExtraOpt_trainer, .ExtraOpt_estimate, and .ExtraOpt_prob for sample implementations. For plotting, check .ExtraOpt_plot for an example.

Usage

ExtraOpt(f_train = .ExtraOpt_trainer, ..., f_est = .ExtraOpt_estimate,
f_prob = .ExtraOpt_prob, preInit = NULL, Ninit = 50L, Nmax = 200,
Nimprove = 10, elites = 0.9, max_elites = 150, tested_elites = 5,
elites_converge = 10, CEmax = 200, CEiter = 20, CEelite = 0.1,
CEimprove = 3, CEexploration_cont = 2, CEexploration_disc = c(2, 5),
CEexploration_decay = 0.98, maximize = TRUE, best = NULL,
cMean = NULL, cSD = NULL, cOrdinal = NULL, cMin = NULL, cMax = NULL,
cThr = 0.001, dProb = NULL, dThr = 0.999, priorsC = NULL,
priorsD = NULL, errorCode = -9999, autoExpVar = FALSE,
autoExpFile = NULL, verbose = 1, plot = NULL, debug = FALSE)

Arguments

f_train
The training function. Arguments passed to ExtraOpt in ... are provided to f_train. Defaults to .ExtraOpt_trainer, which is a sample xgboost trainer.

...
Additional arguments passed to f_train.

f_est
The estimator function. It returns a list with Model as the model to use for f_prob, and Error as the loss of the estimator model. Defaults to .ExtraOpt_estimate, which is a sample xgboost variable estimator.

f_prob
The prediction function, which takes the model from f_est and a prior vector as inputs, and returns the predicted loss from f_est. Defaults to .ExtraOpt_prob, which is a sample xgboost estimator prediction.

preInit
Defaults to NULL.

Ninit
The number of random initializations. A larger value such as Ninit = 100 is preferable, even if it does not guarantee a best result. Defaults to 50.

Nmax
The maximum number of tries. Defaults to 200.

Nimprove
Defaults to 10.

elites
The proportion of sampled priors kept as elites. The higher the elites, the lower the risk of getting stuck at a local optimum; however, a very low elite amount would get stuck quickly at a local optimum and potentially overfit. After the initialization, a minimum of 5 sampled elites is mandatory: for instance, if Ninit = 100, then elites >= 0.05. It should not be higher than 1. If the sampling results in a decimal-valued numeric, the largest value is taken; if it results in fewer than 5, it is shrunk back to 5. Defaults to 0.90.

max_elites
The maximum number of elites to keep. Do not go over 5000, as it will severely slow down the next prior optimization. When elites have the same loss, the elite which was computed earliest takes precedence over all other identical-loss elites (even if their parameters are different). Defaults to 150.

tested_elites
The number of elites tested per batch. It can be set to 1 for small steps but fast convergence speed, supposing the initialization was good enough. Defaults to 5.

elites_converge
The number of elites used to assess convergence against cThr and dThr. The larger the elites_converge, the tighter the convergence requirements. It cannot be higher than the number of tested_elites. Defaults to 10.

CEmax
The Cross-Entropy population size per iteration. Defaults to 200.

CEiter
The number of Cross-Entropy iterations. Defaults to 20.

CEelite
The proportion of the Cross-Entropy population kept as elite. CEmax * CEelite defines the Cross-Entropy elite population, which preferably should equal 10 * the number of variables for stable parameter updates. Defaults to 0.1.

CEimprove
Defaults to 3.

CEexploration_cont
The exploration noise applied to continuous variables. Defaults to 2.

CEexploration_disc
The exploration noise applied to discrete variables. A value of 0 nullifies the effect of noise, thus forcing a full convergence mode instead of exploring the data. Defaults to c(2, 5).

CEexploration_decay
The decay of the exploration noise over batches, following exp((N-1)th batch * (1 - CEexploration_decay)). Must be between 0 (near instant decay) and 1 (no decay). Defaults to 0.98.

maximize
Whether the loss returned by f_train must be maximized (TRUE) or minimized (FALSE). Defaults to TRUE.

best
Defaults to NULL.

cMean
The means of the continuous variables passed to f_train. Defaults to NULL.

cSD
The standard deviations of the continuous variables passed to f_train. Defaults to NULL.

cOrdinal
Whether each continuous variable is ordinal (integer-valued). Defaults to NULL.

cMin
The minimum values of the continuous variables. Defaults to NULL.

cMax
The maximum values of the continuous variables. Defaults to NULL.

cThr
When the maximum standard deviation of the continuous variables falls below cThr, the continuous variables are assumed to have converged. Once converged, the algorithm has only one try to generate a higher threshold while optimizing; if it fails, convergence interrupts the optimization. This also applies to the cross-entropy internal optimization. Defaults to 0.001, which means the continuous variables are assumed converged once the maximum standard deviation is no greater than 0.001.

dProb
The prior probabilities of the discrete variables, as a list of numeric vectors in which each i-th element requires the (i-1)-th element to appear. Defaults to NULL.

dThr
When the probabilities of the discrete variables exceed dThr, the discrete variables are assumed to have converged. Once converged, the algorithm has only one try to generate a higher threshold while optimizing; if it fails, convergence interrupts the optimization. This also applies to the cross-entropy internal optimization, but as 1 - dThr. Defaults to 0.999, which means the discrete variables are assumed converged once each discrete variable reaches a probability of 0.999.

priorsC
The continuous priors to start from. When provided, cMean and cSD are mandatory to be filled. Defaults to NULL.

priorsD
The discrete priors to start from. When provided, dProb is mandatory to be filled. Defaults to NULL.

errorCode
The loss f_train should return when no features are selected for training a supervised model. The error codes are removed from the priors. Defaults to -9999.

autoExpVar
Whether to automatically export to the global environment the priorsC and priorsD matrices when ExtraOpt, f_train, f_est, or f_prob errors without a possible recovery. You would then be able to feed the priors back and re-run without having to restart the algorithm from scratch. Defaults to FALSE. The saved variable in the global environment is called "temporary_Laurae".
autoExpFile
The file in which to automatically export the priorsC and priorsD matrices when ExtraOpt, f_train, f_est, or f_prob errors without a possible recovery. You would then be able to feed the priors back and re-run without having to restart the algorithm from scratch. Defaults to NULL.

verbose
Should ExtraOpt become chatty and report a lot? A value of 0 defines silent, while 1 chats a little bit (and 2 chats a lot). 3 is so chatty it will flood severely. Defaults to 1.

plot
The plotting function. It takes as input "priors", a matrix whose first column is the Loss, followed by the continuous variables and ending with the discrete variables. Continuous variables start with "C" while discrete variables start with "D" in the column names. See .ExtraOpt_plot for an example. Defaults to NULL.

debug
Defaults to FALSE.

Value

A list with: best for the best value found, variables for the variable values (split into a continuous list and a discrete list), priors for the list of iterations and their values, elite_priors for the last elites used, new_priors for the last iterations issued from the elites, iterations for the number of iterations, and thresh_stats for the threshold statistics over batches.
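Since f_train, f_est, and f_prob are user-replaceable hooks, a minimal sketch of custom implementations may look like the following. The signatures and argument names (x_cont, x_disc, priors, losses) are assumptions for illustration only; the exact calling contract follows the sample implementations .ExtraOpt_trainer, .ExtraOpt_estimate, and .ExtraOpt_prob, so adapt accordingly.

# Hedged sketch: signatures are assumed, not the package's actual contract;
# check .ExtraOpt_trainer / .ExtraOpt_estimate / .ExtraOpt_prob for the
# real argument conventions.
my_train <- function(x_cont, x_disc, ...) {
  # Toy loss on the continuous variables; a real f_train would fit a model
  # (e.g. xgboost) on the data passed through ... and return its validation
  # loss, or the errorCode (e.g. -9999) for illegal priors.
  sum((x_cont - c(2, 4, 6))^2)
}

my_est <- function(priors, losses) {
  # Fit an estimator of the loss as a function of the priors, returning a
  # list with Model (consumed by f_prob) and Error (the estimator's own
  # loss), as described for f_est above. A linear model stands in here.
  df <- data.frame(priors)
  df$loss <- losses
  model <- lm(loss ~ ., data = df)
  list(Model = model, Error = mean(abs(residuals(model))))
}

my_prob <- function(model, prior) {
  # Predict the estimated loss of a single prior vector.
  newdata <- as.data.frame(t(prior))
  names(newdata) <- names(model$coefficients)[-1]  # align with training columns
  unname(predict(model, newdata = newdata))
}

These would then be supplied as ExtraOpt(f_train = my_train, f_est = my_est, f_prob = my_prob, ...).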
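Likewise, a custom plot hook can be derived from the priors layout documented above (Loss in the first column, then C*-named continuous columns and D*-named discrete columns). The sketch below is illustrative, not the package's .ExtraOpt_plot:

# Minimal custom plot hook, assuming only the documented column layout.
my_plot <- function(priors) {
  losses <- priors[, 1]
  plot(seq_along(losses), losses, type = "l",
       xlab = "Iteration", ylab = "Loss",
       main = "ExtraOpt: loss per iteration")
}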
Examples

## Not run: ------------------------------------
# Example of params:
# - 50 random initializations
# - 200 maximum tries
# - 3 continuous variables in [0, 10]
# --- with 2 continuous and 1 ordinal
# --- with respective means (2, 4, 6)
# --- and standard deviation (1, 2, 3)
# - and 2 discrete features
# - with respective prior probabilities {(0.8, 0.2), (0.7, 0.1, 0.2)}
# - and loss error code (illegal priors) of -9999
#
# ExtraOpt(Ninit = 50,
#          nthreads = 1,
#          eta = 0.1,
#          early_stop = 10,
#          X_train,
#          X_test,
#          Y_train,
#          Y_test,
#          Nmax = 200,
#          cMean = c(2, 4, 6),
#          cSD = c(1, 2, 3),
#          cOrdinal = c(FALSE, FALSE, TRUE),
#          cMin = c(0, 0, 0),
#          cMax = c(10, 10, 10),
#          dProb = list(v1 = c(0.8, 0.2), v2 = c(0.7, 0.1, 0.2)),
#          priorsC = NULL,
#          priorsD = NULL,
#          autoExpVar = FALSE,
#          errorCode = -9999)
## ---------------------------------------------
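The returned list can be inspected with the field names given in the Value section. The call below is illustrative and not meant to run as-is:

# res <- ExtraOpt(Ninit = 50, Nmax = 200, ...)
# res$best         # best value found
# res$variables    # variable values (continuous list and discrete list)
# head(res$priors) # iterations and their values
# res$iterations   # number of iterations
# res$thresh_stats # threshold statistics over batches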
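If a run dies with autoExpVar = TRUE, the exported priors can seed a fresh call. This sketch assumes the exported "temporary_Laurae" object exposes the priorsC and priorsD matrices; its exact structure is not documented here:

# Hypothetical recovery after a crash with autoExpVar = TRUE; the structure
# of "temporary_Laurae" is an assumption.
# ExtraOpt(Ninit = 50,
#          Nmax = 200,
#          cMean = c(2, 4, 6),
#          cSD = c(1, 2, 3),
#          dProb = list(v1 = c(0.8, 0.2), v2 = c(0.7, 0.1, 0.2)),
#          priorsC = temporary_Laurae$priorsC,
#          priorsD = temporary_Laurae$priorsD,
#          errorCode = -9999)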