A subset search algorithm inspired by biological evolution theory and natural selection.
ga_pls(y, X, GA.threshold = 10, iters = 5, popSize = 100)
Returns a vector of variable numbers corresponding to the model having lowest prediction error.
y: vector of response values (numeric or factor).
X: numeric predictor matrix.
GA.threshold: the chance of a zero for mutations and initialization (default = 10). (The ratio of non-selected to selected variables in each chromosome.)
iters: the number of iterations (default = 5).
popSize: the population size (default = 100).
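As a rough illustration of the threshold argument, the sketch below assumes it acts as a zero-to-one ratio when a chromosome is initialized, so a value of 10 gives about one selected variable for every ten non-selected ones. This reading is an assumption made for illustration, not a statement about the package internals; `ratio` and `n_vars` are hypothetical names.

```r
set.seed(42)

ratio  <- 10    # hypothetical stand-in for GA.threshold
n_vars <- 400   # hypothetical number of candidate variables

# Assumed initialization: each bit is drawn from a pool holding
# `ratio` zeros and a single one, so P(bit == 1) = 1 / (ratio + 1).
chromosome <- sample(c(rep(0, ratio), 1), n_vars, replace = TRUE)

expected_share <- 1 / (ratio + 1)
observed_share <- mean(chromosome)

cat("expected share of selected variables:", round(expected_share, 3), "\n")
cat("observed share of selected variables:", round(observed_share, 3), "\n")
```

Under this reading, raising the threshold yields sparser initial variable sets.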
Tahir Mehmood, Kristian Hovde Liland, Solve Sæbø.
1. Building an initial population of variable sets by setting bits for each variable randomly, where bit '1' represents selection of the corresponding variable while '0' represents non-selection. The approximate size of the variable sets must be set in advance.
2. Fitting a PLSR model to each variable set and computing its performance by, for instance, a leave-one-out cross-validation procedure.
3. A collection of variable sets with higher performance is selected to survive until the next "generation".
4. Crossover and mutation: new variable sets are formed 1) by crossover of selected variables between the surviving variable sets, and 2) by changing (mutating) the bit value of each variable with a small probability.
5. The surviving and modified variable sets form the population serving as input to point 2.
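The five steps above can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the data are simulated, the fitness function uses the leave-one-out error of an ordinary least squares fit as a stand-in for PLSR, and all names (`loo_rmse`, `p_one`, `p_mut`, `pop_size`, `n_gen`) are hypothetical.

```r
set.seed(1)

# Simulated data (assumption): 40 samples, 12 predictors,
# only variables 2 and 7 carry signal.
n <- 40; p <- 12
X <- matrix(rnorm(n * p), n, p)
y <- 2 * X[, 2] - 3 * X[, 7] + rnorm(n, sd = 0.5)

# Fitness (step 2): leave-one-out RMSE of an OLS fit, used here as a
# stand-in for a cross-validated PLSR model. Lower is better.
loo_rmse <- function(bits) {
  if (!any(bits)) return(Inf)                  # an empty variable set is invalid
  fit <- lm.fit(cbind(1, X[, bits, drop = FALSE]), y)
  h <- rowSums(qr.Q(fit$qr)^2)                 # leverages (hat-matrix diagonal)
  sqrt(mean((fit$residuals / (1 - h))^2))      # exact LOO residuals for OLS
}

pop_size <- 30; n_gen <- 20
p_one <- 0.25   # initial probability that a bit is set
p_mut <- 0.02   # per-bit mutation probability

# Step 1: random initial population, one row per variable set.
pop <- matrix(runif(pop_size * p) < p_one, pop_size, p)

for (gen in seq_len(n_gen)) {
  fitness <- apply(pop, 1, loo_rmse)           # step 2
  # Step 3: the better half survives.
  survivors <- pop[order(fitness)[1:(pop_size / 2)], , drop = FALSE]
  # Step 4: one-point crossover between two survivors, then bit-flip mutation.
  children <- t(replicate(pop_size - nrow(survivors), {
    parents <- survivors[sample(nrow(survivors), 2), , drop = FALSE]
    cut <- sample(p - 1, 1)
    child <- c(parents[1, 1:cut], parents[2, (cut + 1):p])
    xor(child, runif(p) < p_mut)
  }))
  pop <- rbind(survivors, children)            # step 5
}

fitness <- apply(pop, 1, loo_rmse)
best <- which(pop[which.min(fitness), ])       # indices of the selected variables
print(best)
```

Keeping the survivors unchanged in the next population means the best error found so far never gets worse from one generation to the next.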
K. Hasegawa, Y. Miyashita, K. Funatsu, GA strategy for variable selection in QSAR studies: GA-based PLS analysis of calcium channel antagonists, Journal of Chemical Information and Computer Sciences 37 (1997) 306-310.
VIP (SR/sMC/LW/RC), filterPLSR, shaving, stpls, truncation, bve_pls, ga_pls, ipw_pls, mcuve_pls, rep_pls, spa_pls, lda_from_pls, lda_from_pls_cv, setDA.
data(gasoline, package = "pls")
# with( gasoline, ga_pls(octane, NIR, GA.threshold = 10) ) # Time-consuming