Usage
symbolicRegression(formula, data, stopCondition = makeTimeStopCondition(5),
population = NULL, populationSize = 100, eliteSize = ceiling(0.1 *
populationSize), elite = list(), extinctionPrevention = FALSE,
archive = FALSE, individualSizeLimit = 64,
penalizeGenotypeConstantIndividuals = FALSE, subSamplingShare = 1,
functionSet = mathFunctionSet, constantSet = numericConstantSet,
crossoverFunction = NULL, mutationFunction = NULL,
restartCondition = makeEmptyRestartCondition(),
restartStrategy = makeLocalRestartStrategy(),
searchHeuristic = makeAgeFitnessComplexityParetoGpSearchHeuristic(),
breedingFitness = function(individual) TRUE, breedingTries = 50,
errorMeasure = rmse, progressMonitor = NULL, envir = parent.frame(),
verbose = TRUE)
Arguments
formula
A formula describing the regression task. Only simple formulas of the form
response ~ variable1 + ... + variableN are supported at this point in time.
data
A data.frame containing training data for the symbolic regression run.
The variables in formula must match column names in this data frame.
population
The GP population to start the run with. If this parameter
is missing, a new GP population of size populationSize is created
through random growth.
populationSize
The number of individuals if a population is to be
created.
eliteSize
The number of elite individuals to keep. Defaults to
ceiling(0.1 * populationSize).
elite
The elite list. Must be a list of individuals sorted in ascending
order by their first fitness component.
extinctionPrevention
When set to TRUE, the initialization and
selection steps will try to prevent duplicate individuals
from occurring in the population. Defaults to FALSE, as this
operation might be expensive with larger population sizes.
archive
If set to TRUE, all GP individuals evaluated are stored in an
archive list archiveList that is returned as part of the result of
this function.
individualSizeLimit
Individuals with a number of tree nodes that
exceeds this size limit will get a fitness of Inf.
penalizeGenotypeConstantIndividuals
If set to TRUE, individuals that do not contain
any input variables will get a fitness of Inf.
subSamplingShare
The share of fitness cases $$s$$ sampled for
evaluation with each function evaluation. $$0 < s \leq 1$$ must
hold. Defaults to 1.0.
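As a rough illustration of what a sub-sampling share means, the base R sketch below draws a random subset of fitness cases (rows of the training data) with share s and computes an RMSE on that subset only. It does not reproduce the package's internal sampling; the helper subSampleRows, the toy data, and the stand-in candidate model are all hypothetical.

```r
# Hypothetical helper: pick ceiling(s * n) distinct row indices at random.
# Enforces the documented constraint 0 < s <= 1.
subSampleRows <- function(n, s) {
  stopifnot(s > 0, s <= 1)
  sample.int(n, size = ceiling(s * n))
}

set.seed(1)
trainingData <- data.frame(x = seq(0, 1, length.out = 20))
trainingData$y <- 2 * trainingData$x + 1

# Evaluate a candidate on half of the fitness cases only.
rows <- subSampleRows(nrow(trainingData), s = 0.5)
candidate <- function(x) 2 * x + 1  # stand-in for a GP individual
residuals <- candidate(trainingData$x[rows]) - trainingData$y[rows]
subsetError <- sqrt(mean(residuals^2))
```

With s = 1 every fitness case is used, which corresponds to the documented default.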
functionSet
The function set.
constantSet
The set of constant factory functions.
crossoverFunction
The crossover function.
mutationFunction
The mutation function.
searchHeuristic
The search heuristic (i.e. optimization algorithm) to use
in the search of solutions. See the documentation for searchHeuristics
for available algorithms.
breedingFitness
A "breeding" function. This function is applied after
every stochastic operation Op that creates or modifies an individual
(typically, Op is an initialization, mutation, or crossover operation). If
the breeding function returns TRUE<
breedingTries
In case of a boolean breedingFitness function, the
maximum number of retries. In case of a numerical breedingFitness function,
the number of breeding steps. Also see the documentation for the
breedingFitness parameter.
errorMeasure
A function to use as an error measure, defaults to RMSE.
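A custom error measure might look like the sketch below. It assumes, without confirmation from this page, that an error measure is called with two numeric vectors of equal length (for example predicted and observed responses) and returns a single number. The names meanAbsoluteError and rootMeanSquaredError are hypothetical and defined here only for illustration.

```r
# Hypothetical error measure: mean absolute error of two numeric vectors.
meanAbsoluteError <- function(x, y) mean(abs(x - y))

# RMSE written out in the same shape, for comparison.
rootMeanSquaredError <- function(x, y) sqrt(mean((x - y)^2))

predicted <- c(1, 2, 3, 4)
observed <- c(1, 2, 3, 8)

maeValue <- meanAbsoluteError(predicted, observed)
rmseValue <- rootMeanSquaredError(predicted, observed)
```

Under that assumption, such a function could be passed as errorMeasure = meanAbsoluteError. RMSE weights the single large deviation more heavily than the mean absolute error does.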
progressMonitor
A function of signature
function(population, fitnessValues, fitnessFunction, stepNumber, evaluationNumber,
bestFitness, timeElapsed)
to be called with each evolution step.
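A progress monitor matching the signature above could be sketched as follows. The factory logEveryNSteps and its message format are hypothetical; only the seven argument names come from this page. The types of the arguments are not specified here, so the sketch formats them generically.

```r
# Hypothetical factory: returns a progress monitor that reports every n steps.
logEveryNSteps <- function(n = 100) {
  function(population, fitnessValues, fitnessFunction, stepNumber,
           evaluationNumber, bestFitness, timeElapsed) {
    if (stepNumber %% n == 0) {
      message("step ", format(stepNumber),
              ", evaluations ", format(evaluationNumber),
              ", best fitness ", paste(format(bestFitness), collapse = ", "),
              ", elapsed ", format(timeElapsed))
    }
    invisible(NULL)
  }
}

monitor <- logEveryNSteps(50)
```

The resulting function can then be supplied as progressMonitor = monitor.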
envir
The R environment to evaluate individuals in, defaults to
parent.frame().
verbose
Whether to print progress messages.
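A minimal call, using only arguments listed in the usage above, might look like this sketch. The toy data set and the 10-second time budget are made up for illustration, and the call is guarded so the snippet still runs when the rgp package is not installed. The structure of the returned value is not described in this section, so the sketch only inspects it with str().

```r
set.seed(42)

# Hypothetical training data: y = x^2 + x on 50 points in [-1, 1].
trainingData <- data.frame(x = seq(-1, 1, length.out = 50))
trainingData$y <- trainingData$x^2 + trainingData$x

if (requireNamespace("rgp", quietly = TRUE)) {
  # Run symbolic regression for roughly 10 seconds with default settings.
  result <- rgp::symbolicRegression(y ~ x, data = trainingData,
                                    stopCondition = rgp::makeTimeStopCondition(10),
                                    populationSize = 100,
                                    verbose = FALSE)
  # Inspect the top-level elements of the result.
  str(result, max.level = 1)
}
```

The column names x and y match the variables in the formula, as the data argument requires.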