rgp (version 0.4-1)

multiNicheSymbolicRegression: Symbolic regression via multi-niche standard genetic programming

Description

Perform symbolic regression via untyped multi-niche genetic programming. The regression task is specified as a formula. Only simple formulas without interactions are supported. The result of the symbolic regression run is a symbolic regression model containing an untyped GP population of model functions.
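The restriction to simple formulas can be checked before starting a run. The following is a minimal base R sketch, not part of rgp: the helper name isSimpleRegressionFormula and the example data frame trainingData are hypothetical, and the check only approximates the rule stated above (a response on the left-hand side, and plain column names without interactions or transformations on the right-hand side).

```r
# Hypothetical helper, not part of rgp: tests whether a formula has the
# shape response ~ variable1 + ... + variableN for a given data frame.
isSimpleRegressionFormula <- function(formula, data) {
  tt <- terms(formula)
  termLabels <- attr(tt, "term.labels")
  hasResponse <- attr(tt, "response") == 1
  # Interaction terms such as x1:x2 have an order greater than 1.
  noInteractions <- all(attr(tt, "order") == 1)
  # Every right-hand-side term must be a plain column name, which
  # excludes transformed terms such as sin(x1).
  plainVariables <- all(termLabels %in% names(data))
  # Every variable in the formula, including the response, must be a column.
  allColumnsPresent <- all(all.vars(formula) %in% names(data))
  hasResponse && length(termLabels) > 0 && noInteractions &&
    plainVariables && allColumnsPresent
}

# Illustrative training data (assumed for this example only).
trainingData <- data.frame(x1 = seq(0, 1, length.out = 20))
trainingData$x2 <- trainingData$x1^2
trainingData$y <- sin(trainingData$x1) + trainingData$x2

isSimpleRegressionFormula(y ~ x1 + x2, trainingData)  # simple formula
isSimpleRegressionFormula(y ~ x1 * x2, trainingData)  # contains an interaction
```

The first call passes the check and the second does not, because x1 * x2 expands to include the interaction term x1:x2.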

Usage

multiNicheSymbolicRegression(formula, data,
    stopCondition = makeTimeStopCondition(25),
    passStopCondition = makeTimeStopCondition(5),
    numberOfNiches = 2,
    clusterFunction = groupListConsecutive,
    joinFunction = function(niches) Reduce(c, niches),
    population = NULL,
    populationSize = 100,
    eliteSize = ceiling(0.1 * populationSize),
    elite = list(),
    individualSizeLimit = 64,
    penalizeGenotypeConstantIndividuals = FALSE,
    functionSet = mathFunctionSet,
    constantSet = numericConstantSet,
    selectionFunction = makeTournamentSelection(),
    crossoverFunction = crossover,
    mutationFunction = NULL,
    restartCondition = makeEmptyRestartCondition(),
    restartStrategy = makeLocalRestartStrategy(),
    progressMonitor = NULL,
    verbose = TRUE,
    clusterApply = sfClusterApplyLB,
    clusterExport = sfExport)

Arguments

formula
A formula describing the regression task. Currently, only simple formulas of the form response ~ variable1 + ... + variableN are supported.
data
A data.frame containing training data for the symbolic regression run. The variables in formula must match column names in this data frame.
stopCondition
The stop condition for the evolution main loop. See makeStepsStopCondition for details.
passStopCondition
The stop condition for each parallel pass. See makeStepsStopCondition for details.
numberOfNiches
The number of niches to cluster the population into.
clusterFunction
The function used to cluster the population into niches. The first parameter of this function is a GP population, the second parameter is an integer giving the number of niches. Defaults to groupListConsecutive.
joinFunction
The function used to join all niches into a population again after a round of parallel passes. Defaults to a function that simply concatenates all niches.
population
The GP population to start the run with. If this parameter is missing, a new GP population of size populationSize is created through random growth.
populationSize
The number of individuals if a population is to be created.
eliteSize
The number of "elite" individuals to keep. Defaults to ceiling(0.1 * populationSize).
elite
The elite list. Must be a list of individuals sorted in ascending order by their first fitness component.
individualSizeLimit
Individuals with a number of tree nodes that exceeds this size limit will get a fitness of Inf.
penalizeGenotypeConstantIndividuals
If TRUE, individuals that do not contain any input variables will get a fitness of Inf. Defaults to FALSE.
functionSet
The function set.
constantSet
The set of constant factory functions.
selectionFunction
The selection function to use. Defaults to tournament selection. See makeTournamentSelection for details.
crossoverFunction
The crossover function.
mutationFunction
The mutation function.
restartCondition
The restart condition for the evolution main loop. See makeFitnessStagnationRestartCondition for details.
restartStrategy
The strategy for doing restarts. See makeLocalRestartStrategy for details.
progressMonitor
A function of signature function(population, objectiveVectors, fitnessFunction, stepNumber, evaluationNumber, bestFitness, timeElapsed, ...) to be called with each evolution step. Search heuristics may pass additional information via the ... parameter.
verbose
Whether to print progress messages.
clusterApply
The cluster apply function that is used to distribute the parallel passes to CPUs in a compute cluster.
clusterExport
A function that is used to export R variables to the nodes of a CPU cluster. Defaults to snowfall's sfExport.
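
The function-valued arguments clusterFunction, joinFunction and progressMonitor can be replaced by user-supplied functions that follow the signatures described above. The following base R sketch illustrates the expected shapes only. The names groupConsecutively and makeLoggingProgressMonitor are hypothetical, the grouping function is a stand-in rather than rgp's groupListConsecutive, and the population is a plain list of ordinary R functions used as placeholders for GP individuals.

```r
# Hypothetical clusterFunction: splits a population (a list) into
# numberOfNiches consecutive groups of nearly equal size.
groupConsecutively <- function(population, numberOfNiches) {
  nicheIndex <- ceiling(seq_along(population) * numberOfNiches /
                          length(population))
  unname(split(population, nicheIndex))
}

# The default joinFunction from the Usage section: concatenate all niches.
joinNiches <- function(niches) Reduce(c, niches)

# Hypothetical progressMonitor factory: the returned function has the
# documented signature and logs every 'every'-th evolution step.
makeLoggingProgressMonitor <- function(every = 10) {
  function(population, objectiveVectors, fitnessFunction, stepNumber,
           evaluationNumber, bestFitness, timeElapsed, ...) {
    if (stepNumber %% every == 0) {
      message(sprintf("step %d: best fitness %.4f after %.1f s",
                      stepNumber, bestFitness, timeElapsed))
    }
    invisible(NULL)
  }
}

# Placeholder population of five ordinary functions (assumed for illustration).
population <- list(function(x) x, function(x) x + 1, function(x) x + 2,
                   function(x) x * 2, function(x) x^2)

niches <- groupConsecutively(population, 2)  # two consecutive groups
rejoined <- joinNiches(niches)               # one list of five functions again

# Calling the monitor by hand with made-up values to show its behaviour.
monitor <- makeLoggingProgressMonitor(every = 10)
monitor(population, NULL, NULL, stepNumber = 10, evaluationNumber = 500,
        bestFitness = 0.1234, timeElapsed = 2.5)
```

Under these assumptions, such functions would be passed as clusterFunction = groupConsecutively and progressMonitor = makeLoggingProgressMonitor(). This page does not state how the monitor's return value is used, so the sketch returns NULL invisibly.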

Value

A symbolic regression model that contains an untyped GP population.

See Also

predict.symbolicRegressionModel, geneticProgramming