Parametrize and/or tune the options of biomod2's single models.
BIOMOD_ModelingOptions(
GLM = NULL,
GBM = NULL,
GAM = NULL,
CTA = NULL,
ANN = NULL,
SRE = NULL,
FDA = NULL,
MARS = NULL,
RF = NULL,
MAXENT = NULL,
XGBOOST = NULL
)
bm_DefaultModelingOptions()
A BIOMOD.models.options object that can be used to build species distribution
model(s) with the BIOMOD_Modeling function.
GLM : (optional, default NULL) a list containing GLM options
GBM : (optional, default NULL) a list containing GBM options
GAM : (optional, default NULL) a list containing GAM options
CTA : (optional, default NULL) a list containing CTA options
ANN : (optional, default NULL) a list containing ANN options
SRE : (optional, default NULL) a list containing SRE options
FDA : (optional, default NULL) a list containing FDA options
MARS : (optional, default NULL) a list containing MARS options
RF : (optional, default NULL) a list containing RF options
MAXENT : (optional, default NULL) a list containing MAXENT options
XGBOOST : (optional, default NULL) a list containing XGBOOST options
GLM (glm)
myFormula : a typical formula object (see Examples). If not NULL, the type
and interaction.level parameters are ignored.
You can either :
generate the GLM formula automatically with the following parameters :
type = 'quadratic' : the type of formula given to the model, must
be simple, quadratic or polynomial
interaction.level = 0 : an integer corresponding
to the interaction level between the considered variables (be aware that
interactions quickly increase the number of effective variables used in the GLM !)
or provide a specific formula through myFormula
test = 'AIC' : the information criterion used for the stepwise
selection procedure, must be AIC (Akaike Information Criterion), BIC
(Bayesian Information Criterion) or none (only the full model is considered,
with no stepwise selection, which can lead to convergence issues and strange results !)
family = binomial(link = 'logit') : the error distribution and link
function to be used in the model, must be a character giving a family
name, a family function or the result of a call to a family function (see family)
(so far, biomod2 only runs on presence-absence data, so the binomial family is the
default !)
control : a list of parameters to control the fitting process (passed to
glm.control)
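The two ways of specifying a GLM described above can be sketched with plain lists. This is only an illustration: the response name GuloGulo and the variables bio_3 and bio_12 are borrowed from the Examples and stand in for your own data, and the control values are arbitrary.

```r
# Route 1: let biomod2 build the formula from type and interaction.level
glm_auto <- list(type = 'quadratic',
                 interaction.level = 0,
                 myFormula = NULL,
                 test = 'AIC',
                 family = binomial(link = 'logit'),
                 control = glm.control(epsilon = 1e-08, maxit = 50, trace = FALSE))

# Route 2: supply a formula; type and interaction.level are then ignored
# (variable names are placeholders taken from the Examples)
glm_custom <- list(myFormula = as.formula('GuloGulo ~ bio_3 + I(bio_3^2) + bio_12'),
                   test = 'none')

# Inspect what would be passed on
str(glm_auto$control)
print(glm_custom$myFormula)
```

Either list can then be given as the GLM argument of BIOMOD_ModelingOptions.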
GBM (default gbm)
Please refer to gbm help file for more details.
distribution = 'bernoulli'
n.trees = 2500
interaction.depth = 7
n.minobsinnode = 5
shrinkage = 0.001
bag.fraction = 0.5
train.fraction = 1
cv.folds = 3
keep.data = FALSE
verbose = FALSE
perf.method = 'cv'
n.cores = 1
GAM (gam::gam or mgcv::gam)
algo = 'GAM_gam' : a character defining the chosen GAM function, must
be GAM_gam (see gam::gam), GAM_mgcv (see mgcv::gam)
or BAM_mgcv (see mgcv::bam)
myFormula : a typical formula object (see Examples). If not NULL, the type
and interaction.level parameters are ignored.
You can either :
generate the GAM formula automatically with the following parameters :
type = 's_smoother' : the smoother used to generate the formula
interaction.level = 0 : an integer corresponding
to the interaction level between the considered variables (be aware that
interactions quickly increase the number of effective variables used in the GAM !)
or provide a specific formula through myFormula
k = -1 : a parameter of the smooth terms in the formula argument to gam, must be -1 or
4 (see gam::s or mgcv::s)
family = binomial(link = 'logit') : the error distribution and link
function to be used in the model, must be a character giving a family name, a
family function or the result of a call to a family function (see family)
(so far, biomod2 only runs on presence-absence data, so the binomial family is the
default !)
control : a list of parameters to control the fitting process (passed to
gam::gam.control or mgcv::gam.control)
some options specific to GAM_mgcv (ignored if algo = 'GAM_gam')
method = 'GCV.Cp'
optimizer = c('outer','newton')
select = FALSE
knots = NULL
paramPen = NULL
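To make the smoother-based formula more concrete, here is a rough sketch of a GAM_mgcv option list together with the kind of formula that type = 's_smoother' and k = 4 suggest. The formula is an assumption about the general shape only, the formula biomod2 actually generates may differ, and the variable names are placeholders from the Examples.

```r
# Options for the mgcv implementation; the last five entries only apply to GAM_mgcv
gam_mgcv <- list(algo = 'GAM_mgcv',
                 type = 's_smoother',
                 k = 4,
                 interaction.level = 0,
                 myFormula = NULL,
                 family = binomial(link = 'logit'),
                 method = 'GCV.Cp',
                 optimizer = c('outer', 'newton'),
                 select = FALSE,
                 knots = NULL,
                 paramPen = NULL)

# Illustrative formula with one smooth term per variable (assumed shape)
expl_vars <- c('bio_3', 'bio_12')
smooth_terms <- paste0('s(', expl_vars, ', k = ', gam_mgcv$k, ')')
gam_formula <- reformulate(smooth_terms, response = 'GuloGulo')
print(gam_formula)
```

A formula of this kind could also be supplied directly through myFormula, in which case type and interaction.level are ignored.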
CTA (rpart)
Please refer to rpart help file for more details.
method = 'class'
parms = 'default' : if 'default', the default rpart
parms values are kept
cost = NULL
control : see rpart.control
ANN (nnet)
NbCV = 5 : an integer corresponding to the number of cross-validation
repetitions used to find the best size and decay parameters
size = NULL : an integer corresponding to the number of units in the
hidden layer. If NULL, the size parameter is optimized by cross-validation based
on model AUC (NbCV cross-validations ; the tested size values are :
c(2, 4, 6, 8)). It is also possible to give a vector of size values to be tested,
and the one giving the best model AUC will be kept.
decay = NULL : a numeric corresponding to the weight decay. If NULL,
the decay parameter is optimized by cross-validation based on model AUC (NbCV
cross-validations ; the tested decay values are : c(0.001, 0.01, 0.05, 0.1)).
It is also possible to give a vector of decay values to be tested, and the one giving
the best model AUC will be kept.
rang = 0.1 : a numeric corresponding to the initial random weights on
[-rang, rang]
maxit = 200 : an integer corresponding to the maximum number of
iterations
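The cost of leaving size and decay at NULL can be estimated with a short calculation. The sketch below assumes that every size value is paired with every decay value and that each pair is fitted NbCV times; that pairing is an assumption about the search, not something stated above.

```r
# Default candidate values when size and decay are NULL
size_grid  <- c(2, 4, 6, 8)
decay_grid <- c(0.001, 0.01, 0.05, 0.1)

# Assumed search space: all size x decay combinations
candidates <- expand.grid(size = size_grid, decay = decay_grid)
NbCV <- 5
n_fits <- nrow(candidates) * NbCV

# Restricting the candidates reduces the number of networks to fit
ann_opts <- list(NbCV = NbCV,
                 size = c(4, 6),
                 decay = 0.01,
                 rang = 0.1,
                 maxit = 200)
n_fits_restricted <- length(ann_opts$size) * length(ann_opts$decay) * ann_opts$NbCV

cat('fits with defaults:', n_fits, '\n')
cat('fits with restricted candidates:', n_fits_restricted, '\n')
```

The ann_opts list can be given as the ANN argument of BIOMOD_ModelingOptions.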
SRE (bm_SRE)
quant = 0.025 : a numeric corresponding to the quantile of
extreme environmental values removed to select the species envelope
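The role of quant can be illustrated on a single variable. The sketch below assumes that the envelope is bounded by the quant and 1 - quant quantiles of the values observed at presence points; the data are simulated and the names are hypothetical, so this is not the bm_SRE implementation itself.

```r
set.seed(42)

# Hypothetical temperatures recorded at 200 presence points
temp_at_presences <- rnorm(200, mean = 10, sd = 3)

# Drop the most extreme 2.5% of values on each side
quant <- 0.025
envelope <- quantile(temp_at_presences, probs = c(quant, 1 - quant))

# A new site is predicted suitable only if it falls inside the envelope
new_sites <- c(1, 9, 11, 20)
inside <- new_sites >= envelope[1] & new_sites <= envelope[2]

print(envelope)
print(data.frame(temperature = new_sites, inside = inside))
```

A larger quant narrows the envelope, so fewer sites are predicted as suitable.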
FDA (fda)
Please refer to fda help file for more details.
method = 'mars'
add_args = NULL : a list of additional parameters for method,
passed through the ... argument of the fda function
MARS (earth)
Please refer to earth help file for more details.
myFormula : a typical formula object (see Examples). If not NULL, the type
and interaction.level parameters are ignored.
You can either :
generate the MARS formula automatically with the following parameters :
type = 'simple' : the type of formula given to the model, must
be simple, quadratic or polynomial
interaction.level = 0 : an integer corresponding
to the interaction level between the considered variables (be aware that
interactions quickly increase the number of effective variables used in the MARS !)
or provide a specific formula through myFormula
nk = NULL : an integer corresponding to the maximum number of model
terms.
If NULL, the default value of the MARS function is used : max(21, 2 * nb_expl_var + 1)
penalty = 2
thresh = 0.001
nprune = NULL
pmethod = 'backward'
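The default nk rule quoted above is easy to evaluate for a given number of explanatory variables. The helper name default_nk below is hypothetical and only restates that formula.

```r
# Hypothetical helper restating the rule max(21, 2 * nb_expl_var + 1)
default_nk <- function(nb_expl_var) {
  max(21, 2 * nb_expl_var + 1)
}

nk_5_vars  <- default_nk(5)   # 2 * 5 + 1 = 11, so the floor of 21 applies
nk_15_vars <- default_nk(15)  # 2 * 15 + 1 = 31, which exceeds 21

# An explicit nk can also be set in the MARS option list
mars_opts <- list(type = 'simple',
                  interaction.level = 0,
                  myFormula = NULL,
                  nk = nk_15_vars,
                  penalty = 2,
                  thresh = 0.001,
                  nprune = NULL,
                  pmethod = 'backward')

cat('nk for 5 variables:', nk_5_vars, '\n')
cat('nk for 15 variables:', nk_15_vars, '\n')
```

With the five bioclimatic variables used in the Examples, the rule therefore gives 21 terms.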
RF (randomForest)
do.classif = TRUE : if TRUE, a randomForest classification is
computed, otherwise a randomForest regression is done
ntree = 500
mtry = 'default'
sampsize = NULL
nodesize = 5
maxnodes = NULL
MAXENT (https://biodiversityinformatics.amnh.org/open_source/maxent/)
path_to_maxent.jar = getwd() : a character
corresponding to the location of the maxent.jar file
memory_allocated = 512 : an integer corresponding to
the amount of memory (in MB) reserved for java to run
MAXENT, must be 64, 128, 256,
512, 1024... or NULL to use the default java
memory limit
initial_heap_size = NULL : a character defining the initial heap
space (shared memory space) allocated to java. Argument passed to
-Xms when calling java. Used in BIOMOD_Projection but
not in BIOMOD_Modeling. Values can be 1024K,
4096M, 10G ... or NULL to use the default java
parameter
max_heap_size = NULL : a character defining the maximum heap
space (shared memory space) allocated to java. Argument passed to
-Xmx when calling java. Used in BIOMOD_Projection but
not in BIOMOD_Modeling. Must be larger than
initial_heap_size. Values can be 1024K, 4096M,
10G ... or NULL to use the default java parameter
background_data_dir : a character corresponding to
directory path where explanatory variables are stored as ASCII
files (raster format). If specified, MAXENT will generate
its own background data from explanatory variables rasters (as usually
done in MAXENT studies). Otherwise biomod2 pseudo-absences
will be used (see BIOMOD_FormatingData)
maximumbackground : an integer corresponding to the
maximum number of background data to sample if the
background_data_dir parameter has been set
maximumiterations = 200 : an integer corresponding
to the maximum number of iterations to do
visible = FALSE : a logical to make the
MAXENT user interface available
linear = TRUE : a logical to allow linear features
to be used
quadratic = TRUE : a logical to allow
quadratic features to be used
product = TRUE : a logical to allow product features
to be used
threshold = TRUE : a logical to allow threshold
features to be used
hinge = TRUE : a logical to allow hinge features to
be used
lq2lqptthreshold = 80 : an integer
corresponding to the number of samples at which product and threshold
features start being used
l2lqthreshold = 10 : an
integer corresponding to the number of samples at which quadratic
features start being used
hingethreshold = 15 : an
integer corresponding to the number of samples at which hinge
features start being used
beta_threshold = -1.0 : a
numeric corresponding to the regularization parameter to be applied
to all threshold features (negative value enables automatic
setting)
beta_categorical = -1.0 : a numeric
corresponding to the regularization parameter to be applied to all
categorical features (negative value enables automatic setting)
beta_lqp = -1.0 : a numeric corresponding to the
regularization parameter to be applied to all linear, quadratic and
product features (negative value enables automatic setting)
beta_hinge = -1.0 : a numeric corresponding to the
regularization parameter to be applied to all hinge features
(negative value enables automatic setting)
betamultiplier = 1 : a numeric to multiply all
automatic regularization parameters
(higher number gives a more
spread-out distribution)
defaultprevalence = 0.5 : a
numeric corresponding to the default prevalence of the species
(probability of presence at ordinary occurrence points)
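Because max_heap_size must be larger than initial_heap_size, it can help to check the two strings before building the options. The helper heap_to_bytes below is hypothetical (it is not part of biomod2) and only handles the K, M and G suffixes shown above; the option values are arbitrary.

```r
# Hypothetical helper: convert a java-style size string such as '4096M' to bytes
heap_to_bytes <- function(x) {
  stopifnot(grepl('^[0-9]+[KMG]$', x))
  units <- c(K = 1024, M = 1024^2, G = 1024^3)
  as.numeric(sub('[KMG]$', '', x)) * units[[sub('^[0-9]+', '', x)]]
}

maxent_opts <- list(path_to_maxent.jar = getwd(),  # assumes maxent.jar sits in the working directory
                    initial_heap_size = '1024M',
                    max_heap_size = '4096M',
                    maximumiterations = 200,
                    linear = TRUE,
                    quadratic = TRUE,
                    product = FALSE,
                    threshold = FALSE,
                    hinge = TRUE)

# Fail early if the heap sizes are inconsistent
heap_ok <- heap_to_bytes(maxent_opts$max_heap_size) >
  heap_to_bytes(maxent_opts$initial_heap_size)
stopifnot(heap_ok)
```

The maxent_opts list can be given as the MAXENT argument of BIOMOD_ModelingOptions.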
XGBOOST (default xgboost)
Please refer to xgboost help file for more details.
max.depth = 5
eta = 0.1
nrounds = 512
objective = "binary:logistic"
nthread = 1
Damien Georges, Wilfried Thuiller
This function allows advanced users to change some default parameters of biomod2's inner
models.
11 single models (one per argument of the function) are available within the package,
and their options can be set with this function through list objects.
The bm_DefaultModelingOptions function prints all default parameter values for
all available models.
This output can be copied and pasted, with any desired changes, to be used
as function arguments (see Examples).
Below is the detailed list of all modifiable parameters for each available model.
BIOMOD_Tuning, BIOMOD_Modeling
Other Main functions:
BIOMOD_EnsembleForecasting(),
BIOMOD_EnsembleModeling(),
BIOMOD_FormatingData(),
BIOMOD_LoadModels(),
BIOMOD_Modeling(),
BIOMOD_PresenceOnly(),
BIOMOD_Projection(),
BIOMOD_RangeSize(),
BIOMOD_Tuning()
library(biomod2)
library(terra)
# Load species occurrences (6 species available)
data(DataSpecies)
head(DataSpecies)
# Select the name of the studied species
myRespName <- 'GuloGulo'
# Get corresponding presence/absence data
myResp <- as.numeric(DataSpecies[, myRespName])
# Get corresponding XY coordinates
myRespXY <- DataSpecies[, c('X_WGS84', 'Y_WGS84')]
# Load environmental variables extracted from BIOCLIM (bio_3, bio_4, bio_7, bio_11 & bio_12)
data(bioclim_current)
myExpl <- terra::rast(bioclim_current)
# Crop to a smaller extent (keeps the example fast)
myExtent <- terra::ext(0, 30, 45, 70)
myExpl <- terra::crop(myExpl, myExtent)
# ---------------------------------------------------------------#
# Print default modeling options
bm_DefaultModelingOptions()
# Create default modeling options
myBiomodOptions <- BIOMOD_ModelingOptions()
myBiomodOptions
# # Part (or all) of the printed output can be copied and customized
# # Below is an example to compute a quadratic GLM and select the best model with the 'BIC' criterion
# myBiomodOptions <- BIOMOD_ModelingOptions(
# GLM = list(type = 'quadratic',
# interaction.level = 0,
# myFormula = NULL,
# test = 'BIC',
# family = 'binomial',
# control = glm.control(epsilon = 1e-08,
# maxit = 1000,
# trace = FALSE)))
# myBiomodOptions
#
# # It is also possible to give a specific GLM formula
# myForm <- 'Sp277 ~ bio3 + log(bio10) + poly(bio16, 2) + bio19 + bio3:bio19'
# myBiomodOptions <- BIOMOD_ModelingOptions(GLM = list(myFormula = formula(myForm)))
# myBiomodOptions