Parametrize and/or tune the options of biomod2's single models.
BIOMOD_ModelingOptions(
  GLM = NULL,
  GBM = NULL,
  GAM = NULL,
  CTA = NULL,
  ANN = NULL,
  SRE = NULL,
  FDA = NULL,
  MARS = NULL,
  RF = NULL,
  MAXENT = NULL,
  XGBOOST = NULL
)

bm_DefaultModelingOptions()
A BIOMOD.models.options object that can be used to build species distribution model(s) with the BIOMOD_Modeling function.
GLM: (optional, default NULL) A list containing GLM options
GBM: (optional, default NULL) A list containing GBM options
GAM: (optional, default NULL) A list containing GAM options
CTA: (optional, default NULL) A list containing CTA options
ANN: (optional, default NULL) A list containing ANN options
SRE: (optional, default NULL) A list containing SRE options
FDA: (optional, default NULL) A list containing FDA options
MARS: (optional, default NULL) A list containing MARS options
RF: (optional, default NULL) A list containing RF options
MAXENT: (optional, default NULL) A list containing MAXENT options
XGBOOST: (optional, default NULL) A list containing XGBOOST options
GLM (glm)
myFormula: a typical formula object (see Examples). If not NULL, the type and interaction.level parameters are switched off. You can choose to either:
  generate the GLM formula automatically with the following parameters:
    type = 'quadratic': formula given to the model, must be simple, quadratic or polynomial
    interaction.level = 0: an integer corresponding to the interaction level between the considered variables (be aware that interactions quickly enlarge the number of effective variables used in the GLM!)
  or construct a specific formula
test = 'AIC': information criterion for the stepwise selection procedure, must be AIC (Akaike Information Criterion), BIC (Bayesian Information Criterion) or none (consider only the full model, no stepwise selection, but this can lead to convergence issues and strange results!)
family = binomial(link = 'logit'): a character defining the error distribution and link function to be used in the model, must be a family name, a family function or the result of a call to a family function (see family) (so far, biomod2 only runs on presence-absence data, so the binomial family is the default!)
control: a list of parameters to control the fitting process (passed to glm.control)
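To give a feel for what type = 'quadratic' and test = 'BIC' imply, the following is a minimal base R sketch, not biomod2's internal code. The data frame dat, its columns and the helper make_quadratic_formula are hypothetical names introduced for illustration. It builds a formula with linear and quadratic terms and runs a stepwise selection with stats::step, where k = log(n) corresponds to the BIC penalty.

```r
# Hypothetical presence/absence data with two explanatory variables
set.seed(42)
n <- 200
dat <- data.frame(bio3 = rnorm(n), bio4 = rnorm(n))
dat$sp <- rbinom(n, size = 1, prob = plogis(0.5 + 1.5 * dat$bio3 - dat$bio3^2))

# Hypothetical helper: one linear and one quadratic term per explanatory variable
make_quadratic_formula <- function(resp, vars) {
  term_labels <- c(vars, paste0("I(", vars, "^2)"))
  reformulate(term_labels, response = resp)
}

full_formula <- make_quadratic_formula("sp", c("bio3", "bio4"))
print(full_formula)

# Stepwise selection from the null model towards the full quadratic model.
# k = log(n) is the BIC penalty; k = 2 (the default of step) would give AIC.
null_model <- glm(sp ~ 1, data = dat, family = binomial(link = "logit"))
best_model <- step(null_model,
                   scope = list(lower = ~ 1, upper = full_formula),
                   direction = "both", k = log(n), trace = 0)
print(formula(best_model))
```

With test = 'none', only the full formula would be fitted, which is why convergence problems are more likely in that case.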
GBM (default gbm)
Please refer to the gbm help file for more details.
distribution = 'bernoulli'
n.trees = 2500
interaction.depth = 7
n.minobsinnode = 5
shrinkage = 0.001
bag.fraction = 0.5
train.fraction = 1
cv.folds = 3
keep.data = FALSE
verbose = FALSE
perf.method = 'cv'
n.cores = 1
GAM (gam::gam or mgcv::gam)
algo = 'GAM_gam': a character defining the chosen GAM function, must be GAM_gam (see gam::gam), GAM_mgcv (see mgcv::gam) or BAM_mgcv (see mgcv::bam)
myFormula: a typical formula object (see Examples). If not NULL, the type and interaction.level parameters are switched off. You can choose to either:
  generate the GAM formula automatically with the following parameters:
    type = 's_smoother': the smoother used to generate the formula
    interaction.level = 0: an integer corresponding to the interaction level between the considered variables (be aware that interactions quickly enlarge the number of effective variables used in the GAM!)
  or construct a specific formula
k = -1: the value passed to the smooth terms of the GAM formula, must be -1 or 4 (see gam::s or mgcv::s)
family = binomial(link = 'logit'): a character defining the error distribution and link function to be used in the model, must be a family name, a family function or the result of a call to a family function (see family) (so far, biomod2 only runs on presence-absence data, so the binomial family is the default!)
control: a list of parameters to control the fitting process (passed to gam::gam.control or mgcv::gam.control)
some options specific to GAM_mgcv (ignored if algo = 'GAM_gam'):
  method = 'GCV.Cp'
  optimizer = c('outer','newton')
  select = FALSE
  knots = NULL
  paramPen = NULL
CTA (rpart)
Please refer to the rpart help file for more details.
method = 'class'
parms = 'default': if 'default', the default rpart parms values are kept
cost = NULL
control: see rpart.control
ANN (nnet)
NbCV = 5: an integer corresponding to the number of cross-validation repetitions used to find the best size and decay parameters
size = NULL: an integer corresponding to the number of units in the hidden layer. If NULL, the size parameter is optimized by cross-validation based on model AUC (NbCV cross-validations; the tested sizes are c(2, 4, 6, 8)). It is also possible to give a vector of size values to be tested, in which case the one giving the best model AUC is kept.
decay = NULL: a numeric corresponding to weight decay. If NULL, the decay parameter is optimized by cross-validation based on model AUC (NbCV cross-validations; the tested decay values are c(0.001, 0.01, 0.05, 0.1)). It is also possible to give a vector of decay values to be tested, in which case the one giving the best model AUC is kept.
rang = 0.1: a numeric corresponding to the initial random weights on [-rang, rang]
maxit = 200: an integer corresponding to the maximum number of iterations
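The tuning of size and decay described above can be pictured with the following sketch. It is an illustration under stated assumptions, not biomod2's implementation: the data are simulated, the auc helper is hypothetical, and plain k-fold splitting is used here, whereas biomod2's own resampling scheme may differ. Only nnet::nnet and its size, decay, rang and maxit arguments are taken from the nnet package.

```r
library(nnet)  # recommended package shipped with most R installations

# Simulated presence/absence data (hypothetical)
set.seed(123)
n <- 300
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$sp <- rbinom(n, size = 1, prob = plogis(2 * dat$x1 - dat$x2))

# Hypothetical helper: AUC from the rank-sum (Mann-Whitney) statistic
auc <- function(obs, pred) {
  r <- rank(pred)
  n1 <- sum(obs == 1)
  n0 <- sum(obs == 0)
  (sum(r[obs == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

NbCV <- 5
grid <- expand.grid(size = c(2, 4, 6, 8), decay = c(0.001, 0.01, 0.05, 0.1))
folds <- sample(rep(seq_len(NbCV), length.out = n))  # k-fold assignment (assumption)

# Mean held-out AUC for every (size, decay) combination
grid$auc <- apply(grid, 1, function(g) {
  mean(sapply(seq_len(NbCV), function(k) {
    fit <- nnet(sp ~ x1 + x2, data = dat[folds != k, ],
                size = g[["size"]], decay = g[["decay"]],
                rang = 0.1, maxit = 200, trace = FALSE)
    pred <- as.vector(predict(fit, dat[folds == k, ]))
    auc(dat$sp[folds == k], pred)
  }))
})

best <- grid[which.max(grid$auc), ]
print(best)
```

Passing explicit size and decay values to BIOMOD_ModelingOptions skips this search, which can save time when good values are already known.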
SRE (bm_SRE)
quant = 0.025: a numeric corresponding to the quantile of 'extreme environmental variable' values removed to select the species envelopes
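As a rough illustration of the quant parameter, the base R sketch below builds a rectangular envelope from presence data by discarding the lowest and highest quant fraction of each variable. The data, the variable names temp and prec, and the helper in_envelope are hypothetical; this shows the general surface range envelope idea and is not a copy of bm_SRE.

```r
# Hypothetical values of two explanatory variables at presence points
set.seed(1)
presences <- data.frame(temp = rnorm(500, mean = 10, sd = 3),
                        prec = rnorm(500, mean = 800, sd = 150))
quant <- 0.025

# Envelope bounds: drop the lowest and highest `quant` fraction of each variable
envelope <- sapply(presences, quantile, probs = c(quant, 1 - quant))
print(envelope)

# Hypothetical helper: a site is inside the envelope only if every
# variable lies between its lower and upper bound
in_envelope <- function(site, envelope) {
  all(site >= envelope[1, names(site)] & site <= envelope[2, names(site)])
}

in_envelope(c(temp = 10, prec = 800), envelope)  # central conditions
in_envelope(c(temp = 25, prec = 800), envelope)  # temperature far outside the range
```

A larger quant gives a narrower envelope, so fewer sites are predicted as suitable.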
FDA (fda)
Please refer to the fda help file for more details.
method = 'mars'
add_args = NULL: a list of additional parameters for method, passed through the ... argument of the fda function
MARS (earth)
Please refer to the earth help file for more details.
myFormula: a typical formula object (see Examples). If not NULL, the type and interaction.level parameters are switched off. You can choose to either:
  generate the MARS formula automatically with the following parameters:
    type = 'simple': formula given to the model, must be simple, quadratic or polynomial
    interaction.level = 0: an integer corresponding to the interaction level between the considered variables (be aware that interactions quickly enlarge the number of effective variables used in the MARS!)
  or construct a specific formula
nk = NULL: an integer corresponding to the maximum number of model terms. If NULL, the default MARS function value is used: max(21, 2 * nb_expl_var + 1)
penalty = 2
thresh = 0.001
nprune = NULL
pmethod = 'backward'
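The default for nk stated above can be worked through for two cases. The function name default_nk is hypothetical and only wraps the formula given in the text.

```r
# Default maximum number of MARS terms when nk = NULL, as stated above
default_nk <- function(nb_expl_var) max(21, 2 * nb_expl_var + 1)

default_nk(5)   # 2 * 5 + 1 = 11, so the floor of 21 applies
default_nk(15)  # 2 * 15 + 1 = 31, which exceeds 21
```

With the five bioclimatic variables used in the Examples, the floor of 21 terms would therefore apply.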
RF (randomForest)
do.classif = TRUE: if TRUE, random forest classification is computed, otherwise random forest regression is done
ntree = 500
mtry = 'default'
sampsize = NULL
nodesize = 5
maxnodes = NULL
MAXENT (https://biodiversityinformatics.amnh.org/open_source/maxent/)
path_to_maxent.jar = getwd(): a character corresponding to the location of the maxent.jar file
memory_allocated = 512: an integer corresponding to the amount of memory (in MB) reserved for java to run MAXENT, must be 64, 128, 256, 512, 1024... or NULL to use the default java memory limitation parameter
initial_heap_size = NULL: a character corresponding to the initial heap space (shared memory space) allocated to java. Argument passed to -Xms when calling java. Used in BIOMOD_Projection but not in BIOMOD_Modeling. Values can be 1024K, 4096M, 10G... or NULL to use the default java parameter
max_heap_size = NULL: a character corresponding to the maximum heap space (shared memory space) allocated to java. Argument passed to -Xmx when calling java. Used in BIOMOD_Projection but not in BIOMOD_Modeling. Must be larger than initial_heap_size. Values can be 1024K, 4096M, 10G... or NULL to use the default java parameter
background_data_dir: a character corresponding to the directory path where explanatory variables are stored as ASCII files (raster format). If specified, MAXENT will generate its own background data from explanatory variables rasters (as usually done in MAXENT studies). Otherwise biomod2 pseudo-absences will be used (see BIOMOD_FormatingData)
maximumbackground: an integer corresponding to the maximum number of background data to sample if the background_data_dir parameter has been set
maximumiterations = 200: an integer corresponding to the maximum number of iterations to do
visible = FALSE: a logical to make the MAXENT user interface available
linear = TRUE: a logical to allow linear features to be used
quadratic = TRUE: a logical to allow quadratic features to be used
product = TRUE: a logical to allow product features to be used
threshold = TRUE: a logical to allow threshold features to be used
hinge = TRUE: a logical to allow hinge features to be used
lq2lqptthreshold = 80: an integer corresponding to the number of samples at which product and threshold features start being used
l2lqthreshold = 10: an integer corresponding to the number of samples at which quadratic features start being used
hingethreshold = 15: an integer corresponding to the number of samples at which hinge features start being used
beta_threshold = -1.0: a numeric corresponding to the regularization parameter to be applied to all threshold features (negative value enables automatic setting)
beta_categorical = -1.0: a numeric corresponding to the regularization parameter to be applied to all categorical features (negative value enables automatic setting)
beta_lqp = -1.0: a numeric corresponding to the regularization parameter to be applied to all linear, quadratic and product features (negative value enables automatic setting)
beta_hinge = -1.0: a numeric corresponding to the regularization parameter to be applied to all hinge features (negative value enables automatic setting)
betamultiplier = 1: a numeric to multiply all automatic regularization parameters (higher number gives a more spread-out distribution)
defaultprevalence = 0.5: a numeric corresponding to the default prevalence of the species (probability of presence at ordinary occurrence points)
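The relation between the initial_heap_size and max_heap_size options above and the java flags -Xms and -Xmx can be sketched as follows. The helper java_heap_args is hypothetical and only illustrates how such strings map to command-line flags; it does not reproduce how biomod2 assembles its java call.

```r
# Hypothetical helper: turn the two heap options into java command-line flags
java_heap_args <- function(initial_heap_size = NULL, max_heap_size = NULL) {
  args <- character(0)
  if (!is.null(initial_heap_size)) args <- c(args, paste0("-Xms", initial_heap_size))
  if (!is.null(max_heap_size)) args <- c(args, paste0("-Xmx", max_heap_size))
  args
}

# NULL options produce no flag, so the java defaults apply
java_heap_args()

# Initial heap of 1024K and maximum heap of 4096M
java_heap_args(initial_heap_size = "1024K", max_heap_size = "4096M")
```

Since the maximum must be larger than the initial size, a pair such as "1024K" and "4096M" is valid, while swapping the two values would not be.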
XGBOOST (default xgboost)
Please refer to the xgboost help file for more details.
max.depth = 5
eta = 0.1
nrounds = 512
objective = "binary:logistic"
nthread = 1
Damien Georges, Wilfried Thuiller
This function allows advanced users to change some default parameters of biomod2's inner models.
Eleven single models are available within the package, and their options can be set with this function through list objects.
The bm_DefaultModelingOptions function prints all default parameter values for all available models.
This output can be copied and pasted to be used as is (with the desired changes) as function arguments (see Examples).
The sections for each model give the detailed list of all modifiable parameters.
BIOMOD_Tuning, BIOMOD_Modeling
Other Main functions: BIOMOD_EnsembleForecasting(), BIOMOD_EnsembleModeling(), BIOMOD_FormatingData(), BIOMOD_LoadModels(), BIOMOD_Modeling(), BIOMOD_PresenceOnly(), BIOMOD_Projection(), BIOMOD_RangeSize(), BIOMOD_Tuning()
library(biomod2)
library(terra)
# Load species occurrences (6 species available)
data(DataSpecies)
head(DataSpecies)
# Select the name of the studied species
myRespName <- 'GuloGulo'
# Get corresponding presence/absence data
myResp <- as.numeric(DataSpecies[, myRespName])
# Get corresponding XY coordinates
myRespXY <- DataSpecies[, c('X_WGS84', 'Y_WGS84')]
# Load environmental variables extracted from BIOCLIM (bio_3, bio_4, bio_7, bio_11 & bio_12)
data(bioclim_current)
myExpl <- terra::rast(bioclim_current)
# Crop the environmental variables to a smaller extent
myExtent <- terra::ext(0,30,45,70)
myExpl <- terra::crop(myExpl, myExtent)
# ---------------------------------------------------------------#
# Print default modeling options
bm_DefaultModelingOptions()
# Create default modeling options
myBiomodOptions <- BIOMOD_ModelingOptions()
myBiomodOptions
# # Part (or totality) of the print can be copied and customized
# # Below is an example to compute a quadratic GLM and select the best model with the 'BIC' criterion
# myBiomodOptions <- BIOMOD_ModelingOptions(
# GLM = list(type = 'quadratic',
# interaction.level = 0,
# myFormula = NULL,
# test = 'BIC',
# family = 'binomial',
# control = glm.control(epsilon = 1e-08,
# maxit = 1000,
# trace = FALSE)))
# myBiomodOptions
#
# # It is also possible to give a specific GLM formula
# myForm <- 'Sp277 ~ bio3 + log(bio10) + poly(bio16, 2) + bio19 + bio3:bio19'
# myBiomodOptions <- BIOMOD_ModelingOptions(GLM = list(myFormula = formula(myForm)))
# myBiomodOptions