This function creates a DataSplitTable that can be used to evaluate models in Biomod with repeated k-fold cross-validation (cv) or stratified cv instead of repeated split-sample runs.
BIOMOD_cv(
data,
k = 5,
repetition = 5,
do.full.models = TRUE,
stratified.cv = FALSE,
stratify = "both",
balance = "pres"
)
A DataSplitTable matrix with k*repetition columns (+ 1 for full models if do.full.models = TRUE) for the BIOMOD_Modeling function. Stratification "x" and "y" were described in Wenger and Olden 2012: "y" uses k partitions along the y-gradient, "x" does the same along the x-gradient, and "both" combines them. Stratification "block" was described in Muscarella et al. 2014: four bins of equal size are partitioned (bottom-left, bottom-right, top-left and top-right).
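The "block" partition described above can be illustrated with a minimal base-R sketch. The helper `block_bins` is hypothetical and is not taken from biomod2 or ENMeval; it assumes the four bins are obtained by splitting the points at the median of y and then splitting each half at its own median of x, which may differ from what the packages actually do.

```r
# Hypothetical helper, for illustration only (not the biomod2/ENMeval code).
block_bins <- function(x, y) {
  top <- y > median(y)                  # bottom half vs. top half
  bin <- integer(length(x))
  for (half in c(FALSE, TRUE)) {
    idx <- which(top == half)
    right <- x[idx] > median(x[idx])    # left vs. right within each half
    bin[idx] <- 1L + 2L * half + right
  }
  bin  # 1 = bottom-left, 2 = bottom-right, 3 = top-left, 4 = top-right
}

set.seed(42)
x <- runif(100)
y <- runif(100)
bins <- block_bins(x, y)
table(bins)  # number of points per bin
```

With an even number of points and no tied coordinates, this median-based split puts the same number of points in each of the four bins.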
data: BIOMOD.formated.data object returned by BIOMOD_FormatingData
k: number of bins/partitions for k-fold cv
repetition: number of repetitions of k-fold cv (1 if stratified.cv = TRUE)
do.full.models: logical. If TRUE, models calibrated and evaluated with the whole dataset are also built
stratified.cv: logical. Run a stratified cv
stratify: stratification method of the cv. Can be "x", "y", "both" (default), "block" or the name of a predictor for environmentally stratified cv
balance: make balanced partitions for "presences" (default) or "absences" (resp. pseudo-absences or background)
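How k, repetition and do.full.models determine the shape of the resulting table can be shown with a minimal base-R sketch. The helper `make_cv_table` and the column names are assumptions made for illustration; this is not the biomod2 implementation. It follows the BIOMOD_Modeling convention that TRUE marks rows used for calibration and FALSE marks rows held out for evaluation.

```r
# Hypothetical helper, base R only; not the biomod2 implementation.
make_cv_table <- function(n, k = 5, repetition = 5, do.full.models = TRUE) {
  cols <- list()
  for (r in seq_len(repetition)) {
    # Assign every row to one of k folds at random
    fold <- sample(rep(seq_len(k), length.out = n))
    for (f in seq_len(k)) {
      # TRUE = calibration row, FALSE = evaluation row for this run
      cols[[paste0("RUN", (r - 1) * k + f)]] <- fold != f
    }
  }
  # Optional extra column in which every row is used for calibration
  if (do.full.models) cols[["Full"]] <- rep(TRUE, n)
  do.call(cbind, cols)
}

set.seed(1)
tab <- make_cv_table(n = 20, k = 5, repetition = 2)
dim(tab)  # 20 rows, k * repetition + 1 = 11 columns
```

Within one repetition each row is held out exactly once, so every observation is used for evaluation in one of the k runs.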
Frank Breiner frank.breiner@wsl.ch
Stratified cv can be used to test for model overfitting and to assess transferability in geographic and environmental space. If balance = "presences", presences are divided (balanced) equally over the partitions (e.g. Fig. 1b in Muscarella et al. 2014). Pseudo-absences will, however, be unbalanced over the partitions, especially if the presences are clumped on an edge of the study area. If balance = "absences", absences (resp. pseudo-absences or background) are divided (balanced) as equally as possible over the partitions (geographically balanced bins, given that absences are spread equally over the study area; an approach similar to Fig. 1 in Wenger and Olden 2012). Presences will, however, be unbalanced over the partitions. Be careful: if the presences are clumped on an edge of the study area, it is possible that all presences fall in one bin.
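The effect of the balance choice can be illustrated with a minimal base-R sketch of a stratification along the y-gradient. The helper `y_bins` and the simulated data are assumptions made for illustration, not the biomod2 code: bin edges are taken as quantiles of y computed on the balanced group only, and presences are deliberately clumped at the upper edge of the study area.

```r
# Hypothetical helper: k bins along y holding equal counts of the balanced group.
y_bins <- function(y, balanced, k = 5) {
  # Bin edges are quantiles of y computed on the balanced subset only
  edges <- quantile(y[balanced], probs = seq(0, 1, length.out = k + 1))
  edges[1] <- -Inf      # open the outer bins so that every point is assigned
  edges[k + 1] <- Inf
  cut(y, breaks = edges, labels = FALSE)
}

set.seed(7)
pres <- c(rep(TRUE, 50), rep(FALSE, 200))
# Simulated y-coordinates: presences clumped near the top, absences spread out
y <- c(runif(50, 0.8, 1), runif(200, 0, 1))

bins_pres <- y_bins(y, balanced = pres)   # balance on presences
bins_abs  <- y_bins(y, balanced = !pres)  # balance on absences

table(bins_pres[pres])   # presences per bin when balancing on presences
table(bins_pres[!pres])  # absences per bin in the same partition
table(bins_abs[pres])    # presences per bin when balancing on absences
```

Balancing on presences gives every bin the same number of presences but a very uneven number of absences; balancing on absences does the opposite, and the clumped presences end up in the uppermost bins only.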
Muscarella, R., Galante, P.J., Soley-Guardia, M., Boria, R.A., Kass, J.M., Uriarte, M. & Anderson, R.P. (2014). ENMeval: An R package for conducting spatially independent evaluations and estimating optimal model complexity for Maxent ecological niche models. Methods in Ecology and Evolution, 5, 1198-1205.
Wenger, S.J. & Olden, J.D. (2012). Assessing transferability of ecological models: an underappreciated aspect of statistical validation. Methods in Ecology and Evolution, 3, 260-267.
get.block