
biomod2 (version 4.2-5)

bm_FindOptimStat: Calculate the best score according to a given evaluation method

Description

This internal biomod2 function finds the threshold that, when used to convert continuous values into binary ones, yields the best score for a given evaluation metric.

Usage

bm_FindOptimStat(
  metric.eval = "TSS",
  obs,
  fit,
  nb.thresh = 100,
  threshold = NULL,
  boyce.bg.env = NULL,
  mpa.perc = 0.9
)

get_optim_value(metric.eval)

bm_CalculateStat(misc, metric.eval = "TSS")

Value

A data.frame with 1 row and 5 columns containing:

  • metric.eval : the chosen evaluation metric

  • cutoff : the associated cut-off used to transform the continuous values into binary

  • sensitivity : the sensitivity obtained on fitted values with this threshold

  • specificity : the specificity obtained on fitted values with this threshold

  • best.stat : the best score obtained for the chosen evaluation metric

Arguments

metric.eval

a character corresponding to the evaluation metric to be used, must be one of POD, FAR, POFD, SR, ACCURACY, BIAS, ROC, TSS, KAPPA, OR, ORSS, CSI, ETS, BOYCE, MPA

obs

a vector of observed values (binary, 0 or 1)

fit

a vector of fitted values (continuous)

nb.thresh

an integer corresponding to the number of thresholds to be tested over the range of fitted values

threshold

(optional, default NULL)
A numeric corresponding to the threshold used to convert the given data

boyce.bg.env

(optional, default NULL)
A matrix, data.frame, SpatVector or SpatRaster object containing values of environmental variables (in columns or layers) extracted from the background (if presences are to be compared to background instead of absences or pseudo-absences selected for modeling).
Note that older formats from raster and sp, such as RasterStack and SpatialPointsDataFrame objects, are still supported.

mpa.perc

a numeric between 0 and 1 corresponding to the percentage of correctly classified presences for Minimal Predicted Area (see ecospat.mpa() in ecospat)
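
To make the role of mpa.perc concrete, here is a minimal base R sketch of a cutoff that keeps a given proportion of presences above it. It takes the (1 - mpa.perc) quantile of the fitted values at presence points. The data are made up, and the use of quantile() with its default interpolation is an assumption for illustration, not a statement of how biomod2 or ecospat.mpa() computes the value internally.

```r
## Illustrative data (hypothetical): 10 presences followed by 5 absences
obs <- c(rep(1, 10), rep(0, 5))
fit <- c(seq(100, 1000, by = 100), 50, 150, 250, 350, 450)

## Proportion of presences that must be predicted as present
mpa.perc <- 0.9

## Assumption: the cutoff is the (1 - mpa.perc) quantile of fitted values at presences
cutoff <- quantile(fit[obs == 1], probs = 1 - mpa.perc, names = FALSE)

## Proportion of presences at or above the cutoff
captured <- mean(fit[obs == 1] >= cutoff)

print(cutoff)
print(captured)
```

With these values, the cutoff falls between the lowest and second-lowest presence scores, so exactly one presence out of ten is left below it.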

misc

a matrix corresponding to a contingency table
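
As an illustration of what such a contingency table can look like, the sketch below cross-tabulates binarized predictions against observations in base R. The data, the `>=` comparison, and the row and column ordering (presence first) are assumptions made for this example; the exact layout expected by bm_CalculateStat is not described on this page.

```r
## Illustrative data (hypothetical)
obs <- c(1, 1, 1, 0, 0, 0, 1, 0)
fit <- c(800, 650, 300, 200, 550, 100, 900, 400)

## Assumption: values at or above the cutoff are treated as predicted presences
cutoff <- 500
pred <- as.integer(fit >= cutoff)

## 2 x 2 table: rows = predicted, columns = observed, presence (1) listed first
misc <- table(
  predicted = factor(pred, levels = c(1, 0)),
  observed  = factor(obs,  levels = c(1, 0))
)

print(misc)
```

Here misc["1", "1"] counts hits, misc["1", "0"] false alarms, misc["0", "1"] misses, and misc["0", "0"] correct negatives.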

Author

Damien Georges

Details

simple

  • POD : Probability of detection (hit rate)

  • FAR : False alarm ratio

  • POFD : Probability of false detection (fall-out)

  • SR : Success ratio

  • ACCURACY : Accuracy (fraction correct)

  • BIAS : Bias score (frequency bias)

complex

  • ROC : Relative operating characteristic

  • TSS : True skill statistic (Hanssen and Kuipers discriminant, Peirce's skill score)

  • KAPPA : Cohen's Kappa (Heidke skill score)

  • OR : Odds Ratio

  • ORSS : Odds ratio skill score (Yule's Q)

  • CSI : Critical success index (threat score)

  • ETS : Equitable threat score (Gilbert skill score)

presence-only

  • BOYCE : Boyce index

  • MPA : Minimal predicted area (cutoff optimising MPA to predict 90% of presences)

The optimal value of each metric can be obtained with the get_optim_value function.
Please refer to the CAWRC website (section "Methods for dichotomous forecasts") for a detailed description of each metric.
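
For orientation, several of the metrics listed above can be written directly in terms of the four cells of a 2 x 2 contingency table. The base R sketch below uses the standard definitions for dichotomous forecasts; the count names (hits, false_alarms, misses, correct_neg) and the example counts are hypothetical, and the code is not taken from biomod2.

```r
## Hypothetical counts from a 2 x 2 contingency table
hits         <- 3  # predicted 1, observed 1
false_alarms <- 1  # predicted 1, observed 0
misses       <- 1  # predicted 0, observed 1
correct_neg  <- 3  # predicted 0, observed 0
n <- hits + false_alarms + misses + correct_neg

## Standard dichotomous-forecast definitions
POD      <- hits / (hits + misses)                 # probability of detection
FAR      <- false_alarms / (hits + false_alarms)   # false alarm ratio
POFD     <- false_alarms / (false_alarms + correct_neg)
SR       <- hits / (hits + false_alarms)           # success ratio
ACCURACY <- (hits + correct_neg) / n
BIAS     <- (hits + false_alarms) / (hits + misses)
TSS      <- POD - POFD                             # sensitivity + specificity - 1
CSI      <- hits / (hits + misses + false_alarms)
ORSS     <- (hits * correct_neg - misses * false_alarms) /
            (hits * correct_neg + misses * false_alarms)

scores <- c(POD = POD, FAR = FAR, POFD = POFD, SR = SR, ACCURACY = ACCURACY,
            BIAS = BIAS, TSS = TSS, CSI = CSI, ORSS = ORSS)
print(round(scores, 3))
```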

Note that if a value is given to threshold, no optimisation will be done, and only the score for this threshold will be returned.
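
The optimisation itself can be pictured as a sweep over candidate cutoffs. The following base R sketch evaluates TSS on an evenly spaced grid of nb.thresh values spanning the range of fit and keeps the best one. The helper tss_at(), the evenly spaced grid, the `>=` comparison, and the rule of keeping the first maximum in case of ties are all assumptions made for illustration; biomod2's internal procedure may differ.

```r
## Illustrative data (hypothetical)
obs <- c(0, 0, 0, 0, 1, 0, 1, 1, 1, 1)
fit <- c(50, 120, 200, 310, 380, 450, 600, 720, 850, 940)

## Hypothetical helper: TSS obtained when binarizing fit at a given cutoff
tss_at <- function(cutoff, obs, fit) {
  pred <- fit >= cutoff
  sens <- sum(pred & obs == 1) / sum(obs == 1)
  spec <- sum(!pred & obs == 0) / sum(obs == 0)
  sens + spec - 1
}

## Assumption: candidate cutoffs are evenly spaced over the range of fit
nb.thresh <- 100
cutoffs <- seq(min(fit), max(fit), length.out = nb.thresh)
scores <- vapply(cutoffs, tss_at, numeric(1), obs = obs, fit = fit)

## Keep the first cutoff reaching the maximum score
best <- which.max(scores)
result <- data.frame(cutoff = cutoffs[best], best.stat = scores[best])
print(result)

## With a fixed threshold, no sweep is needed: only one score is computed
print(tss_at(450, obs, fit))
```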

The Boyce index returns NA values for SRE models because it cannot be calculated with binary predictions.
This is also the reason why some NA values might appear for GLM models if they do not converge.

References

  • Engler, R., Guisan, A., and Rechsteiner, L. 2004. An improved approach for predicting the distribution of rare and endangered species from occurrence and pseudo-absence data. Journal of Applied Ecology, 41(2), 263-274.

  • Hirzel, A. H., Le Lay, G., Helfer, V., Randin, C., and Guisan, A. 2006. Evaluating the ability of habitat suitability models to predict species presences. Ecological Modelling, 199(2), 142-152.

See Also

ecospat.boyce() and ecospat.mpa() in ecospat, BIOMOD_Modeling, bm_RunModelsLoop, BIOMOD_EnsembleModeling

Other Secondary functions: bm_BinaryTransformation(), bm_CrossValidation(), bm_MakeFormula(), bm_ModelingOptions(), bm_PlotEvalBoxplot(), bm_PlotEvalMean(), bm_PlotRangeSize(), bm_PlotResponseCurves(), bm_PlotVarImpBoxplot(), bm_PseudoAbsences(), bm_RunModelsLoop(), bm_SRE(), bm_SampleBinaryVector(), bm_SampleFactorLevels(), bm_Tuning(), bm_VariablesImportance()

Examples

library(biomod2)

## Generate a binary vector
vec.a <- sample(c(0, 1), 100, replace = TRUE)

## Generate a 0-1000 vector (random drawing)
vec.b <- runif(100, min = 0, max = 1000)

## Generate a 0-1000 vector (biased drawing)
BiasedDrawing <- function(x, m1 = 300, sd1 = 200, m2 = 700, sd2 = 200) {
  return(ifelse(x < 0.5, rnorm(1, m1, sd1), rnorm(1, m2, sd2)))
}
vec.c <- sapply(vec.a, BiasedDrawing)
vec.c[which(vec.c < 0)] <- 0
vec.c[which(vec.c > 1000)] <- 1000

## Find optimal threshold for a specific evaluation metric
bm_FindOptimStat(metric.eval = 'TSS', fit = vec.b, obs = vec.a)
bm_FindOptimStat(metric.eval = 'TSS', fit = vec.c, obs = vec.a, nb.thresh = 100)
bm_FindOptimStat(metric.eval = 'TSS', fit = vec.c, obs = vec.a, threshold = 280)

