
tidysdm

The goal of tidysdm is to implement Species Distribution Models (SDMs) using the tidymodels framework. The advantage of tidymodels is that the model syntax and the results returned to the user are standardised, thus providing a coherent interface to modelling. Given the variety of models required for SDMs, tidymodels is an ideal framework. tidysdm provides a number of wrappers and specialised functions to facilitate the fitting of SDMs with tidymodels.
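As a rough illustration of what a standardised interface means in practice, the sketch below creates two of the model specifications listed in the function index (sdm_spec_glm and sdm_spec_rand_forest) and checks that both are ordinary tidymodels model specifications. It assumes tidysdm and its tidymodels dependencies are installed, and it skips quietly if they are not; the printed output depends on the installed versions.

```r
# Minimal sketch: assumes tidysdm is installed; nothing runs otherwise.
if (requireNamespace("tidysdm", quietly = TRUE)) {
  # Two different model types, created through the same kind of call
  glm_spec <- tidysdm::sdm_spec_glm()
  rf_spec <- tidysdm::sdm_spec_rand_forest()

  # Both print in the usual tidymodels (parsnip) format
  print(glm_spec)
  print(rf_spec)

  # Both should be parsnip model specifications
  specs_share_class <- inherits(glm_spec, "model_spec") &&
    inherits(rf_spec, "model_spec")
}
```

Because both objects are plain model specifications, they can be combined in the same workflows and tuned with the same tidymodels tools.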

Besides modelling contemporary species, tidysdm has a number of functions specifically designed to work with palaeontological data.

Whilst users are free to use their own environmental data, the articles showcase the potential integration with pastclim, which helps with downloading and manipulating present-day data, future predictions, and palaeoclimate reconstructions.

An overview of the capabilities of tidysdm is given in Leonardi et al. (2023).

Installation

tidysdm is on CRAN, and the easiest way to install it is with:

install.packages("tidysdm")

The version on CRAN is recommended for everyday use. New features and bug fixes appear first on the dev branch on GitHub, before they make their way to CRAN. If you need early access to these new features, you can install the latest dev version of tidysdm from r-universe with:

install.packages("tidysdm", repos = c("https://evolecolgroup.r-universe.dev", 
                                      "https://cloud.r-project.org"))

Alternatively, you can also use devtools and install the package from source directly from GitHub, but you might need to set up your development environment first:

# install.packages("devtools") # if you haven't installed devtools yet
devtools::install_github("EvolEcolGroup/tidysdm", ref = "dev")

Overview of functionality

On its dedicated website, you can find Articles giving you a step-by-step overview of fitting SDMs to contemporary species, as well as an equivalent tutorial for using palaeontological data. Furthermore, there is an Article with examples of how to leverage various features of tidymodels that are not commonly adopted in SDM pipelines.

There is also a dev version of the site updated for the dev branch of tidysdm (on the top left of the dev website, the version number is in red and in the format x.x.x.9xxx, indicating it is a development version). If you want to contribute, make sure to read our contributing guide.
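The x.x.x.9xxx convention can also be checked from within R. The helper below is a small base R sketch; the name is_dev_version is hypothetical, and it assumes that a fourth version component of 9000 or more marks a development build, as described above.

```r
# Hypothetical helper: TRUE if a version string follows the x.x.x.9xxx pattern
is_dev_version <- function(version) {
  # package_version() parses the string; unclass() exposes the integer parts
  parts <- unclass(package_version(version))[[1]]
  length(parts) >= 4 && parts[4] >= 9000
}

is_dev_version("1.0.0")       # a CRAN-style release number
is_dev_version("1.0.0.9000")  # a development-style number

# With tidysdm installed, the same check could be applied to the local copy:
# is_dev_version(as.character(utils::packageVersion("tidysdm")))
```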


Getting help

If some of your models are failing, first look at our Article on how to diagnose failing models. It is not a fully comprehensive list of everything that could go wrong, but it will hopefully give you ideas on how to dig deeper into what is wrong.

If after reading the article you are still unsure what is going wrong, there are two places to get help with tidysdm:

  1. If you are unsure how to do something, go to StackOverflow and, after checking that a similar question has not been asked yet, tag your question with tidymodels and r (there is no tidysdm tag yet, as there aren't enough questions), and make sure tidysdm is in the title. This will ensure that the developers see your question and can help you. If you have not received an answer after a couple of days, feel free to drop us an email in case we missed your post.

  2. If you think you have found a bug, or have a feature request, open an issue on our [GitHub repository](https://github.com/EvolEcolGroup/tidysdm/issues). Before doing so, please make sure that you have installed the latest development version of tidysdm (as the bug might have already been fixed), and that you have updated all other packages on your system. If the problem persists, and there is no issue already opened that deals with your bug, file a new issue providing a reproducible example for the developers to investigate the problem. A small reproducible example is crucial in allowing us to understand the problem and fix it, so please do your best to come up with the shortest bit of code needed to show the bug.
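When preparing a question or an issue, it usually helps to state which versions you are running alongside the reproducible example. The base R sketch below collects that information; the function name bug_report_versions and the default package list are illustrative assumptions, not part of tidysdm.

```r
# Hypothetical helper: gather R and package versions for a bug report
bug_report_versions <- function(pkgs = c("tidysdm", "tidymodels")) {
  pkg_versions <- vapply(pkgs, function(pkg) {
    # system.file() returns "" when a package is not installed
    if (nzchar(system.file(package = pkg))) {
      as.character(utils::packageVersion(pkg))
    } else {
      "not installed"
    }
  }, character(1))
  # Named character vector: R first, then one entry per package
  c(R = R.version.string, pkg_versions)
}

bug_report_versions()
```

Pasting this output (or the fuller output of sessionInfo()) below the shortest code that triggers the problem gives the developers most of what they need to reproduce it.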

Monthly Downloads: 561
Version: 1.0.0
License: AGPL (>= 3)
Maintainer: Andrea Manica
Last Published: March 5th, 2025

Functions in tidysdm (1.0.0)

explain_tidysdm

Create explainer from your tidysdm ensembles.
dist_pres_vs_bg

Distance between the distribution of climate values for presences vs background
conf_matrix_df

Make a confusion matrix dataframe for multiple thresholds
autoplot.spatial_initial_split

Create a ggplot for a spatial initial rsplit.
filter_collinear

Filter to retain only variables that have low collinearity
extrapol_mess

Multivariate environmental similarity surfaces (MESS)
gam_formula

Create a formula for gam
control_ensemble_grid

Control wrappers
geom_split_violin

Split violin geometry for ggplots
grid_offset

Get default grid offset for a given dataset
grid_cellsize

Get default grid cellsize for a given dataset
maxnet_fit

Wrapper to fit maxnet models with formulae
maxnet_predict

Wrapper to predict maxnet models
lacertidae_background

Coordinates of presences for lacertidae in the Iberian peninsula
collect_metrics.simple_ensemble

Obtain and format results produced by tuning functions for ensemble objects
optim_thresh_kap_max

Find threshold that maximises Kappa
clamp_predictors

Clamp the predictors to match values in training set
pairs,stars-method

Pairwise matrix of scatterplots for stars objects
make_mask_from_presence

Make a mask from presence data
maxent_params

Parameters for maxent models
kap_max

Maximum Cohen's Kappa
maxent

MaxEnt model
horses

Coordinates of radiocarbon dates for horses
filter_high_cor

Deprecated: Filter to retain only variables below a given correlation threshold
form_resp

Get the response variable from a formula
lacerta_ensemble

A simple ensemble for the lacerta data
%>%

Pipe operator
niche_overlap

Compute overlap metrics of the two niches
km2m

Convert a geographic distance from km to m
optim_thresh

Find threshold that optimises a given metric
lacerta

Coordinates of presences for Iberian emerald lizard
sdm_spec_gam

Model specification for a GAM for SDM
sdm_spec_glm

Model specification for a GLM for SDM
optim_thresh_sens

Find threshold that gives a target sensitivity
tidysdm-package

tidysdm: Species Distribution Models with Tidymodels
repeat_ensemble

Repeat ensemble
recipe.sf

Recipe for sf objects
predict.simple_ensemble

Predict for a simple ensemble set
tss

TSS - True Skill Statistics
predict_raster

Make predictions for a whole raster
optim_thresh_tss_max

Find threshold that maximises TSS
tss_max

Maximum TSS - True Skill Statistics
simple_ensemble

Simple ensemble
spatial_initial_split

Simple Training/Test Set Splitting for spatial data
out_of_range_warning

Warn if some times are outside the range of time steps from a raster
y2d

Convert a time interval from years to days
thin_by_dist

Thin points dataset based on geographic distance
thin_by_dist_time

Thin points dataset based on geographic and temporal distance
sdm_metric_set

Metric set for SDM
sdm_spec_boost_tree

Model specification for a Boosted Trees model for SDM
prob_metrics_sf

Probability metrics for sf objects
prob_to_binary

Simple function to convert probability to binary classes
thin_by_cell

Thin point dataset to have 1 observation per raster cell
lacerta_rep_ens

A repeat ensemble for the lacerta data
plot_pres_vs_bg

Plot presences vs background
sample_background_time

Sample background points for SDM analysis for points with a time point.
sdm_spec_maxent

Model specification for a MaxEnt for SDM
sdm_spec_rand_forest

Model specification for a Random Forest for SDM
sample_background

Sample background points for SDM analysis
thin_by_cell_time

Thin point dataset to have 1 observation per raster cell per time slice
predict.repeat_ensemble

Predict for a repeat ensemble set
sample_pseudoabs

Sample pseudo-absence points for SDM analysis
sample_pseudoabs_time

Sample pseudo-absence points for SDM analysis for points with a time point.
calib_class_thresh

Calibrate class thresholds
blockcv2rsample

Convert an object created with blockCV to an rsample object
boyce_cont

Boyce continuous index (BCI)
check_sdm_presence

Check that the column with presences is correctly formatted
check_splits_balance

Check the balance of presences vs pseudoabsences among splits
add_member

Add best member of workflow to a simple ensemble
autoplot.simple_ensemble

Plot the results of a simple ensemble
add_repeat

Add repeat(s) to a repeated ensemble
check_coords_names

Check that we have a valid pair of coordinate names
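
Several of the entries above (tss, tss_max, optim_thresh_tss_max, prob_to_binary) relate to the True Skill Statistic, defined as sensitivity + specificity - 1. The base R sketch below illustrates the idea on made-up data: it converts probabilities to presence/absence at a range of thresholds and picks the threshold with the highest TSS. The helper name tss_at_threshold and the toy vectors are hypothetical, and this is not the package's own implementation.

```r
# Hypothetical helper: TSS of a probability vector cut at a given threshold
tss_at_threshold <- function(truth, prob, threshold) {
  pred <- prob >= threshold                      # predicted presences
  sens <- sum(pred & truth) / sum(truth)         # true positive rate
  spec <- sum(!pred & !truth) / sum(!truth)      # true negative rate
  sens + spec - 1
}

# Toy data: TRUE marks a presence, FALSE a background/pseudo-absence point
truth <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)
prob <- c(0.92, 0.81, 0.43, 0.62, 0.33, 0.12)

# Evaluate TSS over a grid of candidate thresholds
thresholds <- seq(0.1, 0.9, by = 0.1)
tss_values <- vapply(
  thresholds,
  function(t) tss_at_threshold(truth, prob, t),
  numeric(1)
)

# which.max() returns the first threshold reaching the maximum TSS
best_threshold <- thresholds[which.max(tss_values)]
max_tss <- max(tss_values)
```

In a real analysis the probabilities would come from a fitted model or ensemble, and the threshold would be calibrated on the training data rather than on a toy vector.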