CAST: Caret Applications for Spatio-Temporal models

Supporting functionality to run 'caret' with spatial or spatio-temporal data. 'caret' is a frequently used package for model training and prediction using machine learning. CAST includes functions to improve spatial or spatio-temporal modelling tasks using 'caret'. To decrease spatial overfitting and improve model performance, the package implements a forward feature selection that selects suitable predictor variables according to their contribution to spatial or spatio-temporal model performance. CAST further includes functionality to estimate the (spatial) area of applicability of prediction models.

Note: The development version of CAST can be found at https://github.com/HannaMeyer/CAST. The CRAN version can be found at https://CRAN.R-project.org/package=CAST

The figure shows a simple spatial prediction mapping workflow and indicates which CAST function supports each step.
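The workflow can be sketched in R roughly as follows. This is a minimal, hedged example rather than the package authors' reference workflow: the data are synthetic, the column names (site, x1, x2, noise, y) are hypothetical, and a linear model (method = "lm") is an assumption chosen only to keep dependencies small. The CAST steps run only if 'CAST' and 'caret' are installed.

```r
# Synthetic training data: 10 hypothetical sites with 12 samples each.
set.seed(42)
n_sites  <- 10
per_site <- 12
dat <- data.frame(
  site  = rep(paste0("site_", seq_len(n_sites)), each = per_site),
  x1    = rnorm(n_sites * per_site),
  x2    = rnorm(n_sites * per_site),
  noise = rnorm(n_sites * per_site)   # predictor unrelated to the response
)
dat$y <- 2 * dat$x1 - dat$x2 + rnorm(nrow(dat), sd = 0.3)

# New data to predict for; x1 deliberately extends beyond the training range.
newdat <- data.frame(
  x1    = seq(-6, 6, length.out = 25),
  x2    = 0,
  noise = 0
)

if (requireNamespace("CAST", quietly = TRUE) &&
    requireNamespace("caret", quietly = TRUE)) {

  # Step 1: spatial cross-validation folds that hold out whole sites.
  folds <- CAST::CreateSpacetimeFolds(dat, spacevar = "site", k = 5)
  ctrl  <- caret::trainControl(method   = "cv",
                               index    = folds$index,
                               indexOut = folds$indexOut)

  # Step 2: forward feature selection evaluated on those folds.
  ffs_model <- CAST::ffs(predictors = dat[, c("x1", "x2", "noise")],
                         response   = dat$y,
                         method     = "lm",
                         metric     = "RMSE",
                         trControl  = ctrl,
                         verbose    = FALSE)
  print(ffs_model$selectedvars)

  # Step 3: area of applicability of the selected model for the new data.
  AOA <- CAST::aoa(newdat, ffs_model)
  print(table(AOA$AOA))
}
```

Holding out whole sites in step 1 is what makes the error estimate in step 2 reflect prediction at new locations rather than at already sampled ones.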

Package Website

https://hannameyer.github.io/CAST/

Tutorials

https://www.youtube.com/watch?v=mkHlmYEzsVQ

Scientific documentation of the methods

Spatial cross-validation

  • Milà, C., Mateu, J., Pebesma, E., Meyer, H. (2022): Nearest Neighbour Distance Matching Leave-One-Out Cross-Validation for map validation. Methods in Ecology and Evolution 00, 1–13. https://doi.org/10.1111/2041-210X.13851

  • Linnenbrink, J., Milà, C., Ludwig, M., Meyer, H. (2023): kNNDM: k-fold Nearest Neighbour Distance Matching Cross-Validation for map accuracy estimation. EGUsphere [preprint]. https://doi.org/10.5194/egusphere-2023-1308

  • Meyer, H., Reudenbach, C., Hengl, T., Katurji, M., Nauss, T. (2018): Improving performance of spatio-temporal machine learning models using forward feature selection and target-oriented validation. Environmental Modelling & Software, 101, 1-9. https://doi.org/10.1016/j.envsoft.2017.12.001

Spatial variable selection

  • Meyer, H., Reudenbach, C., Hengl, T., Katurji, M., Nauss, T. (2018): Improving performance of spatio-temporal machine learning models using forward feature selection and target-oriented validation. Environmental Modelling & Software, 101, 1-9. https://doi.org/10.1016/j.envsoft.2017.12.001

  • Meyer, H., Reudenbach, C., Wöllauer, S., Nauss, T. (2019): Importance of spatial predictor variable selection in machine learning applications - Moving from data reproduction to spatial prediction. Ecological Modelling. 411. https://doi.org/10.1016/j.ecolmodel.2019.108815

Area of applicability

  • Meyer, H., Pebesma, E. (2021): Predicting into unknown space? Estimating the area of applicability of spatial prediction models. Methods in Ecology and Evolution, 12, 1620–1633. https://doi.org/10.1111/2041-210X.13650

  • Schumacher, F., Knoth, C., Ludwig, M., Meyer, H. (2024): Estimation of local training data point densities to support the assessment of spatial prediction uncertainty. EGUsphere. https://doi.org/10.5194/egusphere-2024-2730

Applications and use cases

  • Meyer, H., Pebesma, E. (2022): Machine learning-based global maps of ecological variables and the challenge of assessing them. Nature Communications, 13. https://www.nature.com/articles/s41467-022-29838-9

  • Ludwig, M., Moreno-Martinez, A., Hoelzel, N., Pebesma, E., Meyer, H. (2023): Assessing and improving the transferability of current global spatial prediction models. Global Ecology and Biogeography. https://doi.org/10.1111/geb.13635

  • Milà, C., Ludwig, M., Pebesma, E., Tonne, C., Meyer, H. (2024): Random forests with spatial proxies for environmental modelling: opportunities and pitfalls. EGUsphere [preprint]. https://doi.org/10.5194/egusphere-2024-138

Install

install.packages('CAST')

Monthly Downloads

950

Version

1.0.3

License

GPL (>= 2)

Maintainer

Hanna Meyer

Last Published

January 9th, 2025

Functions in CAST (1.0.3)

  • knndm: K-fold Nearest Neighbour Distance Matching
  • print: Print CAST classes
  • plot: Plot CAST classes
  • splotdata: sPlotOpen Data of Species Richness
  • normalize_DI: Normalize DI values
  • trainDI: Calculate Dissimilarity Index of training data
  • nndm: Nearest Neighbour Distance Matching (NNDM) algorithm
  • ffs: Forward feature selection
  • global_validation: Evaluate 'global' cross-validation
  • aoa: Area of Applicability
  • clustered_sample: Clustered samples simulation
  • CAST: 'caret' Applications for Spatial-Temporal Models
  • errorProfiles: Model and inspect the relationship between the prediction error and measures of dissimilarities and distances
  • bss: Best subset feature selection
  • CreateSpacetimeFolds: Create Space-time Folds
  • geodist: Calculate euclidean nearest neighbor distances in geographic space or feature space
  • cookfarm: Cookfarm soil logger data