CAST (version 1.0.3)

CreateSpacetimeFolds: Create Space-time Folds

Description

Create spatial, temporal, or spatio-temporal folds for cross-validation based on predefined groups.

Usage

CreateSpacetimeFolds(
  x,
  spacevar = NA,
  timevar = NA,
  k = 10,
  class = NA,
  seed = sample(1:1000, 1)
)

Value

A list that contains two lists, one with the row indices for model training and one with the row indices for model validation. They can be passed directly as "index" and "indexOut" to caret's trainControl function.
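As a minimal sketch of how the returned list can be handed to caret, the example below builds leave-location-out folds and passes them to trainControl. The data frame dat, its columns (station, temp, precip, yield) and the choice of a linear model are hypothetical and serve only as an illustration.

```r
library(CAST)   # CreateSpacetimeFolds
library(caret)  # trainControl, train

# Hypothetical data: 8 stations with 20 observations each
set.seed(1)
dat <- data.frame(
  station = rep(paste0("S", 1:8), each = 20),
  temp    = rnorm(160),
  precip  = rnorm(160)
)
dat$yield <- 2 * dat$temp - dat$precip + rnorm(160, sd = 0.5)

# 4-fold leave-location-out: all rows of a station fall into the same fold
folds <- CreateSpacetimeFolds(dat, spacevar = "station", k = 4, seed = 42)

# Hand the training and validation indices to caret
ctrl <- trainControl(method = "cv",
                     index = folds$index,
                     indexOut = folds$indexOut)

# Fit a simple linear model; each resample is validated on held-out stations
model <- train(dat[, c("temp", "precip")], dat$yield,
               method = "lm", trControl = ctrl)
print(model$resample)
```

Fixing seed makes the fold assignment reproducible between runs.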

Arguments

x

data.frame containing spatio-temporal data

spacevar

Character indicating which column of x identifies the spatial units (e.g. ID of weather stations)

timevar

Character indicating which column of x identifies the temporal units (e.g. the day of the year)

k

numeric. Number of folds. To perform a leave-one-location-out or leave-one-time-step-out cross-validation (i.e. when only spacevar or only timevar is given and the other is NA), set k to the number of unique spatial or temporal units.

class

Character indicating which column of x identifies a class unit (e.g. land cover)

seed

numeric. Seed for the random assignment of units to folds. See ?set.seed

Author

Hanna Meyer

Details

The function creates train and test sets by taking (spatial and/or temporal) groups into account. In contrast to nndm, it requires that the groups are already defined (e.g. spatial clusters, blocks, or temporal units). Using "class" is helpful when the data are clustered in space and are categorical. This is the case, for example, for land cover classifications in which the training data come as training polygons. Here the data should be split so that entire polygons are held back (spacevar="polygonID") while the distribution of classes stays similar in each fold (class="LUC").
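The polygon case described above can be sketched as follows. The data frame polys, its sizes and the class labels are made up for illustration; only the column names polygonID and LUC follow the text.

```r
library(CAST)

# Hypothetical training data: 12 polygons, 3 land cover classes,
# 4 polygons per class, 5 pixels per polygon
polys <- data.frame(
  polygonID = rep(1:12, each = 5),
  LUC       = rep(rep(c("forest", "grass", "water"), each = 4), each = 5)
)

# Hold back entire polygons while balancing classes across folds
folds <- CreateSpacetimeFolds(polys, spacevar = "polygonID",
                              class = "LUC", k = 4, seed = 1)

# Check that no polygon is split between training and validation
for (i in seq_along(folds$index)) {
  train_ids <- unique(polys$polygonID[folds$index[[i]]])
  test_ids  <- unique(polys$polygonID[folds$indexOut[[i]]])
  stopifnot(length(intersect(train_ids, test_ids)) == 0)
}

# Inspect the class counts in each held-out fold
lapply(folds$indexOut, function(i) table(polys$LUC[i]))
```

Comparing the class tables across folds is a quick way to see whether the class distribution is reasonably balanced for a given k.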

References

Meyer, H., Reudenbach, C., Hengl, T., Katurji, M., Nauß, T. (2018): Improving performance of spatio-temporal machine learning models using forward feature selection and target-oriented validation. Environmental Modelling & Software 101: 1-9.

See Also

trainControl, ffs, nndm, geodist

Examples

if (FALSE) {
library(CAST)
data(cookfarm)
### Prepare for 10-fold Leave-Location-and-Time-Out cross validation
indices <- CreateSpacetimeFolds(cookfarm, spacevar = "SOURCEID", timevar = "Date")
str(indices)
### Prepare for 10-fold Leave-Location-Out cross validation
indices <- CreateSpacetimeFolds(cookfarm, spacevar = "SOURCEID")
str(indices)
### Prepare for Leave-One-Location-Out cross validation
### (k equals the number of unique locations)
indices <- CreateSpacetimeFolds(cookfarm, spacevar = "SOURCEID",
    k = length(unique(cookfarm$SOURCEID)))
str(indices)
}