Dataset class
All datasets that represent a map from keys to data samples should subclass this
class. All subclasses should override the .getitem() method, which fetches
the data sample for a given key. Subclasses may also override .length(),
which is expected to return the size of the dataset (e.g. the number of
samples). Many sampler implementations and the default options of
dataloader() rely on it.
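As a minimal sketch of the contract described above, the following defines a dataset with both .getitem() and .length(). The names toy_dataset, x, y and n are hypothetical and chosen only for illustration; the methods are called directly so the example does not depend on any other behavior.

```r
library(torch)

# Hypothetical dataset: key i maps to the pair (x_i, y_i) with y_i = 2 * x_i.
toy_dataset <- dataset(
  name = "toy_dataset",
  initialize = function(n = 10) {
    self$n <- n
    self$x <- torch_tensor(as.numeric(seq_len(n)))
    self$y <- self$x * 2
  },
  # Fetch a single sample for the (1-based) key i.
  .getitem = function(i) {
    list(x = self$x[i], y = self$y[i])
  },
  # Report the number of samples in the dataset.
  .length = function() {
    self$n
  }
)

ds <- toy_dataset(n = 10)
n_samples <- ds$.length()
item <- ds$.getitem(3)
```

Here initialize() is an ordinary public method passed through `...`; it stores the data that .getitem() later indexes.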
dataset(
name = NULL,
inherit = Dataset,
...,
private = NULL,
active = NULL,
parent_env = parent.frame()
)
name: a name for the dataset. It is also used as the class of the dataset.
inherit: a dataset to optionally inherit from when creating a new dataset.
...: public methods for the dataset class.
private: passed to R6::R6Class().
active: passed to R6::R6Class().
parent_env: an environment to use as the parent of newly-created objects.
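The sketch below illustrates how the name, inherit and private arguments might be combined. The datasets base_ds and squared_ds and the private field scale are hypothetical, and the example assumes standard R6 semantics for super and private, since these arguments are passed to R6::R6Class().

```r
library(torch)

# Hypothetical base dataset: key i maps to i itself.
base_ds <- dataset(
  name = "base_ds",
  initialize = function(n) {
    self$n <- n
  },
  .getitem = function(i) {
    i
  },
  .length = function() {
    self$n
  }
)

# Hypothetical child dataset: reuses initialize() and .length() from
# base_ds and only overrides .getitem().
squared_ds <- dataset(
  name = "squared_ds",
  inherit = base_ds,
  .getitem = function(i) {
    # `scale` is a private field supplied through the `private` argument.
    private$scale * super$.getitem(i)^2
  },
  private = list(scale = 10)
)

ds <- squared_ds(n = 5)
n_samples <- ds$.length()
value <- ds$.getitem(3)
has_class <- inherits(ds, "squared_ds")
```

Because name is also used as the class, inherits() can be used to check which kind of dataset an object is.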
By default, datasets are iterated by returning each observation/item individually.
Sometimes an optimized implementation can fetch a whole batch of observations
at once (e.g., subsetting a tensor by multiple indexes at once is faster than
subsetting once for each index). In this case you can implement a .getbatch method,
which the dataloader will use instead of .getitem when getting a batch of
observations.
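A rough sketch of a dataset that provides .getbatch alongside .getitem is shown below. The name batched_ds and the stored matrix are hypothetical. The example calls the methods directly and checks that one batched subsetting call returns the same rows as stacking individual items; it does not exercise the dataloader itself.

```r
library(torch)

# Hypothetical dataset backed by a single n x 2 tensor.
batched_ds <- dataset(
  name = "batched_ds",
  initialize = function(n = 8) {
    self$x <- torch_tensor(matrix(as.numeric(seq_len(n * 2)), nrow = n))
  },
  # One observation: a single row of the tensor.
  .getitem = function(i) {
    self$x[i, ]
  },
  # A batch of observations: all requested rows in one subsetting call.
  .getbatch = function(index) {
    self$x[index, ]
  },
  .length = function() {
    self$x$size(1)
  }
)

ds <- batched_ds(n = 8)
idx <- c(1, 3, 5)

# Batched access versus item-by-item access followed by stacking.
batch <- ds$.getbatch(idx)
stacked <- torch_stack(lapply(idx, function(i) ds$.getitem(i)))
same <- torch_equal(batch, stacked)
```

Keeping .getitem defined as well means the dataset still supports item-by-item access where a batch method is not used.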