R interface to TensorFlow Dataset API

The TensorFlow Dataset API provides various facilities for creating scalable input pipelines for TensorFlow models, including:

  • Reading data from a variety of formats including CSV files and TFRecords files (the standard binary format for TensorFlow training data).

  • Transforming datasets in a variety of ways including mapping arbitrary functions against them.

  • Shuffling, batching, and repeating datasets over a number of epochs.

  • Streaming interface to data for reading arbitrarily large datasets.

  • Because reading and transforming data are TensorFlow graph operations, they execute in C++ and in parallel with model training.

The R interface to TensorFlow datasets provides access to the Dataset API, including high-level convenience functions for easy integration with the keras package.

For documentation on using tfdatasets, see the package website at https://tensorflow.rstudio.com/tools/tfdatasets/.
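A typical pipeline chains several of the functions listed below with the pipe operator. The following is a minimal sketch using the built-in `mtcars` data frame (the buffer, batch, and repeat values are illustrative, not recommendations); it assumes TensorFlow and tfdatasets are installed.

```r
library(tfdatasets)

# Build a dataset from in-memory R data, then shuffle, batch,
# repeat for five epochs, and prefetch one batch ahead.
dataset <- tensor_slices_dataset(mtcars) %>%
  dataset_shuffle(buffer_size = 32) %>%
  dataset_batch(8) %>%
  dataset_repeat(5) %>%
  dataset_prefetch(1)
```

The same chaining style applies to file-based sources such as `make_csv_dataset()` or `tfrecord_dataset()`, with the reader function simply replacing `tensor_slices_dataset()` at the head of the pipeline.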

Install

install.packages('tfdatasets')

Monthly Downloads

2,204

Version

2.17.0

License

Apache License 2.0

Last Published

July 16th, 2024

Functions in tfdatasets (2.17.0)

as_array_iterator

Convert tf_dataset to an iterator that yields R arrays.
as_tf_dataset

Add the tf_dataset class to a dataset
all_nominal

Find all nominal variables.
as_tensor.tensorflow.python.data.ops.dataset_ops.DatasetV2

Get the single element of the dataset.
choose_from_datasets

Creates a dataset that deterministically chooses elements from datasets.
dataset_cache

Caches the elements in this dataset.
all_numeric

Specify all numeric variables.
dataset_interleave

Maps map_func across this dataset, and interleaves the results
dataset_enumerate

Enumerates the elements of this dataset
dataset_unique

A transformation that discards duplicate elements of a Dataset.
dataset_map

Map a function across a dataset.
dataset_concatenate

Creates a dataset by concatenating given dataset with this dataset.
dataset_collect

Collects a dataset
dataset_use_spec

Transform the dataset using the provided spec.
dataset_decode_delim

Transform a dataset with delimited text lines into a dataset with named columns
dataset_padded_batch

Combines consecutive elements of this dataset into padded batches.
dataset_take_while

A transformation that stops dataset iteration based on a predicate.
dataset_prefetch

Creates a Dataset that prefetches elements from this dataset.
hearts

Heart Disease Data Set
dataset_map_and_batch

Fused implementation of dataset_map() and dataset_batch()
dataset_unbatch

Unbatch a dataset
input_fn.tf_dataset

Construct a tfestimators input function from a dataset
layer_input_from_dataset

Creates a list of inputs from a dataset
file_list_dataset

A dataset of all files matching a pattern
fit.FeatureSpec

Fits a feature specification.
dataset_prefetch_to_device

A transformation that prefetches dataset values to the given device
length.tf_dataset

Get Dataset length
reexports

Objects exported from other packages
dataset_options

Get or Set Dataset Options
dataset_prepare

Prepare a dataset for analysis
dataset_window

Combines input elements into a dataset of windows.
delim_record_spec

Specification for reading a record from a text file with delimited values
dataset_shard

Creates a dataset that includes only 1 / num_shards of this dataset.
dataset_batch

Combines consecutive elements of this dataset into batches.
dataset_filter

Filter a dataset by a predicate
sample_from_datasets

Samples elements at random from the given datasets.
dataset_reduce

Reduces the input dataset to a single element.
dataset_rejection_resample

A transformation that resamples a dataset to a target distribution.
dataset_snapshot

Persist the output of a dataset
dataset_take

Creates a dataset with at most count elements from this dataset
dataset_shuffle

Randomly shuffles the elements of this dataset.
step_embedding_column

Creates embeddings columns
dataset_flat_map

Maps map_func across this dataset and flattens the result.
dataset_bucket_by_sequence_length

A transformation that buckets elements in a Dataset by length
dataset_group_by_window

Group windows of elements by key and reduce them
step_indicator_column

Creates Indicator Columns
dataset_repeat

Repeats a dataset count times.
until_out_of_range

Execute code that traverses a dataset until an out of range condition occurs
iterator_get_next

Get next element from iterator
dense_features

Dense Features
with_dataset

Execute code that traverses a dataset
dataset_scan

A transformation that scans a function across an input dataset
dataset_shuffle_and_repeat

Shuffles and repeats a dataset returning a new permutation for each epoch.
fixed_length_record_dataset

A dataset of fixed-length records from one or more binary files.
dataset_skip

Creates a dataset that skips count elements from this dataset
iterator_initializer

An operation that should be run to initialize this iterator.
scaler_standard

Creates an instance of a standard scaler
has_type

Identify the type of the variable.
%>%

Pipe operator
make-iterator

Creates an iterator for enumerating the elements of this dataset.
feature_spec

Creates a feature specification.
next_batch

Tensor(s) for retrieving the next batch from a dataset
random_integer_dataset

Creates a Dataset of pseudorandom values
make_csv_dataset

Reads CSV files into a batched dataset
scaler

List of pre-made scalers
step_numeric_column

Creates a numeric column specification
selectors

Selectors
scaler_min_max

Creates an instance of a min max scaler
sparse_tensor_slices_dataset

Splits each rank-N tf$SparseTensor in this dataset row-wise.
iterator_make_initializer

Create an operation that can be run to initialize this iterator
step_categorical_column_with_identity

Create a categorical column with identity
step_remove_column

Creates a step that can remove columns
sql_record_spec

A dataset consisting of the results from a SQL query
step_categorical_column_with_vocabulary_list

Creates a categorical column specification
iterator_string_handle

String-valued tensor that represents this iterator
step_crossed_column

Creates crosses of categorical columns
text_line_dataset

A dataset comprising lines from one or more text files.
range_dataset

Creates a dataset of a step-separated range of values.
read_files

Read a dataset from a set of files
output_types

Output types and shapes
tensor_slices_dataset

Creates a dataset whose elements are slices of the given tensors.
tfrecord_dataset

A dataset comprising records from one or more TFRecord files.
step_categorical_column_with_vocabulary_file

Creates a categorical column with vocabulary file
zip_datasets

Creates a dataset by zipping together the given datasets.
step_bucketized_column

Creates bucketized columns
step_categorical_column_with_hash_bucket

Creates a categorical column with hash buckets specification
tensors_dataset

Creates a dataset with a single element, comprising the given tensors.
steps

Steps for feature columns specification.
step_shared_embeddings_column

Creates shared embeddings for categorical columns