tfaddons

R interface to useful extra functionality for TensorFlow 2.x by SIG-addons

The tfaddons package provides R wrappers to TensorFlow Addons.

TensorFlow Addons is a repository of contributions that conform to well-established API patterns but implement new functionality not available in core TensorFlow. TensorFlow natively supports a large number of operators, layers, metrics, losses, and optimizers. However, in a fast-moving field like ML, there are many interesting new developments that cannot be integrated into core TensorFlow (because their broad applicability is not yet clear, or because they are mostly used by a smaller subset of the community).

The addons provide the following feature groups, all compatible with the keras library (a naming-convention sketch follows the list).

  • activations
  • callbacks
  • image
  • layers
  • losses
  • metrics
  • optimizers
  • rnn
  • seq2seq
  • text
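
Every wrapper in tfaddons carries the prefix of its group, so functions from the same family are easy to discover. A few illustrative examples (all of the functions named below are part of the package; the grouping comments are only for orientation):

library(tfaddons)

# activations -> activation_gelu(), activation_mish()
# layers      -> layer_group_normalization(), layer_weight_normalization()
# losses      -> loss_triplet_semihard(), loss_sparsemax()
# metrics     -> metric_cohen_kappa(), metric_fbetascore()
# optimizers  -> optimizer_radam(), optimizer_lazy_adam()
# callbacks   -> callback_time_stopping(), callback_tqdm_progress_bar()
# image ops   -> img_rotate(), img_random_cutout()
# seq2seq     -> decoder_beam_search(), attention_bahdanau()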

Installation

Requirements:

  • TensorFlow 2.x

Install the development version from GitHub:

devtools::install_github('henry090/tfaddons')

Then install the Python module tensorflow-addons:

tfaddons::install_tfaddons()
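
To confirm that the Python module is available after installation, you can query its version (a minimal check; it assumes tfaddons_version() needs no arguments):

library(tfaddons)

# prints the version of the installed tensorflow-addons Python module
tfaddons_version()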

Usage: the basics

Here's how to build a sequential model with keras using additional features from the tfaddons package.

Import and prepare the MNIST dataset.

library(keras)
library(tfaddons)
# the tensorflow package provides the `tf` object used further below
library(tensorflow)

mnist = dataset_mnist()

x_train <- mnist$train$x
y_train <- mnist$train$y

# reshape to (samples, 28, 28, 1)
x_train <- array_reshape(x_train, c(nrow(x_train), 28, 28, 1))

# scale the grayscale pixel values into the [0, 1] range
x_train <- x_train / 255

y_train <- to_categorical(y_train, 10)

Using the Sequential API, define the model architecture.

# Build a sequential model
model = keras_model_sequential() %>% 
  layer_conv_2d(filters = 10, kernel_size = c(3, 3), input_shape = c(28, 28, 1),
                # apply the GELU activation
                activation = activation_gelu) %>% 
  # apply a group normalization layer
  layer_group_normalization(groups = 5, axis = 3) %>% 
  layer_flatten() %>% 
  layer_dense(10, activation = 'softmax')

# Compile
model %>% compile(
  # apply the Rectified Adam optimizer
  optimizer = optimizer_radam(),
  # apply the sparsemax loss
  loss = loss_sparsemax(),
  # track Cohen's kappa as the metric
  metrics = metric_cohen_kappa(10)
)

Train the Keras model.

model %>% fit(
  x_train, y_train,
  batch_size = 128,
  epochs = 1,
  validation_split = 0.2
)
Train on 48000 samples, validate on 12000 samples
48000/48000 [==============================] - 24s 510us/sample - loss: 0.1193 - cohen_kappa: 0.8074 - 
val_loss: 0.0583 - val_cohen_kappa: 0.9104
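
After training, the model can be evaluated on the held-out test split, prepared in the same way as the training data. A minimal sketch:

# prepare the test split exactly like the training split
x_test <- array_reshape(mnist$test$x, c(nrow(mnist$test$x), 28, 28, 1)) / 255
y_test <- to_categorical(mnist$test$y, 10)

# report the loss and Cohen's kappa on unseen data
model %>% evaluate(x_test, y_test, batch_size = 128)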

Let's apply weight normalization, a simple reparameterization technique that accelerates the training of deep neural networks:

Note: we only change the model architecture; compilation and training stay the same as before (repeated after the model definition below).

# Build a sequential model
model = keras_model_sequential() %>% 
  layer_weight_normalization(input_shape = c(28L,28L,1L),
                             layer_conv_2d(filters = 10, kernel_size = c(3,3))) %>% 
  layer_flatten() %>% 
  layer_weight_normalization(layer_dense(units = 10, activation='softmax'))
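
Compilation and training are identical to the previous example; they are repeated here so the snippet is self-contained:

model %>% compile(
  optimizer = optimizer_radam(),
  loss = loss_sparsemax(),
  metrics = metric_cohen_kappa(10)
)

model %>% fit(
  x_train, y_train,
  batch_size = 128,
  epochs = 1,
  validation_split = 0.2
)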
Train on 48000 samples, validate on 12000 samples
48000/48000 [==============================] - 12s 253us/sample - loss: 0.1276 - cohen_kappa: 0.7920 - 
val_loss: 0.0646 - val_cohen_kappa: 0.9044

The epoch now finishes in about 12 seconds, whereas without weight normalization a single epoch took 24 seconds.

Callbacks

Training can be stopped after a fixed amount of time by setting the seconds parameter of the callback_time_stopping() function:

model %>% fit(
  x_train, y_train,
  batch_size = 128,
  epochs = 4,
  validation_split = 0.2,
  verbose = 0,
  callbacks = callback_time_stopping(seconds = 6, verbose = 1)
)
Timed stopping at epoch 1 after training for 0:00:06
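
Addons also ship other callbacks, for example a TQDM progress bar. A minimal sketch, assuming the default arguments of callback_tqdm_progress_bar() are sufficient (the tqdm Python package must be installed):

model %>% fit(
  x_train, y_train,
  batch_size = 128,
  epochs = 2,
  # silence the built-in progress output and use the TQDM bar instead
  verbose = 0,
  callbacks = callback_tqdm_progress_bar()
)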

Losses

Triplet loss can be applied in the following way.

First, create a Keras model:

model = keras_model_sequential() %>% 
  layer_conv_2d(filters = 64, kernel_size = 2, padding = 'same', input_shape = c(28, 28, 1)) %>% 
  layer_max_pooling_2d(pool_size = 2) %>% 
  layer_flatten() %>% 
  # embedding layer without an activation
  layer_dense(256, activation = NULL) %>% 
  # L2-normalise the embeddings so they lie on the unit hypersphere
  layer_lambda(f = function(x) tf$math$l2_normalize(x, axis = 1L))

model %>% compile(
  optimizer = optimizer_lazy_adam(),
  # apply triplet semihard loss
  loss = loss_triplet_semihard())

With the tfdatasets package we can build the input pipeline and then fit the model.

library(tfdatasets)

# triplet loss expects integer class labels, and the images should stay as normalized floats
train = tensor_slices_dataset(list(tf$cast(x_train, 'float32'),
                                   tf$cast(mnist$train$y, 'int64'))) %>% 
  dataset_shuffle(1024) %>% dataset_batch(32)
  
# fit
model %>% fit(
  train,
  epochs = 1
)
Train for 1875 steps
1875/1875 [==============================] - 74s 39ms/step - loss: 0.4227
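
The trained network maps each image to a 256-dimensional, L2-normalised embedding. A minimal sketch of extracting embeddings for a handful of training images (the slice of x_train is only for illustration):

# compute embeddings for the first 100 images; the result is a 100 x 256 matrix
embeddings <- predict(model, x_train[1:100, , , , drop = FALSE])
dim(embeddings)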

Install: install.packages('tfaddons')
Monthly Downloads: 120
Version: 0.10.0
License: Apache License 2.0

Maintainer: Turgut Abdullayev
Last Published: June 2nd, 2020

Functions in tfaddons (0.10.0)

activation_softshrink

Softshrink
activation_hardshrink

Hardshrink
activation_lisht

Lisht
activation_gelu

Gelu
activation_rrelu

Rrelu
activation_tanhshrink

Tanhshrink
attention_bahdanau

Bahdanau Attention
activation_sparsemax

Sparsemax
attention_bahdanau_monotonic

Bahdanau Monotonic Attention
attention_wrapper_state

Attention Wrapper State
attention_luong

Implements Luong-style (multiplicative) attention scoring.
callback_average_model_checkpoint

Average Model Checkpoint
callback_time_stopping

Time Stopping
crf_decode

CRF decode
img_get_ndims

Get ndims
decode_dynamic

Dynamic decode
decoder_basic_output

Basic decoder output
crf_binary_score

CRF binary score
decoder

An RNN Decoder abstract interface object.
img_transform

Transform
img_interpolate_bilinear

Interpolate bilinear
img_to_4D

To 4D image
attention_luong_monotonic

Monotonic attention mechanism with Luong-style energy function.
crf_forward

CRF forward
crf_decode_backward

CRF decode backward
crf_log_likelihood

CRF log likelihood
decoder_base

Base Decoder
decoder_basic

Basic Decoder
callback_tqdm_progress_bar

TQDM Progress Bar
hardmax

Hardmax
attention_wrapper

Attention Wrapper
layer_instance_normalization

Instance normalization layer
attention_monotonic

Monotonic attention
crf_sequence_score

CRF sequence score
img_equalize

Equalize
crf_log_norm

CRF log norm
extend_with_decoupled_weight_decay

Factory function returning an optimizer class with decoupled weight decay
img_euclidean_dist_transform

Euclidean dist transform
decoder_final_beam_search_output

Final Beam Search Decoder Output
crf_multitag_sequence_score

CRF multitag sequence score
gather_tree

Gather tree
img_cutout

Cutout
gather_tree_from_array

Gather tree from array
crf_unary_score

CRF unary score
layer_maxout

Maxout layer
img_dense_image_warp

Dense image warp
img_random_cutout

Random cutout
decoder_beam_search

BeamSearch sampling decoder
tfaddons_version

Version of TensorFlow SIG Addons
loss_lifted_struct

Lifted structured loss
img_connected_components

Connected components
loss_giou

Implements the GIoU loss function.
loss_hamming

Hamming loss
img_compose_transforms

Compose transforms
crf_decode_forward

CRF decode forward
img_sharpness

Sharpness
img_shear_x

Shear x-axis
img_median_filter2d

Median filter2d
img_mean_filter2d

Mean filter2d
img_random_hsv_in_yiq

Random hsv in yiq
layer_correlation_cost

Correlation Cost Layer.
layer_nas_cell

Neural Architecture Search (NAS) recurrent network cell.
layer_activation_gelu

Gaussian Error Linear Unit
loss_sequence

Weighted cross-entropy loss for a sequence of logits.
layer_multi_head_attention

Keras-based multi head attention layer
loss_sigmoid_focal_crossentropy

Sigmoid focal crossentropy loss
skip_gram_sample_with_text_vocab

Skip gram sample with text vocab
optimizer_conditional_gradient

Conditional Gradient
optimizer_swa

Stochastic Weight Averaging
optimizer_decay_adamw

Optimizer that implements the Adam algorithm with weight decay
decoder_beam_search_output

Beam Search Decoder Output
img_translate

Translate
decoder_beam_search_state

Beam Search Decoder State
img_adjust_hsv_in_yiq

Adjust hsv in yiq
img_from_4D

From 4D image
img_flat_transforms_to_matrices

Flat transforms to matrices
optimizer_yogi

Yogi
loss_npairs

Npairs loss
img_resampler

Resampler
img_translate_xy

Translate xy dims
img_shear_y

Shear y-axis
img_rotate

Rotate
layer_group_normalization

Group normalization layer
layer_filter_response_normalization

FilterResponseNormalization
layer_norm_lstm_cell

LSTM cell with layer normalization and recurrent dropout.
layer_poincare_normalize

Project into the Poincare ball with norm <= 1.0 - epsilon
sample_bernoulli

Bernoulli sample
sample_categorical

Categorical sample
loss_sparsemax

Sparsemax loss
loss_npairs_multilabel

Npairs multilabel loss
img_sparse_image_warp

Sparse image warp
img_unwrap

Unwrap
viterbi_decode

Viterbi decode
img_translations_to_projective_transforms

Translations to projective transforms
img_interpolate_spline

Interpolate spline
img_angles_to_projective_transforms

Angles to projective transforms
img_matrices_to_flat_transforms

Matrices to flat transforms
loss_triplet_semihard

Triplet semihard loss
layer_sparsemax

Sparsemax activation function
img_blend

Blend
parse_time

Parse time
loss_triplet_hard

Triplet hard loss
loss_pinball

Pinball loss
layer_weight_normalization

Weight Normalization layer
metric_multilabel_confusion_matrix

MultiLabelConfusionMatrix
metric_mcc

MatthewsCorrelationCoefficient
metric_cohen_kappa

Computes Kappa score between two raters
register_custom_kernels

Register custom kernels
register_all

Register all
optimizer_lamb

Layer-wise Adaptive Moments
optimizer_decay_sgdw

Optimizer that implements the Momentum algorithm with weight_decay
img_wrap

Wrap
sampler_greedy_embedding

Greedy Embedding Sampler
sampler_inference

Inference Sampler
safe_cumprod

Safe cumprod
skip_gram_sample

Skip gram sample
sampler_sample_embedding

Sample Embedding Sampler
install_tfaddons

Install TensorFlow SIG Addons
lookahead_mechanism

Lookahead mechanism
optimizer_novograd

NovoGrad
optimizer_radam

Rectified Adam (a.k.a. RAdam)
metric_rsquare

RSquare (coefficient of determination). It measures how close the data are to the fitted regression line. The best score is 1.0, meaning the predictors perfectly account for variation in the target; 0.0 means they account for none of it, and the score can be negative for models that perform worse than a constant baseline.
reexports

Objects exported from other packages
metric_fbetascore

FBetaScore
loss_contrastive

Contrastive loss
metric_hamming_distance

Hamming distance
sampler

Sampler
sampler_scheduled_output_training

Scheduled Output Training Sampler
sampler_scheduled_embedding_training

A training sampler that adds scheduled sampling
sampler_training

A Sampler for use during training.
sampler_custom

Base abstract class that allows the user to customize sampling.
metrics_f1score

F1Score
optimizer_lazy_adam

Lazy Adam
optimizer_moving_average

Moving Average
tile_batch

Tile batch
register_keras_objects

Register keras objects
activation_mish

Mish