LSTM cell with layer normalization and recurrent dropout.
layer_norm_lstm_cell(
object,
units,
activation = "tanh",
recurrent_activation = "sigmoid",
use_bias = TRUE,
kernel_initializer = "glorot_uniform",
recurrent_initializer = "orthogonal",
bias_initializer = "zeros",
unit_forget_bias = TRUE,
kernel_regularizer = NULL,
recurrent_regularizer = NULL,
bias_regularizer = NULL,
kernel_constraint = NULL,
recurrent_constraint = NULL,
bias_constraint = NULL,
dropout = 0,
recurrent_dropout = 0,
norm_gamma_initializer = "ones",
norm_beta_initializer = "zeros",
norm_epsilon = 0.001,
...
)
`object`: Model or layer object.
`units`: Positive integer, dimensionality of the output space.
`activation`: Activation function to use. Default: hyperbolic tangent (`tanh`). If you pass `NULL`, no activation is applied (i.e. "linear" activation: `a(x) = x`).
`recurrent_activation`: Activation function to use for the recurrent step. Default: `sigmoid`. If you pass `NULL`, no activation is applied (i.e. "linear" activation: `a(x) = x`).
`use_bias`: Boolean, whether the layer uses a bias vector.
`kernel_initializer`: Initializer for the `kernel` weights matrix, used for the linear transformation of the inputs.
`recurrent_initializer`: Initializer for the `recurrent_kernel` weights matrix, used for the linear transformation of the recurrent state.
`bias_initializer`: Initializer for the bias vector.
`unit_forget_bias`: Boolean. If `TRUE`, add 1 to the bias of the forget gate at initialization. Setting it to `TRUE` also forces `bias_initializer = "zeros"`. This is recommended in [Jozefowicz et al.](http://www.jmlr.org/proceedings/papers/v37/jozefowicz15.pdf).
`kernel_regularizer`: Regularizer function applied to the `kernel` weights matrix.
`recurrent_regularizer`: Regularizer function applied to the `recurrent_kernel` weights matrix.
`bias_regularizer`: Regularizer function applied to the bias vector.
`kernel_constraint`: Constraint function applied to the `kernel` weights matrix.
`recurrent_constraint`: Constraint function applied to the `recurrent_kernel` weights matrix.
`bias_constraint`: Constraint function applied to the bias vector.
`dropout`: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs.
`recurrent_dropout`: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state.
`norm_gamma_initializer`: Initializer for the layer normalization gain (gamma).
`norm_beta_initializer`: Initializer for the layer normalization shift (beta).
`norm_epsilon`: Float, the epsilon value added in the normalization layers for numerical stability.
`...`: Additional keyword arguments passed on to layer creation.
A tensor
This cell adds layer normalization and recurrent dropout to an LSTM unit. The layer normalization implementation follows "Layer Normalization" (Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton, https://arxiv.org/abs/1607.06450) and is applied before the internal nonlinearities. Recurrent dropout follows "Recurrent Dropout without Memory Loss" (Stanislau Semeniuta, Aliaksei Severyn, Erhardt Barth, https://arxiv.org/abs/1603.05118).
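A minimal usage sketch, under the assumption that the function is provided by the `tfaddons` package, that the cell can be created standalone (with the `object` argument omitted), and that it is wrapped in a recurrent layer via `keras::layer_rnn()`, mirroring the Python pattern of passing `tfa.rnn.LayerNormLSTMCell` to `tf.keras.layers.RNN`. The unit counts and input shape are illustrative only.

```r
library(keras)
library(tfaddons)  # assumed package providing layer_norm_lstm_cell()

# Wrap the layer-normalized LSTM cell in a standard RNN layer.
model <- keras_model_sequential() %>%
  layer_rnn(
    cell = layer_norm_lstm_cell(
      units = 64,
      dropout = 0.2,            # dropout on the input transformation
      recurrent_dropout = 0.2,  # dropout on the recurrent transformation
      norm_epsilon = 1e-3
    ),
    input_shape = c(10, 8)      # (timesteps, features); illustrative values
  ) %>%
  layer_dense(units = 1)

model %>% compile(optimizer = "adam", loss = "mse")
```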