For a step-by-step description of the algorithm, see this tutorial.
layer_lstm(object, units, activation = "tanh",
recurrent_activation = "hard_sigmoid", use_bias = TRUE,
return_sequences = FALSE, go_backwards = FALSE, stateful = FALSE,
unroll = FALSE, implementation = 0L,
kernel_initializer = "glorot_uniform",
recurrent_initializer = "orthogonal", bias_initializer = "zeros",
unit_forget_bias = TRUE, kernel_regularizer = NULL,
recurrent_regularizer = NULL, bias_regularizer = NULL,
activity_regularizer = NULL, kernel_constraint = NULL,
recurrent_constraint = NULL, bias_constraint = NULL, dropout = 0,
recurrent_dropout = 0, input_shape = NULL, batch_input_shape = NULL,
batch_size = NULL, dtype = NULL, name = NULL, trainable = NULL,
weights = NULL)
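A minimal usage sketch (the shapes, unit counts, and compilation settings below are illustrative placeholders, not values implied by the signature above):

library(keras)

# Sequential model: one LSTM layer followed by a dense output layer.
# input_shape = c(10, 16) means sequences of 10 timesteps with 16 features each.
model <- keras_model_sequential() %>%
  layer_lstm(units = 32, input_shape = c(10, 16)) %>%
  layer_dense(units = 1, activation = "sigmoid")

# Compile with an arbitrary optimizer/loss for a binary target.
model %>% compile(
  optimizer = "rmsprop",
  loss = "binary_crossentropy",
  metrics = "accuracy"
)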
object: Model or layer object.
units: Positive integer, dimensionality of the output space.
activation: Activation function to use. If you pass NULL, no activation is applied (i.e. "linear" activation: a(x) = x).
recurrent_activation: Activation function to use for the recurrent step.
use_bias: Boolean, whether the layer uses a bias vector.
return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence (see the stacked-LSTM sketch after this argument list).
go_backwards: Boolean (default FALSE). If TRUE, process the input sequence backwards and return the reversed sequence.
stateful: Boolean (default FALSE). If TRUE, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch (see the sketch after this argument list).
unroll: Boolean (default FALSE). If TRUE, the network will be unrolled, else a symbolic loop will be used. Unrolling can speed up an RNN, although it tends to be more memory-intensive. Unrolling is only suitable for short sequences.
implementation: One of 0, 1, or 2. If set to 0, the RNN will use an implementation that uses fewer, larger matrix products, thus running faster on CPU but consuming more memory. If set to 1, the RNN will use more matrix products, but smaller ones, thus running slower (though it may actually be faster on GPU) while consuming less memory. If set to 2 (LSTM/GRU only), the RNN will combine the input gate, the forget gate and the output gate into a single matrix, enabling more time-efficient parallelization on the GPU.
kernel_initializer: Initializer for the kernel weights matrix, used for the linear transformation of the inputs.
recurrent_initializer: Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state.
bias_initializer: Initializer for the bias vector.
unit_forget_bias: Boolean. If TRUE, add 1 to the bias of the forget gate at initialization. Setting it to TRUE will also force bias_initializer = "zeros". This is recommended in Jozefowicz et al.
kernel_regularizer: Regularizer function applied to the kernel weights matrix.
recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix.
bias_regularizer: Regularizer function applied to the bias vector.
activity_regularizer: Regularizer function applied to the output of the layer (its "activation").
kernel_constraint: Constraint function applied to the kernel weights matrix.
recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix.
bias_constraint: Constraint function applied to the bias vector.
dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs.
recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state.
input_shape: Dimensionality of the input (integer) not including the samples axis. This argument is required when using this layer as the first layer in a model.
batch_input_shape: Shape, including the batch size. For instance, batch_input_shape = c(10, 32) indicates that the expected input will be batches of 10 32-dimensional vectors; batch_input_shape = list(NULL, 32) indicates batches of an arbitrary number of 32-dimensional vectors.
batch_size: Fixed batch size for the layer.
dtype: The data type expected by the input, as a string (float32, float64, int32, ...).
name: An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.
trainable: Whether the layer weights will be updated during training.
weights: Initial weights for the layer.
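As noted for return_sequences and stateful above, the following sketch shows both patterns; the unit counts, shapes, dropout rates, and batch size are assumed placeholder values:

library(keras)

# Stacked LSTMs: the first layer sets return_sequences = TRUE so the second
# LSTM receives one output vector per timestep. dropout and recurrent_dropout
# apply dropout to the input and recurrent transformations respectively.
stacked <- keras_model_sequential() %>%
  layer_lstm(units = 64, return_sequences = TRUE,
             dropout = 0.2, recurrent_dropout = 0.2,
             input_shape = c(20, 8)) %>%
  layer_lstm(units = 32) %>%
  layer_dense(units = 1)

# Stateful LSTM: a fixed batch size must be given via
# batch_input_shape = c(batch_size, timesteps, features). The final states of
# one batch become the initial states of the next; clear them between
# independent sequences with reset_states(stateful_model).
stateful_model <- keras_model_sequential() %>%
  layer_lstm(units = 32, stateful = TRUE, batch_input_shape = c(16, 20, 8)) %>%
  layer_dense(units = 1)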
Other recurrent layers: layer_gru, layer_simple_rnn