layer_einsum_dense

A layer that uses einsum as the backing computation. This layer can perform einsum calculations of arbitrary dimensionality.
layer_einsum_dense(
object,
equation,
output_shape,
activation = NULL,
bias_axes = NULL,
kernel_initializer = "glorot_uniform",
bias_initializer = "zeros",
kernel_regularizer = NULL,
bias_regularizer = NULL,
kernel_constraint = NULL,
bias_constraint = NULL,
lora_rank = NULL,
...
)
The return value depends on the value provided for the first argument. If object is:

- a keras_model_sequential(), then the layer is added to the sequential model (which is modified in place). To enable piping, the sequential model is also returned, invisibly.
- a keras_input(), then the output tensor from calling layer(input) is returned.
- NULL or missing, then a Layer instance is returned.
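A minimal sketch of the three calling conventions (the shapes are illustrative):

# 1. NULL or missing: a standalone Layer instance is returned
layer <- layer_einsum_dense(equation = "ab,bc->ac", output_shape = 64)

# 2. A keras_input(): the output tensor from calling layer(input) is returned
input <- keras_input(shape = c(32))
output <- input |> layer_einsum_dense("ab,bc->ac", output_shape = 64)

# 3. A keras_model_sequential(): the layer is added to the model in place
model <- keras_model_sequential(input_shape = c(32)) |>
  layer_einsum_dense("ab,bc->ac", output_shape = 64)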
object: Object to compose the layer with. A tensor, array, or sequential model.

equation: An equation describing the einsum to perform. This equation must be a valid einsum string of the form ab,bc->ac, ...ab,bc->...ac, or ab...,bc->ac..., where 'ab', 'bc', and 'ac' can be any valid einsum axis expression sequence.

output_shape: The expected shape of the output tensor (excluding the batch dimension and any dimensions represented by ellipses). You can specify NA or NULL for any dimension that is unknown or can be inferred from the input shape.

activation: Activation function to use. If you don't specify anything, no activation is applied (that is, a "linear" activation: a(x) = x).

bias_axes: A string containing the output dimension(s) to apply a bias to. Each character in the bias_axes string should correspond to a character in the output portion of the equation string.

kernel_initializer: Initializer for the kernel weights matrix.

bias_initializer: Initializer for the bias vector.

kernel_regularizer: Regularizer function applied to the kernel weights matrix.

bias_regularizer: Regularizer function applied to the bias vector.

kernel_constraint: Constraint function applied to the kernel weights matrix.

bias_constraint: Constraint function applied to the bias vector.

lora_rank: Optional integer. If set, the layer's forward pass will implement LoRA (Low-Rank Adaptation) with the provided rank. LoRA sets the layer's kernel to non-trainable and replaces it with a delta over the original kernel, obtained via multiplying two lower-rank trainable matrices (the factorization happens on the last dimension). This can be useful to reduce the computation cost of fine-tuning large dense layers. You can also enable LoRA on an existing EinsumDense layer by calling layer$enable_lora(rank). A construction-time sketch follows the argument descriptions below.

...: Base layer keyword arguments, such as name and dtype.
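As a minimal sketch of the lora_rank argument mentioned above (the rank value of 4 is illustrative, not a default):

input <- layer_input(shape = c(32))
output <- input |>
  layer_einsum_dense("ab,bc->ac",
                     output_shape = 64,
                     bias_axes = "c",
                     lora_rank = 4)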
Biased dense layer with einsums

This example shows how to instantiate a standard Keras dense layer using einsum operations. This example is equivalent to layer_dense(64, use_bias = TRUE).
input <- layer_input(shape = c(32))
output <- input |>
  layer_einsum_dense("ab,bc->ac",
                     output_shape = 64,
                     bias_axes = "c")
output # shape(NA, 64)
## <KerasTensor shape=(None, 64), dtype=float32, sparse=False, name=keras_tensor_1>
Applying a dense layer to a sequence

This example shows how to instantiate a layer that applies the same dense operation to every element in a sequence. Here, the output_shape has two values (since there are two non-batch dimensions in the output); the first dimension in the output_shape is NA, because the sequence dimension b has an unknown shape.
input <- layer_input(shape = c(32, 128))
output <- input |>
  layer_einsum_dense("abc,cd->abd",
                     output_shape = c(NA, 64),
                     bias_axes = "d")
output # shape(NA, NA, 64)
## <KerasTensor shape=(None, None, 64), dtype=float32, sparse=False, name=keras_tensor_3>
Applying a dense layer to a sequence using ellipses

This example shows how to instantiate a layer that applies the same dense operation to every element in a sequence, but uses the ellipsis notation instead of specifying the batch and sequence dimensions. Because we are using ellipsis notation and have specified only one axis, the output_shape argument is a single value. When instantiated in this way, the layer can handle any number of sequence dimensions, including the case where no sequence dimension exists.
input <- layer_input(shape = c(32, 128))
output <- input |>
  layer_einsum_dense("...x,xy->...y",
                     output_shape = 64,
                     bias_axes = "y")
output # shape(NA, 32, 64)
## <KerasTensor shape=(None, 32, 64), dtype=float32, sparse=False, name=keras_tensor_5>
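As a minimal sketch of the no-sequence-dimension case mentioned above (the input size of 128 is illustrative), the same equation can be applied directly to a plain feature vector:

input <- layer_input(shape = c(128))
output <- input |>
  layer_einsum_dense("...x,xy->...y",
                     output_shape = 64,
                     bias_axes = "y")
output # shape(NA, 64)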
Methods:

enable_lora(
  rank,
  a_initializer = 'he_uniform',
  b_initializer = 'zeros'
)

quantize(mode, type_check = TRUE)

Readonly properties:

kernel
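A minimal sketch of using these methods on an existing layer (the rank of 4 and the "int8" quantization mode are illustrative choices, not defaults):

layer <- layer_einsum_dense(equation = "ab,bc->ac", output_shape = 64)
x <- layer_input(shape = c(32))
y <- layer(x)           # the first call builds the layer's kernel
layer$enable_lora(4)    # freeze the kernel; train two rank-4 factors instead
layer$kernel            # readonly property: the layer's effective kernel weights
# Quantization can likewise be applied to a built layer, e.g.:
# layer$quantize("int8")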
Other core layers:
layer_dense()
layer_embedding()
layer_identity()
layer_lambda()
layer_masking()
Other layers:
Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_auto_contrast()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_embedding()
layer_equalization()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_num_bounding_boxes()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_mix_up()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_rand_augment()
layer_random_brightness()
layer_random_color_degeneration()
layer_random_color_jitter()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_grayscale()
layer_random_hue()
layer_random_posterization()
layer_random_rotation()
layer_random_saturation()
layer_random_sharpness()
layer_random_shear()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_solarization()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_stft_spectrogram()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()