layer_activation_gelu: Gaussian Error Linear Unit
Description
Gaussian Error Linear Unit
Usage
layer_activation_gelu(object, approximate = TRUE, ...)
Arguments
approximate
(bool) Whether to use the tanh-based approximation of GELU instead of the exact form. Defaults to TRUE.
...
additional parameters to pass to the base layer
Details
A smoother version of ReLU, generally used in BERT and other models based on the BERT architecture.
Original paper: https://arxiv.org/abs/1606.08415
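The effect of the approximate argument can be illustrated numerically. The sketch below (in Python rather than R, purely for illustration; it is not the layer's implementation) compares the exact GELU, x * Phi(x) where Phi is the standard normal CDF, with the tanh approximation from the paper:

```python
import math

def gelu_exact(x):
    # Exact GELU: x * Phi(x), with Phi the standard normal CDF,
    # written via the error function: Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x):
    # Tanh approximation from the paper (used when approximate = TRUE):
    # 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    return 0.5 * x * (1.0 + math.tanh(
        math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"x={x:+.1f}  exact={gelu_exact(x):+.6f}  approx={gelu_tanh(x):+.6f}")
```

The two forms agree to a few decimal places over typical activation ranges; the approximation trades a small amount of accuracy for cheaper computation.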