Sparsemax activation function [1].
activation_sparsemax(logits, axis = -1L)
`logits`: Input tensor.
`axis`: Integer, axis along which the sparsemax operation is applied. Defaults to `-1L`, the last axis.
Tensor, output of the sparsemax transformation. Has the same type and shape as `logits`.
Raises: ValueError: In case `dim(logits) == 1`.
For each batch `i` and class `j` we have
$$\mathrm{sparsemax}[i, j] = \max(\mathrm{logits}[i, j] - \tau(\mathrm{logits}[i, :]), 0)$$
[1]: https://arxiv.org/abs/1602.02068
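This page does not say how the threshold `tau` is obtained. The following is a minimal base R sketch of the sparsemax formula, assuming the sort-based threshold described in [1]: sort each row in decreasing order, find the largest `k` such that `1 + k * z_(k)` exceeds the sum of the `k` largest entries, and set `tau` to that sum minus one, divided by `k`. The helper names `sparsemax_row` and `sparsemax_matrix` are hypothetical and are not part of the package. The sketch only illustrates the arithmetic on a plain numeric matrix and is not how `activation_sparsemax()` is implemented.

```r
# Sparsemax of a single numeric vector (one row of logits).
# Hypothetical helper, for illustration only.
sparsemax_row <- function(z) {
  z_sorted <- sort(z, decreasing = TRUE)
  k <- seq_along(z_sorted)
  z_cumsum <- cumsum(z_sorted)
  # Support size: largest k with 1 + k * z_(k) > sum of the k largest entries.
  # k = 1 always satisfies the condition, so the subset is never empty.
  k_z <- max(k[1 + k * z_sorted > z_cumsum])
  # Threshold tau chosen so that the positive part of (z - tau) sums to 1.
  tau <- (z_cumsum[k_z] - 1) / k_z
  pmax(z - tau, 0)
}

# Apply the row-wise transformation to a (batch x classes) matrix.
# Assumes at least two classes per row.
sparsemax_matrix <- function(logits) {
  t(apply(logits, 1, sparsemax_row))
}

logits <- matrix(c(1.0, 2.0, 3.0,
                   0.1, 0.1, 0.1), nrow = 2, byrow = TRUE)
probs <- sparsemax_matrix(logits)
print(probs)
```

Each row of `probs` is non-negative and sums to 1. In the first row the largest logit dominates, so the other two entries are exactly zero, which is the sparsity that distinguishes sparsemax from softmax. In the second row all logits are equal, so the mass is split evenly.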