Implementation of the Sparsemax activation function in PyTorch, from the paper:
From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification by André F. T. Martins and Ramón Fernandez Astudillo (https://arxiv.org/abs/1602.02068)
Tested with PyTorch 0.4.0.
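For context, sparsemax is the Euclidean projection of the logits onto the probability simplex, so it can assign exactly zero probability to low-scoring classes. Below is a minimal sketch of the forward pass as described in the paper (sort the logits, find the support size k(z), compute the threshold tau(z), clip everything below it); it assumes the operation runs over the last dimension, and `sparsemax_forward` is an illustrative name, not this repository's API:

import torch

def sparsemax_forward(z):
    # Sort logits in descending order along the last dimension.
    z_sorted, _ = torch.sort(z, dim=-1, descending=True)
    cumsum = z_sorted.cumsum(dim=-1)
    k = torch.arange(1, z.size(-1) + 1, dtype=z.dtype, device=z.device)
    # Support condition from the paper: 1 + k * z_(k) > sum_{j<=k} z_(j).
    support = 1 + k * z_sorted > cumsum
    # k(z): number of sorted logits that satisfy the support condition.
    k_z = support.sum(dim=-1, keepdim=True)
    # Threshold tau(z) = (sum of the k(z) largest logits - 1) / k(z).
    tau = (cumsum.gather(-1, k_z - 1) - 1) / k_z.to(z.dtype)
    # Project: entries below the threshold become exactly zero.
    return torch.clamp(z - tau, min=0)

For example, sparsemax_forward(torch.tensor([[2.0, 1.0, 0.1]])) returns tensor([[1., 0., 0.]]), whereas softmax would keep all three entries strictly positive.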
Example usage
import torch
from sparsemax import Sparsemax

sparsemax = Sparsemax(dim=1)
softmax = torch.nn.Softmax(dim=1)

# Compare both activations on the same batch of random logits.
logits = torch.randn(2, 5)
print("\nLogits")
print(logits)

softmax_probs = softmax(logits)
print("\nSoftmax probabilities")
print(softmax_probs)

sparsemax_probs = sparsemax(logits)
print("\nSparsemax probabilities")
print(sparsemax_probs)
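A quick way to see the difference, continuing the snippet above: both outputs sum to 1 along the chosen dimension, but sparsemax typically zeroes out some entries entirely.

# Each row of the sparsemax output still sums to 1 ...
print(sparsemax_probs.sum(dim=1))
# ... but some entries are exactly zero, which softmax never produces
# for finite logits.
print((sparsemax_probs == 0).sum(dim=1))
print((softmax_probs == 0).sum(dim=1))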
Please open an issue if you have questions or suggestions.
Note that this is a PyTorch port of an existing Torch implementation: https://github.com/gokceneraslan/SparseMax.torch/
DOI for this particular repository:
This implementation was used in the paper Transcoding Compositionally: Using Attention to Find More Generalizable Solutions.