# Transformers Example with Ignite

In this example, we show how to use _Ignite_ to fine-tune a transformer model:

- train on one or more GPUs or TPUs
- compute training/validation metrics
- log the learning rate, metrics, etc.
- save the best model weights

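At its core, fine-tuning with Ignite means wrapping a training step in an `ignite.engine.Engine`. The sketch below illustrates the idea; the model name, toy data, and hyperparameters are illustrative placeholders, not the exact code of `main.py` (which also adds metrics, learning-rate logging, and checkpointing):

```python
# Minimal sketch of the Ignite fine-tuning loop (illustrative, not main.py).
import torch
from torch.utils.data import DataLoader
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from ignite.engine import Engine

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased").to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

def train_step(engine, batch):
    model.train()
    batch = {k: v.to(device) for k, v in batch.items()}
    # transformers models return the loss when `labels` are passed
    loss = model(**batch).loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

trainer = Engine(train_step)

# Toy two-sentence dataset, just to make the sketch runnable end to end.
enc = tokenizer(["great movie", "terrible movie"], padding=True, return_tensors="pt")
enc["labels"] = torch.tensor([1, 0])
loader = DataLoader([{k: v[i] for k, v in enc.items()} for i in range(2)], batch_size=2)

trainer.run(loader, max_epochs=1)
```
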
Configurations:

- [x] single GPU
- [x] multiple GPUs on a single node
- [x] TPUs on Colab

## Requirements:

- pytorch-ignite: `pip install pytorch-ignite`
- [transformers](https://github.com/huggingface/transformers): `pip install transformers`
- [datasets](https://github.com/huggingface/datasets): `pip install datasets`
- [tqdm](https://github.com/tqdm/tqdm/): `pip install tqdm`
- [tensorboardX](https://github.com/lanpa/tensorboard-pytorch): `pip install tensorboardX`
- [python-fire](https://github.com/google/python-fire): `pip install fire`
- Optional: [clearml](https://github.com/allegroai/clearml): `pip install clearml`

Alternatively, install all the requirements using `pip install -r requirements.txt`.

## Usage:

Run the example on a single GPU:

```bash
python main.py run
```

If needed, adjust the batch size to fit your GPU with the `--batch_size` argument, e.g. `python main.py run --batch_size=16`.

For details on accepted arguments:

```bash
python main.py run -- --help
```

### Distributed training

#### Single node, multiple GPUs

Let's start training on a single node with 2 GPUs:

```bash
# using torch.distributed.launch
python -u -m torch.distributed.launch --nproc_per_node=2 --use_env main.py run --backend="nccl"
```

or

```bash
# using function spawn inside the code
python -u main.py run --backend="nccl" --nproc_per_node=2
```
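
Under the hood, the spawn-based variant relies on `ignite.distributed.Parallel`, which spawns `nproc_per_node` processes and runs a training function in each of them. Below is a minimal sketch of that pattern; the `training` function and its `config` argument are illustrative placeholders, not the exact code of `main.py`:

```python
# Sketch: spawning worker processes from inside the script with ignite.distributed.
import ignite.distributed as idist


def training(local_rank, config):
    # each spawned process executes this function
    rank = idist.get_rank()    # global rank of the current process
    device = idist.device()    # device assigned to the current process
    print(f"process {rank} is running on {device}")
    # ... build model/optimizer/dataloaders and run the trainer here ...


if __name__ == "__main__":
    # equivalent of passing --backend="nccl" --nproc_per_node=2 on the command line
    with idist.Parallel(backend="nccl", nproc_per_node=2) as parallel:
        parallel.run(training, config={})
```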

##### Using [Horovod](https://horovod.readthedocs.io/en/latest/index.html) as distributed backend

Please make sure to have Horovod installed before running.

Let's start training on a single node with 2 GPUs:

```bash
# horovodrun
horovodrun -np 2 python -u main.py run --backend="horovod"
```

or

```bash
# using function spawn inside the code
python -u main.py run --backend="horovod" --nproc_per_node=2
```

#### Colab or Kaggle kernels, on 8 TPUs

```python
# setup TPU environment
import os
assert os.environ['COLAB_TPU_ADDR'], 'Make sure to select TPU from Edit > Notebook settings > Hardware accelerator'
```

```python
# install PyTorch/XLA in the notebook (the `!` lines are IPython shell escapes)
VERSION = "nightly"
!curl -q https://raw.githubusercontent.com/pytorch/xla/master/contrib/scripts/env-setup.py -o pytorch-xla-env-setup.py
!python pytorch-xla-env-setup.py --version $VERSION > /dev/null
```

```python
from main import run
run(backend="xla-tpu", nproc_per_node=8)
```

## ClearML fileserver

If a `ClearML` server is used (i.e. the `--with_clearml` argument is passed), artifact upload must be configured by
modifying the `ClearML` configuration file `~/clearml.conf` generated by `clearml-init`. According to the
[documentation](https://allegro.ai/clearml/docs/docs/examples/reporting/artifacts.html), the `output_uri` argument can be
set via `sdk.development.default_output_uri` to point to the fileserver URI. For a self-hosted server, the `ClearML`
fileserver URI is `http://localhost:8081`.
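
For example, a minimal fragment of `~/clearml.conf` pointing artifact uploads at a self-hosted fileserver could look like the following (the host and port must match your own deployment):

```
sdk {
    development {
        # upload artifacts/models to the self-hosted ClearML fileserver
        default_output_uri: "http://localhost:8081"
    }
}
```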

For more details, see https://allegro.ai/clearml/docs/docs/examples/reporting/artifacts.html