Port "bert multi lingual tpu training (8 cores)" to Ignite #960

Closed
vfdev-5 opened this issue Apr 22, 2020 · 6 comments · Fixed by #1656

Comments

vfdev-5 (Collaborator) commented Apr 22, 2020

🚀 Feature

The recently added example of TPU usage with Ignite covers training on a single TPU.
The idea is to port this Kaggle kernel: https://www.kaggle.com/abhishek/bert-multi-lingual-tpu-training-8-cores to Ignite and include it in Ignite's showcase.

Based on #952 (comment)
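
For context, a minimal launching sketch of how an 8-core TPU run is typically started with `ignite.distributed` (this is not the kernel itself; the model, data, and Engine-based training loop are left as placeholders and the config values are illustrative):

```python
import ignite.distributed as idist


def training(local_rank, config):
    device = idist.device()  # each of the 8 spawned processes gets its own XLA device
    ...                      # build the dataloader/model/optimizer and run the trainer here


if __name__ == "__main__":
    config = {"num_epochs": 2, "batch_size": 64}  # illustrative values only
    with idist.Parallel(backend="xla-tpu", nproc_per_node=8) as parallel:
        parallel.run(training, config)
```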

@vfdev-5 vfdev-5 self-assigned this Apr 22, 2020
@vfdev-5 vfdev-5 removed their assignment Feb 10, 2021
ahmedo42 (Contributor) commented:

I'd like to work on it. Should we use the same dataset and the same BERT model? I think the purpose of this is to make users comfortable using Ignite with multi-core TPU.

Maybe something like this with added features of Ignite: "PyTorch on Cloud TPUs: MultiCore Training AlexNet on Fashion MNIST"?

vfdev-5 (Collaborator, Author) commented Feb 14, 2021

@ahmedo42 thanks for asking. I agree that the purpose is more about Ignite and multiple TPUs, which is more or less covered here: https://github.com/pytorch/ignite/tree/master/examples/contrib/cifar10#colab-on-8-tpus

An NLP example on multiple TPUs, possibly trained on Kaggle TPUv3 (vs TPUv2 on Colab), could still be nice to have in addition to the cifar10 example.

What do you think?

ahmedo42 (Contributor) commented Feb 14, 2021

Didn't really know the difference between TPUs on Kaggle and on Colab 😃.

An NLP example on multiple TPUs, possibly trained on Kaggle TPUv3

Totally agree, an NLP example is needed. So should it be a ported notebook in examples/notebooks using the same data, with some explanations perhaps?

vfdev-5 (Collaborator, Author) commented Feb 14, 2021

Well, I'm hesitating between the two:

  • a notebook is good and can easily be read in the Kaggle notebook interface, with explanations
  • a script is good as well if the user would like to extend it (which is probably the more important need for us)

Do you have any NLP background to suggest what would be more interesting to have here?
Maybe it would be nice to port an example from Hugging Face transformers and run everything on TPUs.

ahmedo42 (Contributor) commented:

  • a notebook is good and can easily be read in the Kaggle notebook interface, with explanations

Well, notebooks can be extended too; people fork notebooks and extend them all the time on Kaggle.

  • a script is good as well if the user would like to extend it (which is probably the more important need for us)

This seems like the best practice. Almost all of the Hugging Face examples are scripts, which allow for a higher degree of control from the user's perspective, so we should probably do that.

Do you have any NLP background to suggest what would be more interesting to have here?

Well, I think we really need a Transformer example since it's a huge trend and the de facto standard in NLP right now, so porting an example from Hugging Face would be a good idea.
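
For example, a rough sketch of what such a ported script could look like, assuming a Hugging Face sequence-classification BERT and Ignite's `idist` helpers (the checkpoint name, optimizer settings, and batch format are illustrative assumptions, not the final example):

```python
import torch
import ignite.distributed as idist
from ignite.engine import Engine
from transformers import AutoModelForSequenceClassification


def training(local_rank, config):
    device = idist.device()  # XLA device of the current TPU core

    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-multilingual-uncased", num_labels=2
    )
    model = idist.auto_model(model)  # moves the model to the XLA device
    optimizer = idist.auto_optim(torch.optim.AdamW(model.parameters(), lr=3e-5))

    def train_step(engine, batch):
        model.train()
        batch = {k: v.to(device) for k, v in batch.items()}
        loss = model(**batch).loss  # HF models return the loss when "labels" are in the batch
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

    trainer = Engine(train_step)
    # trainer.run(train_dataloader, max_epochs=config["num_epochs"])  # dataloader building omitted here


if __name__ == "__main__":
    with idist.Parallel(backend="xla-tpu", nproc_per_node=8) as parallel:
        parallel.run(training, {"num_epochs": 1})
```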

vfdev-5 (Collaborator, Author) commented Feb 15, 2021

Sounds good!
