
Adding Minimal Reproducible Usage Example For TPU support on examples/seq2seq #5960

Closed

Conversation

@AdityaSoni19031997 (Contributor) commented Jul 22, 2020

Attempt to resolve #5895.

Minimal Working Colab Example.

To use more than a single core, you need to make sure enough RAM is available, or else wait for PyTorch-XLA to release a stable version. They also released a fix a while back that prevents excessive memory usage on the nightly builds.
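
For readers without access to the Colab, here is a rough sketch (an assumption, not code from this PR) of how PyTorch-XLA fans a training function out over multiple TPU cores; `train_fn` and the core count are placeholders standing in for the seq2seq training loop.

```python
# Hedged sketch of multi-core TPU launching with PyTorch-XLA (not from this PR).
# `train_fn` is a hypothetical placeholder for the actual training loop.
import torch_xla.core.xla_model as xm
import torch_xla.distributed.xla_multiprocessing as xmp


def train_fn(index):
    device = xm.xla_device()  # each spawned process gets its own TPU core
    xm.master_print(f"process {index} running on {device}")
    # ... build the model and dataloaders on `device`, then run training ...


if __name__ == "__main__":
    # One process per core; every process keeps its own copy of the input
    # pipeline in host RAM, which is why the memory caveat above matters.
    xmp.spawn(train_fn, args=(), nprocs=8, start_method="fork")
```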

@AdityaSoni19031997 (Contributor, Author)

The test will obviously break, right?

@marton-avrios

Any progress on merging this?

@AdityaSoni19031997 (Contributor, Author)

They won't accept it since the checks have failed, right?
But the checks are expected to fail: I have modified modeling_bart, so the model now reports one extra parameter.

@marton-avrios

Is that so @sshleifer?

@@ -943,6 +943,7 @@ def __init__(self, config: BartConfig):
         super().__init__(config)
         base_model = BartModel(config)
         self.model = base_model
+        self.lm_head = _make_linear_from_emb(self.model.shared)
Review comment from a Contributor on the diff above:

I think you need to call this again after line 952 to make the tests pass.
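
For context, a rough sketch of what a helper like `_make_linear_from_emb` plausibly does (an assumption, not copied from modeling_bart): it materializes a bias-free `nn.Linear` whose weight shares storage with the shared embedding matrix, so logits flow through a real registered module. Because the linear layer registers its own `Parameter`, the reported parameter count grows by one entry, which matches the author's remark above; and if the shared embedding is later replaced (for example after resizing the vocabulary), the helper would need to be called again so `lm_head` points at the new weight, which appears to be what this review comment is asking for.

```python
import torch.nn as nn


def _make_linear_from_emb(emb: nn.Embedding) -> nn.Linear:
    # Sketch only: build an output projection tied to the embedding matrix.
    vocab_size, emb_size = emb.weight.shape
    lin_layer = nn.Linear(emb_size, vocab_size, bias=False)
    # Reuse the embedding's storage; the weights stay tied, but a separate
    # Parameter is registered, so the module's parameter list gains one entry.
    lin_layer.weight.data = emb.weight.data
    return lin_layer
```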

@sshleifer (Contributor)

Thanks for the contribution, this looks awesome!

We can't merge with failing tests, but I think the tests can pass.

Could you also check:

RUN_SLOW=1 pytest tests/test_modeling_bart.py
RUN_SLOW=1 pytest tests/test_modeling_marian.py
RUN_SLOW=1 pytest tests/test_modeling_mbart.py

Add the USE_CUDA=1 prefix to make them run faster on GPU.

@sshleifer (Contributor)

Actually, can we add a support_tpu flag to BartConfig, initialize it to False, and only allocate lm_head when it's set to True? I'm concerned that we are wasting RAM when we train on GPU. (I would happily change my mind if I encountered evidence that this change doesn't affect GPU RAM consumption.) A sketch of that shape of change follows below.
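
A hedged sketch of what that suggestion could look like (an assumption about the shape of the change, not merged code; `support_tpu` is the proposed, hypothetical config field, and the class and base-class names follow the 2020 modeling_bart layout, so this is an in-file illustration rather than a standalone script):

```python
# Illustrative sketch of the suggested change inside modeling_bart (names as in
# the 2020 file layout); not merged code and not a standalone script.
class BartForConditionalGeneration(PretrainedBartModel):
    def __init__(self, config: BartConfig):
        super().__init__(config)
        self.model = BartModel(config)
        if getattr(config, "support_tpu", False):
            # TPU path: allocate a concrete lm_head module, as in this PR's diff.
            self.lm_head = _make_linear_from_emb(self.model.shared)
        else:
            # GPU/CPU path: keep the existing behaviour and memory footprint,
            # computing logits from the shared embedding weights as before.
            self.lm_head = None
```

Defaulting the flag to False would keep the current GPU memory profile, which is the RAM concern raised above.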

@marton-avrios commented Jul 27, 2020

I tried this version and it seems to work, but it gets stuck at "Validation sanity check". Working Colab here.

@AdityaSoni19031997 (Contributor, Author) commented Jul 27, 2020 via email

@stale (bot) commented Sep 26, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

The stale bot added the wontfix label on Sep 26, 2020.
@sshleifer (Contributor) commented Sep 26, 2020

This is now supported by Seq2SeqTrainer. Use that if you want TPU support!
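
For readers arriving here later, a minimal, hedged sketch of the Seq2SeqTrainer route (the class names exist in current transformers, but the checkpoint, toy dataset, and hyperparameters below are placeholders, not values from this thread):

```python
import torch
from torch.utils.data import Dataset
from transformers import (
    BartForConditionalGeneration,
    BartTokenizer,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)


class ToySeq2SeqDataset(Dataset):
    """Tiny stand-in dataset; a real run would tokenize an actual seq2seq corpus."""

    def __init__(self, tokenizer, n=8, max_length=16):
        src = tokenizer(["hello world"] * n, padding="max_length",
                        max_length=max_length, truncation=True, return_tensors="pt")
        tgt = tokenizer(["hallo welt"] * n, padding="max_length",
                        max_length=max_length, truncation=True, return_tensors="pt")
        self.examples = [
            {"input_ids": src["input_ids"][i],
             "attention_mask": src["attention_mask"][i],
             "labels": tgt["input_ids"][i]}
            for i in range(n)
        ]

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, idx):
        return self.examples[idx]


model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")
tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")

args = Seq2SeqTrainingArguments(
    output_dir="bart_seq2seq_demo",
    per_device_train_batch_size=2,
    max_steps=2,                    # keep the demo short
    predict_with_generate=True,     # generation-based evaluation for seq2seq
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=ToySeq2SeqDataset(tokenizer),
    tokenizer=tokenizer,
)
trainer.train()
```

On TPU, a script like this is typically launched through the xla_spawn.py helper in the examples directory (e.g. python xla_spawn.py --num_cores 8 your_script.py ...), though the launcher's exact location has moved between releases.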

@sshleifer closed this on Sep 26, 2020.
Successfully merging this pull request may close this issue: examples/seq2seq/finetune.py and BART supports TPU.