[Fix] text-classification PL example #6027
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master    #6027      +/-   ##
==========================================
+ Coverage   77.46%   78.50%   +1.04%
==========================================
  Files         146      146
  Lines       26243    26243
==========================================
+ Hits        20330    20603     +273
+ Misses       5913     5640     -273
Continue to review full report at Codecov.
Why wasn't any of this breaking tests? Is there some test coverage we could add to find these sorts of errors earlier?
@@ -23,7 +23,7 @@ mkdir -p $OUTPUT_DIR
 # Add parent directory to python path to access lightning_base.py
 export PYTHONPATH="../":"${PYTHONPATH}"

-python3 run_pl_glue.py --data_dir $DATA_DIR \
+python3 run_pl_glue.py --gpus 2 --data_dir $DATA_DIR \
Should 2 GPUs be the default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should not.
So should I add environment variables in the shell script just like $DATA_DIR etc.?
DATA_DIR is solid. My only objection is gpus=2.
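For reference, a minimal sketch of how the number of GPUs could be exposed as a configurable argument in lightning_base.py rather than hard-coded in the shell script. The add_generic_args name and the default of 0 are assumptions for illustration, not necessarily the merged code:

```python
import argparse


def add_generic_args(parser: argparse.ArgumentParser, root_dir: str) -> None:
    # Hypothetical excerpt: expose --gpus as a generic example argument so the
    # shell script can pass it explicitly instead of baking in a value of 2.
    parser.add_argument(
        "--gpus",
        default=0,
        type=int,
        help="Number of GPUs to train on (0 means CPU only)",
    )
```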
@@ -72,7 +76,7 @@ def prepare_data(self):
             logger.info("Saving features into cached file %s", cached_features_file)
             torch.save(features, cached_features_file)

-    def load_dataset(self, mode, batch_size):
+    def get_dataloader(self, mode, batch_size):
get_dataloader has one more arg: shuffle.
Yes, I think wrapping this load_dataset() into a get_dataloader() is the best way to handle this.
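A minimal sketch of what such a wrapper could look like, assuming a load_dataset() helper that returns a TensorDataset. The class name and method bodies here are illustrative placeholders, not the merged code:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


class GLUEDataExample:
    """Illustrative placeholder for the example's data handling."""

    def load_dataset(self, mode: str) -> TensorDataset:
        # Stand-in for the real logic that loads cached features for `mode`.
        input_ids = torch.randint(0, 100, (8, 16))
        labels = torch.randint(0, 2, (8,))
        return TensorDataset(input_ids, labels)

    def get_dataloader(self, mode: str, batch_size: int, shuffle: bool = False) -> DataLoader:
        # Wrap load_dataset() and expose the extra `shuffle` argument that the
        # training loop expects when building the train dataloader.
        return DataLoader(self.load_dataset(mode), batch_size=batch_size, shuffle=shuffle)
```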
Besides, looking at the output of the examples test, it's impossible to tell which examples are being run and which aren't; it only indicates the name on failure. Perhaps at the very least add a pytest option so that it announces which tests were run? Submitted a PR to do just that: #6035
Here is a PR that adds the missing PL glue test: #6034 (which obviously fails on CI - a good thing).
@stas00 you can at least see all the files that are run with:
you want one dir up as well, so:
but it tells only part of the story, since most info is hidden in
Excited for this!
Co-authored-by: Sam Shleifer <[email protected]>
How does this break the TF tests? Looks like the model save still has issues with the state_dict it's saving.
TF failures are spurious.
Merging this, thanks @bhashithe, @stas00, @laibamehnaz and everyone else who helped!
The merged #6027 broke
yeah, PL already has
Let's continue the discussion here: #6310
The text-classification example needed several edits to get it working. The main one was that the hparams are loaded from the checkpoint as a dict instead of a Namespace object, so this needed to be fixed by recasting the hparams to a Namespace object. Though this is not the ideal solution, it works for now.
I also have some other fixes, such as the gpus argument, which needed to be added to the generic arguments list in lightning_base.py, and removing the default value for n_tpu_cores. Also, the lr_scheduler was not accessed correctly by the logging callback. These have all been fixed and the example now works correctly.
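For illustration, a minimal sketch of the hparams recast described above. The helper name and the place it would be called from are assumptions, not the merged code:

```python
import argparse


def ensure_namespace(hparams):
    # When restoring from a checkpoint, the hyperparameters may come back as a
    # plain dict; the example code expects an argparse.Namespace, so recast it.
    if isinstance(hparams, dict):
        return argparse.Namespace(**hparams)
    return hparams


# Hypothetical usage inside the LightningModule's __init__:
# self.hparams = ensure_namespace(hparams)
```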