
Support evaluation on validation and test sets and update the MNIST example. #770

Closed
wants to merge 70 commits into from

Conversation

ghost

@ghost ghost commented Jan 30, 2020

Before submitting

  • ✅ Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
  • ✅ Did you read the contributor guideline?
  • ❌ Did you make sure to update the docs?
  • ❌ Did you write any new necessary tests?

What does this PR do?

Fixes #763.

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

Did you have fun?

Kinda 🙃

@ghost ghost requested review from Borda, williamFalcon and jeffling January 30, 2020 08:33
Member

@Borda Borda left a comment

it would need some clarification...

@kuynzereb
Contributor

I think it is a bad idea to change the test() function by adding a validation parameter. test() should be independent of training and validation. And I don't really understand the purpose of this PR. If you want to run evaluation on the validation set, you can simply make test_dataloader(), test_step() and test_end() equal to val_dataloader(), validation_step() and validation_end(), respectively. Or is there some other intention?

This may also be a documentation problem. When I first started using PL, it was quite unobvious to me how to evaluate a trained model. I will check it later.

@ghost
Author

ghost commented Jan 31, 2020

@kuynzereb Let me fix those. AFAIK, trainer.test(model) can evaluate the model on the test set with test_step(), but there is no way to evaluate a model directly on the validation set without running training. If such a function already exists, then this pull request isn't needed.

@ghost
Author

ghost commented Jan 31, 2020

@Borda I've refactored the code. Can you take a look?

@kuynzereb
Contributor

@xingzhaolee If you want to evaluate the model on the validation set, you just need to define all the test functions to be equal to the val functions. That is, inside your LightningModule you can do something like:

def test_dataloader(self):
    return self.val_dataloader()

def test_step(self, *args, **kwargs):
    return self.validation_step(*args, **kwargs)

def test_end(self, *args, **kwargs):
    return self.validation_end(*args, **kwargs)

The point is that with test() you can evaluate on whatever dataset you want, in particular the validation set. You just need to define the test functions appropriately.

@ghost
Author

ghost commented Jan 31, 2020

@kuynzereb Wouldn't it be better to have an option to run evaluation on the validation set rather than forcing users to copy and paste their validation code into the test-related functions?

Also, the ImageNet example uses trainer.run_evaluation(), which is wrong. trainer.test(model) should be used instead when no training is carried out.
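
For illustration, the intended pattern looks roughly like this (a minimal sketch; Model and hparams stand in for the example's own classes and are not defined here):

import pytorch_lightning as pl

model = Model(hparams)
trainer = pl.Trainer()
trainer.fit(model)   # training; validation runs as part of fit()
trainer.test(model)  # standalone evaluation on the test set, no training needed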

@ghost
Author

ghost commented Jan 31, 2020

@Borda any comments? If it's better to follow the approach @kuynzereb suggested, then I'll close this pull request.

@kuynzereb
Contributor

Also, the ImageNet example uses trainer.run_evaluation(), which is wrong. trainer.test(model) should be used instead when no training is carried out.

Yeah, you are totally right!

@kuynzereb Wouldn't it be better to have an option to run evaluation on the validation set rather than forcing users to copy and paste their validation code into the test-related functions?

Well, it may indeed be a nice option. I actually kind of like your idea of introducing trainer.validate(). But the current implementation looks quite clumsy to me. What do you think about the following refactoring (a rough sketch follows the list)?

  1. Introduce something like trainer.mode, which can be equal to 'training', 'validating' or 'testing', and remove the old self.testing.
  2. When we start training, we assign trainer.mode = 'training'.
  3. Inside trainer.validate() we assign trainer.mode = 'validating'.
  4. Inside trainer.test() we assign trainer.mode = 'testing'.
  5. Change run_evaluation(self, test) to run_evaluation(self). Inside run_evaluation() we do different things depending on trainer.mode. (Moreover, we can rename run_evaluation to _run_evaluation so the user is aware that it is a private method.)
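
A minimal sketch of that flow (the method bodies here are placeholders, not the existing Trainer implementation):

class Trainer:
    def fit(self, model):
        self.mode = 'training'
        ...  # run the training loop; periodic val checks go through _run_evaluation()

    def validate(self, model):
        self.mode = 'validating'
        self._run_evaluation(model)

    def test(self, model):
        self.mode = 'testing'
        self._run_evaluation(model)

    def _run_evaluation(self, model):
        # dispatch on self.mode: use val_dataloader()/validation_step() when
        # validating, test_dataloader()/test_step() when testing
        ...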

@ghost
Author

ghost commented Jan 31, 2020

@kuynzereb That seems like a good way to encourage users to use the new .validate() or the existing .test() instead of run_evaluation(), which may cause confusion for a lot of users (I guess that's why the ImageNet example got it wrong). I'll update it over the weekend! 😃

@Borda
Member

Borda commented Jan 31, 2020

I would not add too much complexity. I like the idea of a validate method, but with a dataloader as a parameter so you can use it more generally... Testing and validation are the same thing in principle, you just draw from a different data basket... @williamFalcon ^^

@ghost
Author

ghost commented Jan 31, 2020

@Borda sorry, I misunderstood; I've edited the comments. Did you mean something like this?

import pytorch_lightning as pl

model = Model(hparams)
trainer = pl.Trainer()
# evaluate with the test loop, but feed it the validation dataloader
trainer.test(model, model.val_dataloader)

Hmm, both ways seem fine to me.

@Borda Borda added feature Is an improvement or enhancement information needed labels Jan 31, 2020
@Borda Borda added this to the 0.6.1 milestone Jan 31, 2020
@ghost
Author

ghost commented Feb 3, 2020

@Borda Can you take a look at this? I'm separating validation and testing in case they have different evaluation methods. I'll add the option to pass a new dataloader if that's alright.

Member

@Borda Borda left a comment

Are there more options than testing and validation?
Consider using an enum for the two cases, see https://docs.python.org/3/library/enum.html
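
For instance, the modes could be captured in an Enum along these lines (a minimal sketch; the member names are only illustrative):

from enum import Enum

class TrainerMode(Enum):
    TRAINING = 'training'
    VALIDATING = 'validating'
    TESTING = 'testing'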

@ghost
Author

ghost commented Feb 4, 2020

@Borda For now I think only validation and testing. If there are more, they can be added in the future. Updated to use an enum.

Member

@Borda Borda left a comment

Cool! I like this trainer mode very much, and with the enum it's much cleaner to read...

@Borda Borda added the ready PRs ready to be merged label Feb 14, 2020
@kuynzereb
Contributor

kuynzereb commented Feb 14, 2020

If I understand correctly, there is an error. We set TrainerMode.TRAINING only during trainer initialization and not inside fit(). This means that if we first call trainer.test() and then trainer.fit(), it will run in testing mode instead of training mode.

And we cannot set TrainerMode.TRAINING in fit() because it is an internal function used by test() and validate(). I can think of the following solution: rename fit() to _fit() and add a new fit() which first sets the mode to TrainerMode.TRAINING and then calls _fit(). Accordingly, test() and validate() will have to call _fit().
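
A minimal sketch of that proposal (the method bodies are placeholders, not the PR's actual implementation):

from pytorch_lightning.trainer.state import TrainerMode  # the enum added by this PR

class Trainer:
    def fit(self, model):
        self.mode = TrainerMode.TRAINING
        return self._fit(model)

    def validate(self, model):
        self.mode = TrainerMode.VALIDATING
        return self._fit(model)

    def test(self, model):
        self.mode = TrainerMode.TESTING
        return self._fit(model)

    def _fit(self, model):
        ...  # the existing fit logic, shared by fit(), validate() and test()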

@Borda
Member

Borda commented Feb 14, 2020

@kuynzereb could you do a review and point out the bug in the code... Thx

@Borda Borda removed the ready PRs ready to be merged label Feb 14, 2020
@ghost
Author

ghost commented Feb 15, 2020

If I understand correctly, there is an error. We set TrainerMode.TRAINING only during trainer initialization and not inside fit(). This means that if we first call trainer.test() and then trainer.fit(), it will run in testing mode instead of training mode.

My bad, I didn't consider that scenario. Fixed.

@pep8speaks

pep8speaks commented Feb 16, 2020

Hello @xingzhaolee! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-03-23 03:07:44 UTC

@awaelchli
Contributor

@xingzhaolee The tests fail because the profiler overhead increased beyond the tolerance set in the tests, perhaps because of the additional validation logic.
@jeremyjordan do we need to increase the tolerance in the tests? How was it chosen?

@awaelchli
Contributor

Maybe see what the timings are on master and compare them with this branch to determine whether the extra overhead is significant.

@Borda
Member

Borda commented Mar 20, 2020

@xingzhaolee @awaelchli I just reran the CI and everything is fine now...

@Borda Borda added the ready PRs ready to be merged label Mar 20, 2020
Contributor

@tullie tullie left a comment

I'm approving but please consider my comments! :)

@awaelchli
Contributor

Hi @xingzhaolee, when you rebase/merge master you will probably get docs build errors. Let me know if you need help resolving these :)

@@ -26,8 +26,8 @@
from logging import getLogger
_logger = getLogger("lightning")

from pytorch_lightning.trainer import Trainer  # Initialized first due to state
Contributor

Can you explain this? It feels like requiring imports in a certain order would let bugs slip into the codebase more easily.

Contributor

@awaelchli awaelchli Mar 21, 2020

I did a quick test and I think it is because pytorch_lightning.trainer.state.TrainerMode is part of a cyclic import.
For example, if I move TrainerMode to pytorch_lightning.overrides.data_parallel, then the import order highlighted here doesn't matter (tests don't break in either case).
I'm not saying it should go there, but I would try to move it out of the import loop so that the import order does not matter.

Author

It's due to where state.py is located. Any suggestions on whether I should move it out as @awaelchli suggested, or keep it as it is for now?

Contributor

Hmm, yeah, it'd be preferable if we didn't have to rely on import order for things to work properly. Do you know where the import cycle is occurring?

# pytorch_lightning/trainer/state.py
(TrainerMode class is defined)

# pytorch_lightning/__init__.py
from pytorch_lightning.trainer import Trainer 
from pytorch_lightning.core import LightningModule

# pytorch_lightning/overrides/data_parallel.py
from pytorch_lightning.trainer.state import TrainerMode

# pytorch_lightning/trainer/evaluation_loop.py
from pytorch_lightning.trainer.state import TrainerMode

# pytorch_lightning/trainer/trainer.py
from pytorch_lightning.trainer.state import TrainerMode

Just looking at the imports from this PR, I'm not seeing it.

Contributor

@awaelchli awaelchli Mar 22, 2020

I noticed this:

  • Flip the imports @jeremyjordan highlighted (LightningModule first, then Trainer).
  • Open an interactive Python shell and import TrainerMode:
    from pytorch_lightning.trainer.state import TrainerMode
  • It tries to import LightningDistributedDataParallel, TrainerDataLoadingMixin and LightningModule but fails.

It must be a problem with the imports in pytorch_lightning/__init__.py or pytorch_lightning/trainer/__init__.py.

Contributor

I think I found the cycle:
When we do from pytorch_lightning.trainer.state import TrainerMode, it runs the __init__ of pytorch_lightning, which imports Trainer. Trainer then tries to import from pytorch_lightning.trainer.state import TrainerMode again, and so on...

Author

I'll take a look at it tomorrow and fix it so that import order is not relied on. :)

Author

I think either TrainerMode has to be moved out of the trainer package, or TrainerMode needs to be imported first, like:

from pytorch_lightning.trainer.state import TrainerMode
from pytorch_lightning.core import LightningModule
from pytorch_lightning.trainer import Trainer
from pytorch_lightning.callbacks import Callback
from pytorch_lightning.core import data_loader

However, order still matters even in this case. The main issue lies with the import in overrides/data_parallel.py. Any suggestions? @jeremyjordan @awaelchli @Borda

Member

I would keep it outside the trainer package and call it states, since the trainer status will be added later.
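
As a hedged sketch of that suggestion, the enum could live in a module that imports nothing else from the package, so importing it from trainer.py, data_parallel.py or evaluation_loop.py cannot create a cycle (the file name pytorch_lightning/states.py follows the naming above and is only an assumption):

# pytorch_lightning/states.py
# No other pytorch_lightning imports here, so any module in the package can
# import TrainerMode without pulling Trainer or LightningModule back in.
from enum import Enum

class TrainerMode(Enum):
    TRAINING = 'training'
    VALIDATING = 'validating'
    TESTING = 'testing'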

- To ensure you don't accidentally use test data to guide training decisions Lightning
-   makes running the test set deliberate.
+ To ensure you don't accidentally use validation or test data to guide training decisions Lightning
+   makes running the validation or test set deliberate.
Contributor

This doesn't make sense... validation should run within training.
This change here is wrong. @Borda @jeremyjordan

@williamFalcon
Contributor

I don't really understand this PR and I think the functionality doesn't make sense.

Validation by definition is tied to training... it's a way of stopping training. It shouldn't be run separately.

It is NOT like .test(). Test is required to be run separately as a best practice.

This PR shouldn't be accepted as I'm not sure this is needed unless I'm missing something
@PyTorchLightning/core-contributors

@Borda Borda removed the ready PRs ready to be merged label Mar 23, 2020
@ghost
Author

ghost commented Mar 23, 2020

If validation should not be allowed to run without training, then this PR won't be needed.

@ghost ghost closed this Mar 23, 2020
@williamFalcon
Contributor

In what instance would you want to do that? Maybe it is for some particular research use case?

@ghost
Author

ghost commented Mar 23, 2020

It’s more of a general use case. Let’s say:

  1. During training, the model output goes through a sigmoid and a 0.5 threshold is used as the prediction for validation. After training I might want to test it out with a different threshold; that would be easier if validation could be run again without training.
  2. I lost my training log but still have my model, and I would like to rerun it on the validation set without needing to change the test functions for validation purposes.

But of course it’s possible to handle both of those outside the LightningModule if that should be the case.

@daniilhayrapetyan

I'll also add one use case.
When fine-tuning, you might want to compare the validation metrics against the same metrics before training started.
I imagine it looking something like this:

# set global_step = -1 so the logs are not rewritten by trainer.fit
trainer.global_step = -1
# log validation metrics on the original model
trainer.validate()
# restore global_step, not sure if it is needed
trainer.global_step = 0

trainer.fit(model)

I am new to PyTorch Lightning, so my ideas might be wrong.
But I still need some way to compare the validation metrics before and during training to evaluate progress, and this solution seems like an easy way to do that.

@williamFalcon
Contributor

This already happens without needing this PR.

Set the number of sanity validation batches to -1 and it will log the full validation set before training.
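
For example (a minimal sketch; num_sanity_val_steps is the Trainer argument controlling the sanity check, and whether -1 means all batches may depend on the Lightning version):

import pytorch_lightning as pl

# model is your LightningModule instance
trainer = pl.Trainer(num_sanity_val_steps=-1)  # run the full validation set before training starts
trainer.fit(model)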

@daniilhayrapetyan

OMG. That was the quickest response I have ever gotten on Github! Thanks.

@daniilhayrapetyan

I am not sure if this is intended behaviour or if I am doing something wrong.
I can't manage to get self.log in my LightningModule to record logs during the sanity validation phase.

@Borda Borda modified the milestones: v0.7., v0.7.x Apr 18, 2021
This pull request was closed.
Labels
feature Is an improvement or enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

run_evaluation() does not work.
9 participants