Improve Comet Logger pickled behavior #2553
Conversation
* Delay the creation of the actual experiment object for as long as we can.
* Save the experiment ID when an Experiment object is created, so we can continue the same experiment in the sub-processes.
* Run pre-commit on the comet file.
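The lazy-creation idea described above can be sketched roughly like this. The class and attribute names here are illustrative stand-ins, not the actual code from the PR:

```python
from typing import Optional


class FakeExperiment:
    """Stand-in for comet_ml.Experiment, used only for illustration."""

    def __init__(self, previous_experiment: Optional[str] = None):
        # Resume the previous experiment if we have its ID, else start fresh.
        self.id = previous_experiment or "new-id"


class LazyCometLogger:
    def __init__(self):
        self._experiment = None      # not created until first needed
        self._experiment_key = None  # saved so sub-processes can resume

    @property
    def experiment(self):
        # Create the underlying Experiment only on first access,
        # resuming the previous one if we already know its key.
        if self._experiment is None:
            self._experiment = FakeExperiment(previous_experiment=self._experiment_key)
            self._experiment_key = self._experiment.id
        return self._experiment
```

Because the key survives pickling while the live experiment object does not, a sub-process can recreate an `ExistingExperiment`-style object pointing at the same run.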
I also added some tests for loggers including comet over here #2502
Made most Comet Logger attributes protected, as they might not reflect the final Experiment attributes. Also fixed the typo in the test name.
I have pushed a fix for the review comments, thank you! I've taken a look at #2502; this should fix the test. I've considered removing the
pytorch_lightning/loggers/comet.py
Outdated
def name(self) -> Optional[str]:
    # don't create an experiment if we don't have one
    return self._experiment.project_name if self._experiment else self._project_name
The tests fail because of this. If the experiment does not exist and the project name is also None, this returns None, and then the trainer tries to do os.path.join(..., None), which does not work. I would keep it as before. The same applies to the version.
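For reference, passing None into os.path.join is exactly the failure mode described here; path components must be strings (or path-like):

```python
import os

try:
    # A None logger name makes path construction blow up,
    # because os.path.join only accepts str or path-like components.
    os.path.join("lightning_logs", None, "version_0")
except TypeError as err:
    print(f"TypeError: {err}")
```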
The previous code was problematic because it would create an experiment object if none existed. In addition, if project_name is None and no project_name is configured by the user, self._experiment.project_name can also return None.
The Wandb logger also seems to return None in that case (https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pytorch_lightning/loggers/wandb.py#L140). Is there a difference between the Wandb logger and the Comet logger that would make returning None for the Logger name problematic?
I guess the wandb logger in the tests is called to create the experiment before accessing the name. You are right, there is no difference in the definition there. None is problematic when the Trainer tries to build the path to the logger directory: it does os.path.join(root, name, version, ...), and if any of these are None it throws.
Why do you think it is problematic that a call to e.g. name or version creates an experiment if it does not exist? The fact that the constructor does not immediately create the experiment means creation needs to be triggered as soon as the user tries to interact with the logger.
I can think of at least two scenarios where implicit experiment creation would be problematic:
- In DDP mode, if a process other than rank zero accesses name or version, it will create an Experiment that never receives metrics or parameters. This slows down the process and adds noise, as the (almost empty) Experiment summary is displayed at the end of training.
- In DDP mode, if an experiment is created before the actual start of training, we have to recreate another experiment in the DDP processes anyway, which also slows down training and generates extra output. This particular scenario would still happen if the user accesses logger.experiment before calling Trainer.fit, but I think we should avoid it where we can.

I've moved the handling of experiment_name to delay the creation of the actual Experiment until we really need it.
As I said before, for the logger name, even if the experiment exists there is still a chance that project_name is None. If it is absolutely required for the Logger to return a non-null string, I will have to think about a solution. Maybe returning a default string.
on rank > 0 the experiments don't get called, see the decorator "rank_zero_experiment".
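A simplified sketch of how such a rank-zero guard can work; the real rank_zero_experiment decorator in pytorch-lightning differs in detail, and reading the rank from LOCAL_RANK is an assumption here:

```python
import functools
import os


def rank_zero_experiment(fn):
    """Return the real experiment only on rank 0; elsewhere return a no-op dummy."""

    class _DummyExperiment:
        # Swallow any method call on non-zero ranks so logging calls are no-ops.
        def __getattr__(self, name):
            return lambda *args, **kwargs: None

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        rank = int(os.environ.get("LOCAL_RANK", 0))
        if rank == 0:
            return fn(*args, **kwargs)
        return _DummyExperiment()

    return wrapper
```

With a guard like this, non-zero ranks never trigger the expensive experiment creation that the scenarios above describe.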
> As I said before, for the logger name, even if the experiment exists there is still a chance that project_name is None. If it is absolutely required for the Logger to return a non-null string, I will have to think about a solution. Maybe returning a default string.
Do you think it would be too strict to make the project/experiment name mandatory?
I think returning a default name would be reasonable too. Some other loggers do that.
I've updated the code to return a default name in case no project_name is set. I've also added a couple of tests. I'm not sure whether the current build failures are linked to my changes; I have the impression they are not, could you confirm?
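The fallback could look like this. The default string and class name shown here are assumptions for illustration, not necessarily what the PR uses:

```python
from typing import Optional


class CometLoggerSketch:
    def __init__(self, project_name: Optional[str] = None):
        self._project_name = project_name
        self._experiment = None

    @property
    def name(self) -> str:
        # Fall back to a default so the Trainer can always build
        # os.path.join(save_dir, name, version) without a None component.
        if self._experiment is not None and self._experiment.project_name is not None:
            return self._experiment.project_name
        if self._project_name is not None:
            return self._project_name
        return "comet-default"
```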
I don't think so, try to merge master, then they will be triggered again.
I've updated my PR with your review comments and merged master. I don't think the current failures are linked to my changes.
Codecov Report
@@           Coverage Diff           @@
##           master   #2553    +/-  ##
======================================
- Coverage      91%     86%      -5%
======================================
  Files         109     109
  Lines        8031    8081     +50
======================================
- Hits         7291    6939    -352
- Misses        740    1142    +402
@Lothiraldan how is it going, still wip or ready to review? 🐰
@Borda I've merged master and fixed the conflict. There was still one open discussion (#2553 (comment)) but I guess it is ready for review.
tests are solid! thanks for that!
minor comments, but overall LGTM!!
tests/loggers/test_comet.py
Outdated
with patch('pytorch_lightning.loggers.comet.CometExperiment') as comet:
    logger = CometLogger(api_key=api_key, experiment_name=experiment_name,)

    # The experiment object should not exist
could you remove this type of comment, also in the other places below? the assertion below makes it very clear
Awesome, LGTM 😃
@Lothiraldan can you pls allow editing your PR? it seems I cannot accept suggestions...
Co-authored-by: Jirka Borovec <[email protected]>
@Borda I've applied the suggestions and I will remove the extraneous comments. Sorry for the slow answer, I was on vacation
This pull request is now in conflict... :(
Hello @Lothiraldan! Thanks for updating this PR. There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-09-18 08:52:27 UTC
The test errors seem unrelated to my changes
Co-authored-by: Adrian Wälchli <[email protected]>
Well, these changes break everything if I am already using
can you just make a PR if you want to work on the fix directly, and reference the problem there... :]
I've created a PR already
@Vozf Indeed this use-case was broken by my PR, sorry about that. |
What does this PR do?
Hello, I'm working for Comet.ml and we got reports that the Comet Logger wasn't working correctly in distributed mode. I was happy to see that the main issue was solved a few days ago: https://github.com/PyTorchLightning/pytorch-lightning/pull/2518/files#diff-a79ed8980a01d44db3ea399541407142. I iterated on that fix to improve the behavior with DDP by saving the experiment ID when pickling, so we can use the same experiment ID when recreating it, and by not creating an Experiment object when setting the experiment name or accessing the Logger version.
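One way to carry an experiment ID across pickling, as the description outlines, is to drop the live experiment object from the pickled state and keep only its ID. A rough sketch under those assumptions, not the PR's exact code:

```python
import pickle


class PicklableLoggerSketch:
    def __init__(self):
        self._experiment = None
        self._experiment_key = None

    def _create_experiment(self, key=None):
        # Stand-in for comet_ml Experiment / ExistingExperiment creation.
        self._experiment = object()
        self._experiment_key = key or "exp-123"
        return self._experiment

    def __getstate__(self):
        state = self.__dict__.copy()
        # The live experiment holds network connections and threads and
        # cannot be pickled; keep only its key so DDP sub-processes can
        # resume the same experiment instead of creating a new one.
        state["_experiment"] = None
        return state
```

After unpickling in a sub-process, the logger can lazily recreate an experiment from `_experiment_key`, continuing the same run on the server.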