
[wip] support save_hyperparameters after mutation #3622

Closed
wants to merge 1 commit

Conversation

tbenst

@tbenst tbenst commented Sep 23, 2020

What does this PR do?

Here's a WIP pull request that closes #3494 and closes #3522.

I had hoped this would be a simple one-line fix, something like getattr(self, x) if hasattr(self, x) else init_args[x], and it nearly was; but upon running the tests I found that pytorch-lightning supports a diverse array of hparams types, including OmegaConf and Namespace, which greatly complicates things.

Please look at the two new tests, test_hparams_inplace_mutation and test_hparams_after_mutation, to understand the intent.

This is my first pull request, so I greatly appreciate any help or feedback!
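The intended behavior can be sketched with a small, self-contained example (hypothetical names such as collect_hparams; this is not the actual Lightning implementation): prefer the live attribute value over the captured __init__ argument, so hparams saved after mutation reflect the updated values.

```python
# Hypothetical sketch of the "one line fix" idea, not Lightning's real code.
class Model:
    def __init__(self, lr=0.001, hidden_dim=128):
        # capture the raw init args, analogous to what save_hyperparameters() sees
        self._init_args = {"lr": lr, "hidden_dim": hidden_dim}
        self.lr = lr
        self.hidden_dim = hidden_dim

    def collect_hparams(self):
        # the proposed rule: getattr(self, x) if hasattr(self, x) else init_args[x]
        return {x: getattr(self, x) if hasattr(self, x) else v
                for x, v in self._init_args.items()}

m = Model()
m.lr = 0.01  # mutated after __init__, e.g. by a learning-rate finder
print(m.collect_hparams())  # {'lr': 0.01, 'hidden_dim': 128}
```

The complication noted above is that real hparams containers (OmegaConf configs, Namespace objects, dicts) each need their own lookup and update logic, so the one-liner does not survive contact with the test suite.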

Status

6 tests fail, including one that I wrote!

============================= short test summary info ==============================
FAILED tests/models/test_hparams.py::test_namespace_hparams[SaveHparamsModel] - A...
FAILED tests/models/test_hparams.py::test_dict_hparams[SaveHparamsModel] - Attrib...
FAILED tests/models/test_hparams.py::test_omega_conf_hparams[SaveHparamsModel] - ...
FAILED tests/models/test_hparams.py::test_single_config_models_fail[AnotherArgModel-config0]
FAILED tests/models/test_hparams.py::test_single_config_models_fail[OtherArgsModel-config1]
FAILED tests/models/test_hparams.py::test_hparams_after_mutation - AttributeError...

The test I wrote that fails is surprising to me and appears to be a case of Black Magic, as only the final assert fails.

Before submitting

  • Was this discussed/approved via a GitHub issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you to create a separate PR for every change.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?
  • Did you verify new and existing tests pass locally with your changes?
  • If you made a notable change (that affects users), did you update the CHANGELOG?

PR review

@edenlightning @Borda

@pep8speaks

pep8speaks commented Sep 23, 2020

Hello @tbenst! Thanks for updating this PR.

Line 1343:44: E231 missing whitespace after ':'
Line 1346:24: E231 missing whitespace after ','
Line 1347:39: W291 trailing whitespace
Line 1404:75: W291 trailing whitespace
Line 1406:1: W293 blank line contains whitespace
Line 1410:1: W293 blank line contains whitespace
Line 1419:1: W293 blank line contains whitespace

Line 545:1: E303 too many blank lines (3)
Line 555:1: E302 expected 2 blank lines, found 1
Line 566:1: E302 expected 2 blank lines, found 1

Comment last updated at 2020-10-07 09:53:12 UTC

@mergify mergify bot requested a review from a team September 23, 2020 04:49
@mergify
Contributor

mergify bot commented Oct 5, 2020

This pull request is now in conflict... :(

@Borda Borda added bug Something isn't working feature Is an improvement or enhancement labels Oct 5, 2020
@Borda Borda added this to the 1.0 milestone Oct 5, 2020
@Borda Borda self-assigned this Oct 5, 2020
@Borda Borda force-pushed the hyperparam_mutation branch from e477877 to 8ce4d2f Compare October 7, 2020 09:53
@edenlightning edenlightning modified the milestones: 1.0, 1.1 Oct 7, 2020
@mergify
Contributor

mergify bot commented Oct 10, 2020

This pull request is now in conflict... :(

@Borda Borda modified the milestones: 1.1, 1.0.x Oct 20, 2020
@stale

stale bot commented Nov 3, 2020

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. If you need further help see our docs: https://pytorch-lightning.readthedocs.io/en/latest/CONTRIBUTING.html#pull-request or ask the assistance of a core contributor here or on Slack. Thank you for your contributions.

@stale stale bot added the won't fix This will not be worked on label Nov 3, 2020
@Borda
Member

Borda commented Nov 3, 2020

@tbenst is it ready to review/finish?

@stale stale bot removed the won't fix This will not be worked on label Nov 3, 2020
@awaelchli
Contributor

@tbenst
Could you check my comment in #4316 (comment) regarding reproducibility? I think it is fine to add the possibility of calling the save function on updated values; however, the user should be aware of the side effects it can cause.
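To make the reproducibility concern concrete, here is a minimal hypothetical sketch (plain Python, not Lightning code): hparams saved after mutation restore the tuned state, not the values the original run was launched with, so a "re-run from checkpoint" no longer reproduces the original launch.

```python
# Sketch of the side effect: which hparams snapshot you save determines
# what a reload reproduces.
saved_at_init = {"lr": 0.001}

class Model:
    def __init__(self, lr):
        self.lr = lr

m = Model(**saved_at_init)
m.lr = 0.01                         # mutated during training, e.g. by a tuner
saved_after_mutation = {"lr": m.lr}

# Re-instantiating from each snapshot gives different starting points:
rerun_init = Model(**saved_at_init)            # reproduces the original launch
rerun_mutated = Model(**saved_after_mutation)  # reproduces the tuned state
print(rerun_init.lr, rerun_mutated.lr)  # 0.001 0.01
```

Neither behavior is wrong; they just answer different questions, which is why the user should opt in knowingly.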

@edenlightning edenlightning modified the milestones: 1.0.x, 1.0.7 Nov 10, 2020
@Borda Borda modified the milestones: 1.0.7, 1.0.x Nov 11, 2020
@edenlightning edenlightning modified the milestones: 1.0.x, 1.0.7 Nov 13, 2020
@Borda Borda modified the milestones: 1.0.7, 1.0.x Nov 13, 2020
@Borda Borda added the priority: 2 Low priority task label Nov 17, 2020
@Borda Borda modified the milestones: 1.0.x, 1.1 Nov 17, 2020
@Borda
Member

Borda commented Feb 8, 2021

@tbenst there won't be any more releases in 1.1.x, so I have changed your destination branch to the 1.2 feature branch.
Please note there is no need to close this PR; just rebase on the target branch, or, perhaps easier, cherry-pick the commit (@kaushikb11 will help you if needed).

@edenlightning edenlightning removed this from the 1.2 milestone Feb 9, 2021
Base automatically changed from release/1.2-dev to master February 11, 2021 14:31
@tchaton
Contributor

tchaton commented Feb 26, 2021

Dear @tbenst,

Would you mind resolving the conflicts?

Best,
T.C

@Borda
Member

Borda commented May 11, 2021

@tbenst is this still WIP or ready to review?

@tbenst
Author

tbenst commented May 25, 2021

Sorry for my long delay. I'm finally back working on a project using pytorch-lightning, so I'm happy to update this PR.

I've updated to the latest version, but I'm having trouble getting auto_lr_find to work at all:

error trace
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~/code/babelfish/notebooks/lstm_denoise.py in 
     261                     #  default_root_dir=save_dir
     262                      )
---> 263 trainer.tune(model)
     264 # trainer.fit(model, torch.utils.data.DataLoader(dataset, batch_size=None))
     265 trainer.fit(model, dataloader)

~/opt/anaconda3/envs/bf/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py in tune(self, model, train_dataloader, val_dataloaders, datamodule)
    813 
    814         """
--> 815         self.tuner.tune(model, train_dataloader, val_dataloaders, datamodule)
    816 
    817     def call_setup_hook(self, model):

~/opt/anaconda3/envs/bf/lib/python3.8/site-packages/pytorch_lightning/tuner/tuning.py in tune(self, model, train_dataloader, val_dataloaders, datamodule)
     35     def tune(self, model, train_dataloader, val_dataloaders, datamodule):
     36         # setup data, etc...
---> 37         self.trainer.train_loop.setup_fit(model, train_dataloader, val_dataloaders, datamodule)
     38 
     39         # hook

~/opt/anaconda3/envs/bf/lib/python3.8/site-packages/pytorch_lightning/trainer/training_loop.py in setup_fit(self, model, train_dataloader, val_dataloaders, datamodule)
    110 
    111         # check that model is configured correctly
--> 112         self.trainer.config_validator.verify_loop_configurations(model)
    113 
    114     def setup_training(self, model: LightningModule):

~/opt/anaconda3/envs/bf/lib/python3.8/site-packages/pytorch_lightning/trainer/configuration_validator.py in verify_loop_configurations(self, model)
     33         """
     34         if not self.trainer.testing:
---> 35             self.__verify_train_loop_configuration(model)
     36             self.__verify_eval_loop_configuration(model, 'validation')
     37         else:

~/opt/anaconda3/envs/bf/lib/python3.8/site-packages/pytorch_lightning/trainer/configuration_validator.py in __verify_train_loop_configuration(self, model)
     53         # verify model has a train dataloader
     54         # -----------------------------------
---> 55         has_train_dataloader = is_overridden('train_dataloader', model)
     56         if not has_train_dataloader:
     57             raise MisconfigurationException(

~/opt/anaconda3/envs/bf/lib/python3.8/site-packages/pytorch_lightning/utilities/model_utils.py in is_overridden(method_name, model)
     40         is_overridden = instance_attr.patch_loader_code != str(super_attr.__code__)
     41     else:
---> 42         is_overridden = instance_attr.__code__ is not super_attr.__code__
     43     return is_overridden

AttributeError: 'DataLoader' object has no attribute '__code__'

Edit: I realize it may not be clear that the original motivation for this PR is auto_lr_find: it is a good example of hparams being modified post-init while we still want to save the mutated values.
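For what it's worth, the AttributeError in the trace above can be reproduced without Lightning, under one assumption: a DataLoader object ended up assigned to the model's train_dataloader attribute, shadowing the method. is_overridden then compares __code__ attributes, which plain objects lack. A minimal sketch:

```python
# Minimal reproduction of the failure mode (assumption: an object was
# assigned over a method on the instance).
class Base:
    def train_dataloader(self):
        raise NotImplementedError

class Model(Base):
    pass

m = Model()
m.train_dataloader = [1, 2, 3]  # a plain object shadows the inherited method

def is_overridden(method_name, model, parent=Base):
    instance_attr = getattr(model, method_name)
    super_attr = getattr(parent, method_name)
    # raises AttributeError when instance_attr is not a function/method,
    # because only code objects carry __code__
    return instance_attr.__code__ is not super_attr.__code__

try:
    is_overridden("train_dataloader", m)
except AttributeError as e:
    print(e)  # e.g. "'list' object has no attribute '__code__'"
```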

@codecov

codecov bot commented May 25, 2021

Codecov Report

Merging #3622 (8ba6304) into master (9c415d2) will increase coverage by 6%.
The diff coverage is n/a.

❗ Current head 8ba6304 differs from pull request most recent head 8ce4d2f. Consider uploading reports for the commit 8ce4d2f to get more accurate results.

@@           Coverage Diff            @@
##           master   #3622     +/-   ##
========================================
+ Coverage      86%     92%     +6%     
========================================
  Files         113     199     +86     
  Lines        8551   12957   +4406     
========================================
+ Hits         7361   11963   +4602     
+ Misses       1190     994    -196     

@carmocca
Contributor

Hi @tbenst! Thanks for looking into this PR again.

I have not been able to reproduce any of those issues using our bug report model: https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pl_examples/bug_report_model.py

Please give it a try and open separate issues for any that you can reproduce; it's easier for us to track them that way.

@edenlightning edenlightning added this to the v1.3.x milestone Jul 1, 2021
@Borda Borda modified the milestones: v1.3.x, v1.4 Jul 6, 2021
@edenlightning edenlightning modified the milestones: v1.4, v1.3.x Jul 6, 2021
@edenlightning edenlightning modified the milestones: v1.3.x, v1.4.x Jul 21, 2021
@carmocca carmocca removed this from the v1.4.x milestone Aug 11, 2021
@carmocca carmocca marked this pull request as draft August 11, 2021 21:37
@cowwoc
Contributor

cowwoc commented Aug 23, 2021

What is the status of this PR?

@tchaton tchaton added this to the v1.6 milestone Nov 1, 2021
@carmocca carmocca removed this from the 1.6 milestone Mar 28, 2022
@carmocca
Contributor

carmocca commented Nov 8, 2022

Closing this PR as it has become stale. We would agree to merge this if somebody wants to take over the authorship. Thank you for your time!

@carmocca carmocca closed this Nov 8, 2022
Labels
bug Something isn't working feature Is an improvement or enhancement has conflicts priority: 2 Low priority task