Handle KeyboardInterrupt during training #2079

Closed
moi90 opened this issue Jun 5, 2020 · 5 comments · Fixed by #2134
Labels
feature (Is an improvement or enhancement), help wanted (Open to be worked on)

Comments

Contributor

moi90 commented Jun 5, 2020

🚀 Feature

It should be possible to examine the stack if the training is interrupted manually.

Motivation

In my case, the program hangs at some point and I have to cancel it manually. Because Lightning catches KeyboardInterrupt, I don't get to know where the program was hanging.

Alternatives

  1. Re-raise KeyboardInterrupt. This would break the current behavior, though it could be made configurable.
  2. Save result of sys.exc_info() for later examination. This could introduce memory issues like memory leaks or circular references.
  3. Call a handler. The users could then call sys.exc_info() in the handler. This would make memory issues less likely.

I will prepare a pull request for the third option.
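
To make the third option concrete, here is a minimal sketch in plain Python. It does not use Lightning's actual code: `run_training`, `report_interrupt`, and `flaky_step` are hypothetical names made up for illustration. The loop still swallows the KeyboardInterrupt, but it calls a user-supplied handler first, and the handler reads sys.exc_info() while the exception is still being handled.

```python
import sys
import traceback


def run_training(step_fn, num_steps, on_keyboard_interrupt=None):
    """Hypothetical stand-in for a training loop that swallows Ctrl-C."""
    completed = 0
    try:
        for step in range(num_steps):
            step_fn(step)
            completed += 1
    except KeyboardInterrupt:
        # The handler runs inside the except block, so sys.exc_info()
        # still returns the live (type, value, traceback) triple.
        if on_keyboard_interrupt is not None:
            on_keyboard_interrupt()
    return completed


captured = []


def report_interrupt():
    exc_type, exc_value, tb = sys.exc_info()
    # Store only formatted strings, not the traceback object itself,
    # so no stack frames are kept alive after the handler returns.
    captured.extend(traceback.format_exception(exc_type, exc_value, tb))


def flaky_step(step):
    if step == 2:
        raise KeyboardInterrupt  # simulates the user pressing Ctrl-C


completed = run_training(flaky_step, 5, report_interrupt)
print("".join(captured))
```

The printed traceback names the function that was running when the interrupt arrived, which is the information that is lost when the exception is silently caught.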

moi90 added the feature and help wanted labels Jun 5, 2020
Contributor Author

moi90 commented Jun 5, 2020

The least invasive change would be to save the result of sys.exc_info() for later examination.
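
For context on the memory concern with this approach, the sketch below shows what happens when the result of sys.exc_info() is kept around. The names (`Batch`, `interrupted_step`, `run_and_store`) are hypothetical and only serve the illustration. The saved traceback references the stack frames, and the frames reference their local variables, so a local object stays alive for as long as the saved tuple does.

```python
import gc
import sys
import weakref


class Batch:
    """Hypothetical large object held in a local variable."""


def interrupted_step(refs):
    batch = Batch()
    refs.append(weakref.ref(batch))  # lets us observe batch's lifetime
    raise KeyboardInterrupt  # simulates the user pressing Ctrl-C


def run_and_store():
    refs = []
    try:
        interrupted_step(refs)
    except KeyboardInterrupt:
        # Saving the triple keeps the traceback, and through it the
        # frames of run_and_store and interrupted_step, alive.
        saved = sys.exc_info()
    return saved, refs[0]


saved, batch_ref = run_and_store()
gc.collect()
alive_while_saved = batch_ref() is not None

# run_and_store's frame holds `saved` as a local, and `saved` holds the
# traceback that points back to that frame, so after dropping our own
# reference a garbage collection pass is needed to free everything.
del saved
gc.collect()
alive_after_release = batch_ref() is not None

print(alive_while_saved, alive_after_release)
```

With a handler that only formats or logs the traceback and does not store it, this retention does not occur in the first place.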

moi90 changed the title from "Re-raise KeyboardInterrupt" to "Handly KeyboardInterrupt during training" Jun 9, 2020
moi90 changed the title from "Handly KeyboardInterrupt during training" to "Handle KeyboardInterrupt during training" Jun 9, 2020
moi90 added a commit to moi90/pytorch-lightning that referenced this issue Jun 9, 2020
moi90 added a commit to moi90/pytorch-lightning that referenced this issue Jun 9, 2020
Borda pushed a commit to moi90/pytorch-lightning that referenced this issue Jun 9, 2020
Member

Borda commented Jun 10, 2020

> The least invasive change would be to save the result of sys.exc_info() for later examination.

Where would you save it, to the logs? Mind sending a PR?

Contributor Author

moi90 commented Jun 11, 2020

I already did, see above ;) As I said, using a callback narrows the possibility of memory issues.

Borda pushed a commit to moi90/pytorch-lightning that referenced this issue Jun 11, 2020
Borda pushed a commit to moi90/pytorch-lightning that referenced this issue Jun 14, 2020
Borda added a commit that referenced this issue Jun 15, 2020
* Handle KeyboardInterrupt during training

Fixes #2079.

* chlog

* Fix whitespace

* Update callback_hook.py

* Update base.py

* Update training_loop.py

* Update test_trainer.py

* Update CHANGELOG.md

Co-authored-by: Adrian Wälchli <[email protected]>

* Update CHANGELOG.md

* on_keyboard_interrupt

Co-authored-by: Jirka <[email protected]>
Co-authored-by: William Falcon <[email protected]>
Co-authored-by: Jirka Borovec <[email protected]>
Co-authored-by: Adrian Wälchli <[email protected]>
Contributor

SurajDonthi commented Sep 7, 2020

Is this feature up in the latest release - 0.9.0?

I'm looking to automatically perform testing on Keyboard Interrupt but haven't been successful!

Contributor Author

moi90 commented Sep 10, 2020

git tag --contains fd1693e
0.8.0
0.8.1
0.8.2
0.8.3
0.8.4
0.8.5
0.9.0
0.9.1rc1

So: Yes.

3 participants