Logging on slurm stopped working #2317
Comments
Hi! Thanks for your contribution! Great first issue!
Hi, I think I'm having the same problem. Running locally, logs work correctly (I'm sending them to Comet), but not when I run on a cluster through slurm. Edit: Downgraded to 0.7.6 and it works.
I think this might be caused by how the rank id is set. I'm not totally sure, but it could have been introduced here: #2231
If you want a quick fix, just remove this line. (Dirty solution)
Fixed by #2339. Please run from master or 0.8.2 on June 25.
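To illustrate why a wrongly set rank id could silence logging, here is a minimal sketch of rank-zero-only logging under slurm. It is not the library's actual code: `global_rank`, `should_log`, and `log_metrics` are hypothetical helpers, and the only real names used are the `SLURM_PROCID` environment variable that slurm sets per task and the Python standard library.

```python
import os


def global_rank(env=None):
    """Hypothetical helper: derive the process rank from slurm's SLURM_PROCID."""
    env = os.environ if env is None else env
    # Outside slurm the variable is absent, so fall back to rank 0.
    return int(env.get("SLURM_PROCID", 0))


def should_log(env=None):
    """Only the rank-0 process writes logs and checkpoints."""
    return global_rank(env) == 0


def log_metrics(metrics, env=None):
    """Hypothetical logger call: returns what was logged, or None if skipped."""
    if not should_log(env):
        return None  # non-zero ranks silently skip logging
    return dict(metrics)


# A local run (no slurm variables) logs as expected.
local = log_metrics({"val_loss": 0.5}, env={})

# If every process ends up with a non-zero rank, nothing is ever logged.
on_cluster = log_metrics({"val_loss": 0.5}, env={"SLURM_PROCID": "3"})
```

With a gate like this, a run looks healthy locally but produces no logs or checkpoints on the cluster if no process is ever assigned rank 0, which would match the symptom reported here.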
🐛 Bug
Logging and checkpoint saving stopped working for me when I run experiments via the slurm system.
I am using "log" keys in the return values of training_epoch_end/validation_epoch_end. Version 0.7.6 works.
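For context, this is roughly what returning a "log" key from an epoch-end hook looks like. It is a minimal sketch, not the code from this report: it is written as a standalone function with plain floats instead of a method on a module, and the metric name `val_loss` is an assumption.

```python
def validation_epoch_end(outputs):
    """Aggregate per-batch results and request logging via the "log" key."""
    # outputs: list of dicts, one per validation batch (assumed shape)
    avg_loss = sum(o["val_loss"] for o in outputs) / len(outputs)
    # Values under "log" are the ones handed to the logger.
    return {"val_loss": avg_loss, "log": {"val_loss": avg_loss}}


result = validation_epoch_end([{"val_loss": 1.0}, {"val_loss": 3.0}])
```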
To Reproduce
Steps to reproduce the behaviour:
sbatch ...
Code sample
Expected behaviour
Environment