Loss value in the progress bar is wrong when accumulate_grad_batches > 1 #2635

Closed
ibeltagy opened this issue Jul 17, 2020 · 1 comment · Fixed by #2738
Labels
bug, duplicate, help wanted

Comments

ibeltagy (Contributor) commented Jul 17, 2020

🐛 Bug

The loss value reported in the progress bar is the_correct_loss_value / accumulate_grad_batches, so it is wrong whenever accumulate_grad_batches > 1.

This happens because the loss is first divided by accumulate_grad_batches, and the running loss is then computed as the mean of these already-divided losses.

To fix this, either remove the first line (no division by accumulate_grad_batches) or replace mean with sum in the second line, where the running loss is computed.
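The arithmetic behind the bug and the two proposed fixes can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's code: the function names and the sample losses are hypothetical, and the running loss is modelled as covering exactly one accumulation window.

```python
from statistics import mean


def scaled_losses(raw_losses, accumulate_grad_batches):
    # Each micro-batch loss is divided so that the accumulated gradient
    # matches the gradient of one large batch.
    return [loss / accumulate_grad_batches for loss in raw_losses]


def displayed_buggy(raw_losses, accumulate_grad_batches):
    # Behaviour described in the report: mean of already-divided losses.
    return mean(scaled_losses(raw_losses, accumulate_grad_batches))


def displayed_without_division(raw_losses):
    # First proposed fix: track the undivided loss and keep the mean.
    return mean(raw_losses)


def displayed_with_sum(raw_losses, accumulate_grad_batches):
    # Second proposed fix: keep the division, but sum instead of mean.
    # Assumes the sum runs over exactly one accumulation window.
    return sum(scaled_losses(raw_losses, accumulate_grad_batches))


# Hypothetical losses of the two micro-batches in one accumulation window.
raw = [2.0, 4.0]
correct = mean(raw)

print(displayed_buggy(raw, 2))        # correct / 2
print(displayed_without_division(raw))
print(displayed_with_sum(raw, 2))
```

Both fixes agree with the true mean loss in this sketch. The sum variant only matches when the window it sums over has exactly accumulate_grad_batches entries; a longer smoothing window would need a different normalisation.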

To Reproduce

  1. Train any model with accumulate_grad_batches=1 and note the loss value reported in the progress bar.
  2. Train the same model with accumulate_grad_batches=2 and half the batch size. The loss value in the progress bar is now half the value from step 1.
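The two steps can be mimicked without a training run. The sketch below is a simulation rather than a call into pytorch-lightning: progress_bar_loss is a hypothetical helper that models the reported behaviour (per-batch mean loss divided by accumulate_grad_batches, then averaged), and the per-sample losses are made up.

```python
from statistics import mean


def progress_bar_loss(per_sample_losses, batch_size, accumulate_grad_batches):
    # Split the samples into consecutive batches of the given size.
    batches = [
        per_sample_losses[i:i + batch_size]
        for i in range(0, len(per_sample_losses), batch_size)
    ]
    # Modelled behaviour: each batch loss is divided before being averaged.
    scaled = [mean(batch) / accumulate_grad_batches for batch in batches]
    return mean(scaled)


# Made-up per-sample losses, identical for both runs.
samples = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

step1 = progress_bar_loss(samples, batch_size=4, accumulate_grad_batches=1)
step2 = progress_bar_loss(samples, batch_size=2, accumulate_grad_batches=2)

print(step1, step2)
```

The effective batch size is the same in both calls (4 samples per optimizer step), yet the simulated progress bar value in the second call is half of the first.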

Expected Behaviour

The loss reported in steps (1) and (2) should be the same.

Environment

pytorch-lightning==v0.8.5

ibeltagy added the bug and help wanted labels Jul 17, 2020
ibeltagy added a commit to ibeltagy/pytorch-lightning that referenced this issue Jul 17, 2020
awaelchli (Contributor) commented

Duplicate of #2569?

awaelchli added the duplicate label Jul 19, 2020
ibeltagy added a commit to ibeltagy/pytorch-lightning that referenced this issue Jul 28, 2020
williamFalcon pushed a commit that referenced this issue Jul 28, 2020