Loss value in the progress bar is wrong when accumulate_grad_batches > 1
#2635
Labels: bug, duplicate, help wanted
🐛 Bug
The loss value reported in the progress bar is `the_correct_loss_value / accumulate_grad_batches`, so the displayed value is wrong when `accumulate_grad_batches > 1`.

This happens because the loss is first divided by `accumulate_grad_batches` (for gradient accumulation), and the running loss shown in the progress bar is then computed as the `mean` of these already-scaled losses. To fix this, either drop the division by `accumulate_grad_batches` when recording the loss, or replace the `mean` with a `sum` over each accumulation window.
To Reproduce

1. Train with `accumulate_grad_batches=1` and note the loss value reported in the progress bar.
2. Train with `accumulate_grad_batches=2` and half the batch size; the loss value in the progress bar will now be half the value from step 1.
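A hedged repro sketch of these two runs (written against the current-style Lightning API; the issue targets `pytorch-lightning==0.8.5`, where `training_step` returned a dict, but the `Trainer(accumulate_grad_batches=...)` flag is the same):

```python
import torch
import pytorch_lightning as pl
from torch.utils.data import DataLoader, TensorDataset

class ToyModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.mse_loss(self.layer(x), y)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.01)

data = TensorDataset(torch.randn(256, 32), torch.randn(256, 1))

# Step 1: batch size 64, no accumulation.
pl.Trainer(max_epochs=1).fit(ToyModel(), DataLoader(data, batch_size=64))

# Step 2: half the batch size, accumulate over 2 batches (same effective
# batch size). The progress-bar loss shows roughly half the value of step 1.
pl.Trainer(max_epochs=1, accumulate_grad_batches=2).fit(
    ToyModel(), DataLoader(data, batch_size=32)
)
```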
Expected Behaviour

The loss reported in the progress bar should be the same in steps (1) and (2).
Environment
pytorch-lightning==v0.8.5