
EvalResult doesn't do mean_of_gpus if using TensorMetric #2795

Closed
xiadingZ opened this issue Aug 2, 2020 · 2 comments
Labels: bug (Something isn't working), help wanted (Open to be worked on)

Comments

xiadingZ commented Aug 2, 2020

I want to use the new AccuracyMetric. It can automatically sync in DDP, but it doesn't divide by world_size. In manual mode I can divide by world_size by hand in validation_epoch_end, but if I use EvalResult, how do I do this? It only takes the mean across batches, not across GPUs.
This is the original code:

    def validation_epoch_end(self, outputs):
        # Average the per-batch values returned from validation_step
        avg_loss = torch.stack([x['val_loss'] for x in outputs]).mean()
        avg_acc = torch.stack([x['acc'] for x in outputs]).mean()
        # The DDP sync sums the metric across GPUs,
        # so divide by world_size by hand
        avg_acc = avg_acc / self.trainer.world_size
        return {'val_loss': avg_loss, 'val_acc': avg_acc}
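
With EvalResult, the equivalent step would look roughly like this (a minimal sketch assuming the PL 0.9-era EvalResult API; self.accuracy is an illustrative metric attribute):

    # assumed imports: import pytorch_lightning as pl
    #                  import torch.nn.functional as F
    def validation_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        loss = F.cross_entropy(logits, y)
        acc = self.accuracy(logits, y)  # a TensorMetric, e.g. Accuracy
        result = pl.EvalResult(checkpoint_on=loss)
        # EvalResult only averages logged values across batches;
        # the metric's DDP sync still sums across GPUs
        result.log('val_loss', loss)
        result.log('val_acc', acc)
        return result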

xiadingZ added the question (Further information is requested) label on Aug 2, 2020
Borda added the bug and help wanted labels and removed the question label on Aug 2, 2020

Borda commented Aug 2, 2020

@SkafteNicki @justusschock mind having a look?

SkafteNicki commented

@xiadingZ this was solved in a recent PR (#2568). You can now set reduce_op='avg' when you construct the metric, and it will calculate the mean instead of the sum.
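
For reference, a minimal sketch of that fix, assuming the PL 0.9-era import path and constructor signature for the built-in Accuracy metric (MyModel is illustrative):

    import pytorch_lightning as pl
    from pytorch_lightning.metrics.classification import Accuracy

    class MyModel(pl.LightningModule):
        def __init__(self):
            super().__init__()
            # reduce_op='avg': the DDP sync averages the metric across
            # processes instead of summing it, so the manual division by
            # trainer.world_size in validation_epoch_end is unnecessary
            self.accuracy = Accuracy(reduce_op='avg')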
