
EvalResult doesn't do mean_of_gpus if using TensorMetric #2795

Closed
xiadingZ opened this issue Aug 2, 2020 · 2 comments
Labels: bug (Something isn't working), help wanted (Open to be worked on)

Comments

xiadingZ commented Aug 2, 2020

I want to use the new AccuracyMetric. It can automatically sync in DDP, but it doesn't divide by world_size. In manual mode I can divide by world_size by hand in validation_epoch_end, but if I use EvalResult, how do I do this? It only takes the mean across batches, not across GPUs.
This is the original code:

    def validation_epoch_end(self, outputs):
        # Average the per-batch values returned from validation_step
        avg_loss = torch.stack([x['val_loss'] for x in outputs]).mean()
        avg_acc = torch.stack([x['acc'] for x in outputs]).mean()
        # The DDP sync sums the metric across GPUs,
        # so divide by world_size by hand
        avg_acc = avg_acc / self.trainer.world_size
        return {'val_loss': avg_loss, 'val_acc': avg_acc}
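
With EvalResult, the equivalent step would look roughly like this (a minimal sketch assuming the PL 0.9-era EvalResult API; self.accuracy is an illustrative metric attribute):

    # assumed imports: import pytorch_lightning as pl
    #                  import torch.nn.functional as F
    def validation_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        loss = F.cross_entropy(logits, y)
        acc = self.accuracy(logits, y)  # a TensorMetric, e.g. Accuracy
        result = pl.EvalResult(checkpoint_on=loss)
        # EvalResult only averages logged values across batches;
        # the metric's DDP sync still sums across GPUs
        result.log('val_loss', loss)
        result.log('val_acc', acc)
        return result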

xiadingZ added the question (Further information is requested) label on Aug 2, 2020
Borda added the bug and help wanted labels and removed the question label on Aug 2, 2020

Borda commented Aug 2, 2020

@SkafteNicki @justusschock mind having a look?

SkafteNicki commented

@xiadingZ this was solved in a recent PR (#2568). You can now set reduce_op='avg' when you construct the metric, and it will calculate the mean instead of the sum.
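
For reference, a minimal sketch of that fix, assuming the PL 0.9-era import path and constructor signature for the built-in Accuracy metric (MyModel is illustrative):

    import pytorch_lightning as pl
    from pytorch_lightning.metrics.classification import Accuracy

    class MyModel(pl.LightningModule):
        def __init__(self):
            super().__init__()
            # reduce_op='avg': the DDP sync averages the metric across
            # processes instead of summing it, so the manual division by
            # trainer.world_size in validation_epoch_end is unnecessary
            self.accuracy = Accuracy(reduce_op='avg')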
