Val_loss not available #321

Closed
Menion93 opened this issue Oct 7, 2019 · 2 comments
Labels
bug Something isn't working

Comments

Menion93 commented Oct 7, 2019

Describe the bug
When I train my network, which has validation steps defined similarly to the doc example,

def validation_step(self, batch, batch_nb):
    x = torch.squeeze(batch['x'], dim=0).float()
    y = torch.squeeze(batch['y'], dim=0).long()

    output = self.forward(x)
    return {'batch_val_loss': self.loss(output, y),
            'batch_val_acc': accuracy(output, y)}

def validation_end(self, outputs):
    avg_loss = torch.stack([x['batch_val_loss'] for x in outputs]).mean()
    avg_acc = torch.stack([x['batch_val_acc'] for x in outputs]).mean()

    return {'val_loss': avg_loss, 'val_acc': avg_acc}

with my custom EarlyStopping callback

early_stop_callback = EarlyStopping(monitor='val_loss', patience=5)

tt_logger = TestTubeLogger(
    save_dir=log_dir,
    name="default",
    debug=False,
    create_git_tag=False
)

trainer = Trainer(logger=tt_logger,
                  row_log_interval=10,
                  checkpoint_callback=checkpoint_callback,
                  early_stop_callback=early_stop_callback,
                  gradient_clip_val=0.5,
                  gpus=gpus,
                  check_val_every_n_epoch=1,
                  max_nb_epochs=99999,
                  train_percent_check=train_frac,
                  log_save_interval=100,
                  )

the program cannot see my validation metrics:

Early stopping conditioned on metric `val_loss` which is not available. Available metrics are: loss,epoch,batch_nb,v_nb <class 'RuntimeWarning'>

This behaviour did not happen in a previous release, which I was running on Windows (I am now on macOS); but that earlier version did not include the TestTubeLogger.

Desktop (please complete the following information):

  • OS: macOS
  • Version: latest
Menion93 added the bug label Oct 7, 2019
williamFalcon (Contributor) commented Oct 7, 2019

The early-stop metrics come from the “progress_bar” entry.

So, add a “progress_bar” key and put val_loss in there.

I’ll update the docs.
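
For illustration, here is a minimal sketch of the callback side (the import path and the min_delta/mode arguments are assumptions, not taken from this thread). The point is that EarlyStopping can only monitor keys the model exposes to the callbacks, which here means 'val_loss' must appear inside the 'progress_bar' dict returned from validation_end:

from pytorch_lightning.callbacks import EarlyStopping

# 'val_loss' must be one of the keys returned under 'progress_bar'
# from validation_end; otherwise the callback emits the
# "not available" warning shown above.
early_stop_callback = EarlyStopping(
    monitor='val_loss',   # same key as in the 'progress_bar' dict
    min_delta=0.0,        # assumed: any improvement counts
    patience=5,
    mode='min',           # assumed: stop when val_loss stops decreasing
)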

Menion93 (Author) commented Oct 7, 2019

Thank you, this fixed my issue:

def validation_end(self, outputs):
    avg_loss = torch.stack([x['batch_val_loss'] for x in outputs]).mean()
    avg_acc = torch.stack([x['batch_val_acc'] for x in outputs]).mean()

    return {
        'val_loss': avg_loss,
        'val_acc': avg_acc,
        'progress_bar': {'val_loss': avg_loss, 'val_acc': avg_acc}
    }
