ReduceLROnPlateau does not work with multiple schedulers #1037
Comments
Good catch!
If PR #941 is merged, it will also solve this problem.
@darwinkim @SkafteNicki please coordinate your efforts on #941 and #1039 so none of your work gets wasted... ;] maybe make #1039 a follow-up of #941?
#941 seems to solve it, and adds many related features.
With #941 being merged now, this can be closed.
I'm not sure if this is solved. I just tried 2 schedulers with 1 optimizer and it didn't work. When I remove my linear scheduler,
@Laksh1997 could you provide more info: which version of Lightning are you using, and what does your configure_optimizers look like?
Latest stable (0.8.1) Using
Okay, but what does your configure_optimizers return?
It returns the following:
Here is my code more specifically:

import copy

import torch
from torch import nn
from torch.optim import Optimizer


def build_scheduler_params(scheduler_name, param_set, optimizer: Optimizer):
    """Parses scheduler params"""
    pl_scheduler_params = {
        "monitor": param_set.pop("monitor", "val_loss"),
        "interval": param_set.pop("interval", "epoch"),
        "frequency": param_set.pop("frequency", 1),
    }
    if hasattr(torch.optim.lr_scheduler, scheduler_name):
        scheduler_class = getattr(torch.optim.lr_scheduler, scheduler_name)
        scheduler = scheduler_class(optimizer, **param_set)
    elif hasattr(lr_schedulers, scheduler_name):
        scheduler_class = getattr(lr_schedulers, scheduler_name)
        scheduler = scheduler_class(optimizer, **param_set)
    else:
        raise ValueError(
            f"Scheduler: {scheduler_name} not available. "
            f"Schedulers available are: {get_available_schedulers()}"
        )
    pl_scheduler_params["scheduler"] = scheduler
    return pl_scheduler_params


def build_optimizer(model: nn.Module, config) -> Optimizer:
    """Makes optimizer from model and config"""
    optim_kwargs = copy.deepcopy(config.optimizer_kwargs)
    optim_class = get_optim_class(config.optimizer)
    optim_param_groups = build_optim_param_groups(model, optim_kwargs)
    check_valid_param_groups(optim_param_groups, model)
    if isinstance(optim_kwargs, list):
        optim_kwargs = optim_kwargs[0]
    optimizer = optim_class(optim_param_groups, **optim_kwargs)
    return optimizer


def build_schedulers(optimizer: Optimizer, config):
    """Makes schedulers from optimizer and config"""
    schedulers = []
    schedulers_names = config.schedulers
    schedulers_kwargs = config.schedulers_kwargs
    if schedulers_names is not None:
        assert len(schedulers_names) == len(schedulers_kwargs), (
            f"Need to have as many schedulers as scheduler param sets! "
            f"Got {len(schedulers_names)} schedulers and "
            f"{len(schedulers_kwargs)} scheduler param sets!"
        )
        for scheduler_name, param_set in zip(schedulers_names, schedulers_kwargs):
            pl_scheduler_params = build_scheduler_params(
                scheduler_name, param_set, optimizer
            )
            schedulers.append(pl_scheduler_params)
    return schedulers


def configure_optimizers(model: nn.Module, config) -> PL_EXPECTED_OUTPUT:
    """Configures PL optimizers and schedulers"""
    optimizer = build_optimizer(model, config)
    schedulers = build_schedulers(optimizer, config)
    return [optimizer], schedulers
@Laksh1997 I do not really see a problem in your code. After experimenting around with it, my best guess is that this is specific to your model/data. When you have two schedulers, and one is changing the learning rate every step, it is very possible that you do not reach a plateau, such that the ReduceLROnPlateau scheduler never reduces the learning rate.
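For reference, a minimal sketch of the setup being discussed, one optimizer shared by a per-step scheduler and a ReduceLROnPlateau, might look like the following inside a LightningModule. The optimizer, the warm-up lambda, the hyperparameters, and the val_loss key are illustrative assumptions, not taken from the reporter's code; verbose=True makes the plateau scheduler print whenever it actually lowers the learning rate, which helps tell "never wired up" apart from "never reached a plateau":

```python
import torch
from torch.optim.lr_scheduler import LambdaLR, ReduceLROnPlateau

def configure_optimizers(self):
    # One optimizer shared by both schedulers (assumed hyperparameters).
    optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)
    # Per-step scheduler: ramps the LR multiplier from ~0 to 1 over 1000 steps.
    warmup = {
        "scheduler": LambdaLR(optimizer, lr_lambda=lambda step: min(1.0, (step + 1) / 1000)),
        "interval": "step",   # stepped every training batch
        "frequency": 1,
    }
    # Epoch-wise plateau scheduler; prints when it actually cuts the LR.
    plateau = {
        "scheduler": ReduceLROnPlateau(optimizer, mode="min", patience=3, verbose=True),
        "interval": "epoch",  # stepped once per epoch
        "frequency": 1,
        "monitor": "val_loss",
    }
    return [optimizer], [warmup, plateau]
```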
🐛 Bug

PL seems to pull only one ReduceLROnPlateau scheduler and call it the reduce_lr_on_plateau_scheduler.
To Reproduce

One example would be adding two ReduceLROnPlateau schedulers to the GAN example.
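A rough sketch of that reproduction, assuming the GAN example's generator/discriminator attributes and two illustrative monitored keys (none of these names are taken from the actual example):

```python
from torch.optim import Adam
from torch.optim.lr_scheduler import ReduceLROnPlateau

def configure_optimizers(self):
    # Two optimizers, as in the GAN example, each with its own plateau scheduler.
    opt_g = Adam(self.generator.parameters(), lr=2e-4)
    opt_d = Adam(self.discriminator.parameters(), lr=2e-4)
    sched_g = {"scheduler": ReduceLROnPlateau(opt_g), "monitor": "g_loss"}
    sched_d = {"scheduler": ReduceLROnPlateau(opt_d), "monitor": "d_loss"}
    # As reported above, only one of the two plateau schedulers ends up
    # being treated as a ReduceLROnPlateau internally.
    return [opt_g, opt_d], [sched_g, sched_d]
```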
Expected behavior

All ReduceLROnPlateau schedulers should be recognized as such. A natural solution might be to check whether a scheduler is an instance of ReduceLROnPlateau for each scheduler in the loop that runs lr_scheduler.step(), instead of having a separate self.reduce_lr_on_plateau_scheduler.
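A sketch of that idea; update_learning_rates and self.lr_schedulers are placeholder names, not the actual Trainer internals:

```python
from torch.optim.lr_scheduler import ReduceLROnPlateau

def update_learning_rates(self, monitor_val):
    # Iterate over every configured scheduler instead of keeping a single
    # dedicated reduce_lr_on_plateau_scheduler attribute.
    for entry in self.lr_schedulers:
        scheduler = entry["scheduler"]
        if isinstance(scheduler, ReduceLROnPlateau):
            # Plateau schedulers need the monitored metric to decide whether to step.
            scheduler.step(monitor_val)
        else:
            scheduler.step()
```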