
Proper support for Pytorch SequentialLR Scheduler #10759

Open
marcm-ml opened this issue Nov 25, 2021 · 7 comments
Labels: 3rd party, bug, lr scheduler

Comments

marcm-ml (Contributor) commented Nov 25, 2021

🐛 Bug

Currently there is a bug when a ReduceLROnPlateau is used inside a SequentialLR, because Trainer._configure_schedulers has no proper support for this scheduler: an exception is raised since the monitor metric is not passed through to the wrapped ReduceLROnPlateau in TrainingEpochLoop._update_learning_rates.

Note: there is currently a separate bug where SequentialLR is missing an optimizer attribute (see pytorch/pytorch#67406 and #10278), but that should not interfere here.

To Reproduce

Run any Lightning model with the Trainer, using a scheduler setup like:

def configure_optimizers(self):
    optimizer = torch.optim.Adam(self.parameters(), lr=0.01)
    s1 = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer)
    s2 = torch.optim.lr_scheduler.ConstantLR(optimizer)
    scheduler = torch.optim.lr_scheduler.SequentialLR(
        optimizer,
        schedulers=[s1, s2],
        milestones=[2],
    )
    scheduler.optimizer = optimizer  # dirty fix for the missing-attribute bug in SequentialLR
    return {
        "optimizer": optimizer,
        "lr_scheduler": scheduler,
        "monitor": "loss",
    }

Expected behavior

The monitor value should be passed to the underlying ReduceLROnPlateau scheduler.

This is definitely tricky to achieve, as the current implementation assumes a fixed scheduler setup for the entire training run: multiple schedulers are allowed, but if a scheduler changes midway through training, this only works as long as it is not a ReduceLROnPlateau.

Environment

  • PyTorch Lightning Version (e.g., 1.5.0): 1.5.3
  • PyTorch Version (e.g., 1.10): 1.10
  • Python version (e.g., 3.9): 3.9
  • OS (e.g., Linux): Windows
  • CUDA/cuDNN version: 10.2
  • GPU models and configuration: -
  • How you installed PyTorch (conda, pip, source): pip
  • If compiling from source, the output of torch.__config__.show(): -
  • Any other relevant information: -

Additional context

cc @tchaton

marcm-ml added the bug label Nov 25, 2021
marcm-ml changed the title from "Proper support for SequentialLR" to "Proper support for Pytorch SequentialLR Scheduler" Nov 25, 2021
marcm-ml (Contributor, Author) commented Nov 25, 2021

Related to this but for another issue: as far as I know, there is no support for custom LR schedulers unless they inherit from _LRScheduler AND require no extra arguments when .step() is called.

In vanilla PyTorch this is "solved" because the user calls .step() directly.
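
For example, a plain training loop can just pass the metric itself (dummy model and placeholder metric, only for illustration):

```python
import torch

model = torch.nn.Linear(32, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer)

for epoch in range(10):
    # ... training and validation code omitted ...
    val_loss = torch.rand(1).item()  # placeholder for the real validation metric
    scheduler.step(val_loss)  # the user hands the metric to .step() directly
```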

Perhaps the newly introduced customizable loops are a solution for this?

tchaton added the lr scheduler and priority: 1 (Medium priority task) labels Nov 25, 2021
tchaton self-assigned this Nov 25, 2021
awaelchli (Member) commented

Perhaps the newly introduced customizable loops are a solution for this?

I would rather go with "Manual Optimization" before going for custom loops.

In manual optimization mode, the user calls the scheduler step manually, hence you can use a custom one and pass args to its step method. How does that sound for a (temporary?) workaround?
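
A rough, untested sketch of what I mean (the module, logged key, and scheduler choice are just placeholders):

```python
import torch
import pytorch_lightning as pl


class ManualSchedulerModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.automatic_optimization = False  # turn off automatic optimization
        self.layer = torch.nn.Linear(32, 1)

    def training_step(self, batch, batch_idx):
        opt = self.optimizers()
        x, y = batch
        loss = torch.nn.functional.mse_loss(self.layer(x), y)
        opt.zero_grad()
        self.manual_backward(loss)
        opt.step()
        self.log("train_loss", loss)
        return loss

    def on_train_epoch_end(self):
        # you own the scheduler now, so you can pass whatever .step() needs
        sch = self.lr_schedulers()
        sch.step(self.trainer.callback_metrics["train_loss"])

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=0.01)
        scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer)
        return [optimizer], [scheduler]
```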

marcm-ml (Contributor, Author) commented Nov 25, 2021

Great idea, I forgot about that. I guess that solves the custom-scheduler issue, but I still think proper support for SequentialLR is necessary. There is a debate on the PyTorch side about whether the scheduler classes need a rewrite, so perhaps we should wait for that?

rohitgr7 (Contributor) commented

For the main issue regarding ReduceLROnPlateau: SequentialLR.step() doesn't accept an additional argument, so we really can't do anything from our side.
https://github.com/pytorch/pytorch/blob/5fdcc20d8d96a6b42387f57c2ce331516ad94228/torch/optim/lr_scheduler.py#L628

Related to this but for another issue: as far as I know, there is no support for custom LR schedulers unless they inherit from _LRScheduler AND require no extra arguments when .step() is called.

Do you have any scheduler that inherits from _LRScheduler and requires an additional argument in its .step method? I have a PR for that, #10249, which would solve the issue, but it seems to be blocked on the decision of whether or not to support outside schedulers that do not inherit from PyTorch's _LRScheduler.

marcm-ml (Contributor, Author) commented Nov 26, 2021

Well, I wanted to use WarmStart (a custom scheduler inherited from _LRScheduler) together with ReduceLROnPlateau. My first attempt was SequentialLR, but as outlined above that is impossible to combine with ReduceLROnPlateau. So instead I inherited from ReduceLROnPlateau and modified its step function to include the logic from WarmStart. See here for the code. That works with PyTorch Lightning because it is simply a ReduceLROnPlateau in disguise. It would be cleaner to use SequentialLR, but that is now a PyTorch issue rather than a PL one. Maybe a separate issue can be opened there so that SequentialLR can accept arbitrary arguments in step?
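
Roughly, the idea is something like this (a simplified sketch, not the exact code behind the link; the class name and the linear warm-up schedule are just illustrative):

```python
import torch
from torch.optim.lr_scheduler import ReduceLROnPlateau


class WarmupReduceLROnPlateau(ReduceLROnPlateau):
    """Sketch: ReduceLROnPlateau with a linear warm-up phase built into step()."""

    def __init__(self, optimizer, warmup_epochs=5, **kwargs):
        super().__init__(optimizer, **kwargs)
        self.warmup_epochs = warmup_epochs
        self.warmup_base_lrs = [group["lr"] for group in optimizer.param_groups]
        self.epoch = 0

    def step(self, metrics, epoch=None):
        self.epoch += 1
        if self.epoch <= self.warmup_epochs:
            # warm-up phase: scale the LR linearly up to its target value
            scale = self.epoch / self.warmup_epochs
            for group, base_lr in zip(self.optimizer.param_groups, self.warmup_base_lrs):
                group["lr"] = base_lr * scale
        else:
            # afterwards behave exactly like ReduceLROnPlateau
            super().step(metrics, epoch=epoch)
```

Since it is still an instance of ReduceLROnPlateau, configure_optimizers can return it like a normal plateau scheduler together with a monitor key.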

rohitgr7 (Contributor) commented

Yeah, I'd suggest opening an issue in PyTorch and linking it here. We won't close this issue; once they support it, we will make adjustments here, if required, to make it compatible.

abbas695 commented

@marcm-ml Hello, would you please share the code again, as the link is now dead? I am trying my hardest to incorporate warm-up with ReduceLROnPlateau but have had no success so far.
