Add support for memory-efficient and faster optimizers #1364

Open
rasbt opened this issue Apr 27, 2024 · 1 comment

Comments

@rasbt
Collaborator

rasbt commented Apr 27, 2024

Maybe GaLore (#1192) should be changed from GaloreArgs to OptimizerArgs after all. Then we can also more easily consider other variants such as BAdam (BAdam: A Memory Efficient Full Parameter Training Method for Large Language Models, https://arxiv.org/abs/2404.02827).
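To make the proposal concrete, here is a rough sketch of what a generic `OptimizerArgs` might look like. Everything in it is an assumption for illustration: the field names, the `OPTIMIZER_REGISTRY` mapping, and the `register_optimizer` / `instantiate_optimizer` helpers are hypothetical and are not taken from the existing `GaloreArgs` code. The `adamw` factory is a stub that returns a dict; a real one would wrap an actual optimizer class.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable


@dataclass
class OptimizerArgs:
    """Hypothetical generic replacement for an optimizer-specific args class."""

    name: str = "adamw"            # key into the registry below
    learning_rate: float = 3e-4
    weight_decay: float = 0.0
    # Optimizer-specific settings live here, so a new optimizer does not
    # need its own dedicated args class.
    extra: Dict[str, Any] = field(default_factory=dict)


# Hypothetical registry: optimizer name -> factory callable.
OPTIMIZER_REGISTRY: Dict[str, Callable[..., Any]] = {}


def register_optimizer(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Decorator that stores a factory under the given name."""

    def decorator(factory: Callable[..., Any]) -> Callable[..., Any]:
        OPTIMIZER_REGISTRY[name] = factory
        return factory

    return decorator


def instantiate_optimizer(params: Iterable[Any], args: OptimizerArgs) -> Any:
    """Look up the factory for args.name and call it with the shared and extra settings."""
    if args.name not in OPTIMIZER_REGISTRY:
        known = ", ".join(sorted(OPTIMIZER_REGISTRY))
        raise ValueError(f"Unknown optimizer {args.name!r}; registered: {known}")
    factory = OPTIMIZER_REGISTRY[args.name]
    return factory(
        params,
        lr=args.learning_rate,
        weight_decay=args.weight_decay,
        **args.extra,
    )


@register_optimizer("adamw")
def _adamw_stub(params: Iterable[Any], **kwargs: Any) -> Dict[str, Any]:
    # Stand-in so the sketch runs without any dependencies; it only echoes
    # back what a real optimizer constructor would receive.
    return {"optimizer": "adamw", "num_params": len(list(params)), **kwargs}
```

With this shape, supporting another optimizer would mean registering one more factory and passing its extra settings through `extra`, rather than adding a new args class per method.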

The experiments from here look very compelling. And it only adds 1 hyperparameter:

[Image: Screenshot 2024-04-27 at 8 36 56 AM]
@lantiga
Contributor

lantiga commented Apr 29, 2024

Agreed
