
OptimizerArgs #1409

Merged: 30 commits merged into main from the optimizerargs branch on May 23, 2024

Conversation

rasbt (Collaborator) commented May 10, 2024

This PR unbundles the OptimizerArgs approach from the GaLore PR (#1192).

Todos

  • OptimizerArgs for full finetuning

    • Update code
    • Add docstrings
    • Update docs
    • Update config files
  • OptimizerArgs for LoRA

    • Update code
    • Add docstrings
    • Update docs
    • Update config files
  • OptimizerArgs for Adapter

    • Update code
    • Add docstrings
    • Update docs
    • Update config files
  • OptimizerArgs for Adapter v2

    • Update code
    • Add docstrings
    • Update docs
    • Update config files
  • OptimizerArgs for Pretraining

    • Update code
    • Add docstrings
    • Update docs
    • Update config files
    • add back fusing
  • ensure that both --optimizer torch.optim.AdamW and --optimizer AdamW work (see the name-resolution sketch after this list)

  • Add tests
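
As referenced in the checklist item above about accepting both spellings, here is a minimal sketch (illustrative only, not necessarily the code this PR ends up with) of resolving either form to the same optimizer class:

import importlib

import torch

def resolve_optimizer_class(name: str) -> type:
    # Accept either a fully qualified path ("torch.optim.AdamW") or a bare name ("AdamW")
    if "." in name:
        module_name, _, cls_name = name.rpartition(".")
        return getattr(importlib.import_module(module_name), cls_name)
    return getattr(torch.optim, name)

assert resolve_optimizer_class("AdamW") is resolve_optimizer_class("torch.optim.AdamW")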

rasbt (Collaborator, Author) commented May 10, 2024

Your jsonargparse example has been super helpful for understanding things a bit more, @carmocca. Many thanks for this!

Maybe it's because it's Friday evening, but my brain is just not working today. I've been banging my head against how to get this into the finetuning method's setup().

Adding the optimizer subclass to the parser and then calling the finetune command yields:

  File "/home/zeus/miniconda3/envs/cloudspace/bin/litgpt", line 8, in <module>
    sys.exit(main())
  File "/teamspace/studios/this_studio/litgpt/litgpt/__main__.py", line 145, in main
    fn(**kwargs)
TypeError: setup() got an unexpected keyword argument 'optimizer.class_path'

But we can't add an optimizer argument to the finetuning setup() signature, because then it would be a duplicate argument. Conceptually, I am kind of stuck here.

Also, how would we get the args in

optimizer = instantiate_class(model.parameters(), init=args["optimizer"])

if we don't pass them on from the main() function in __main__.py? I suppose I could do

parser = jsonargparse.ArgumentParser()
parser.add_subclass_arguments(torch.optim.Optimizer, "optimizer", instantiate=False, fail_untyped=False, skip={"params"})
args = parser.parse_args()

in the finetuning script, but then it would erase all the previous arguments.
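
For context, here is that parsing pattern as a self-contained sketch (the add_subclass_arguments call is the one from above; the manual instantiation at the end is only illustrative, and the exact namespace access may differ slightly):

import importlib

import jsonargparse
import torch

parser = jsonargparse.ArgumentParser()
parser.add_subclass_arguments(torch.optim.Optimizer, "optimizer", instantiate=False, fail_untyped=False, skip={"params"})
cfg = parser.parse_args(["--optimizer", "torch.optim.SGD", "--optimizer.init_args.lr", "0.1"])

# The parsed value holds the class path and its init args, e.g.
# {"class_path": "torch.optim.SGD", "init_args": {"lr": 0.1, ...}}
init = cfg.optimizer.as_dict()
module_name, _, cls_name = init["class_path"].rpartition(".")
optimizer_cls = getattr(importlib.import_module(module_name), cls_name)

# Instantiate once the model parameters are available
model = torch.nn.Linear(4, 4)
optimizer = optimizer_cls(model.parameters(), **init.get("init_args", {}))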

carmocca (Contributor) commented May 13, 2024

As far as integrating into the scripts, I would:

Create an optimizer argument in

def setup(

To avoid the duplicate registration, you need to skip it when the function arguments are added
https://github.com/omni-us/jsonargparse/blob/2de15ddfb1c02c2f7b3fe913ad11f13c5cb65dff/jsonargparse/_signatures.py#L166

subsubcommand_parser.add_function_arguments(v["fn"])

And call instantiate_class here

optimizer = optimizer_cls(
trainable_params, lr=train.learning_rate, weight_decay=train.weight_decay, betas=(train.beta1, train.beta2)
)

This should be enough to unblock you. The not-so-nice thing is that the CLI args structure leaks into the actual script, meaning that users who don't go through the CLI will have to create this dictionary manually.
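
Roughly, and only as a sketch of what that integration could look like (instantiate_optimizer is an illustrative name, not an existing litgpt helper), the script side might be:

import importlib
from typing import Union

import torch

def instantiate_optimizer(optimizer: Union[str, dict], params, **defaults) -> torch.optim.Optimizer:
    if isinstance(optimizer, str):
        # Plain class name passed by users who bypass the CLI, e.g. "AdamW"
        return getattr(torch.optim, optimizer)(params, **defaults)
    # CLI-style dict with "class_path" and "init_args"
    module_name, _, cls_name = optimizer["class_path"].rpartition(".")
    optimizer_cls = getattr(importlib.import_module(module_name), cls_name)
    return optimizer_cls(params, **optimizer.get("init_args", {}))

# e.g. where the optimizer is currently built:
# optimizer = instantiate_optimizer(optimizer, trainable_params, lr=train.learning_rate)

Users who bypass the CLI would then pass either a plain class name or the class_path/init_args dict by hand, which is the leak mentioned above.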

Inline review comment on litgpt/utils.py (outdated, resolved)
rasbt marked this pull request as draft May 14, 2024 11:57
rasbt (Collaborator, Author) commented May 14, 2024

Awesome, thanks so much, this was a great help! I figured it out now and got it to work. Many thanks, I learned something new again!

rasbt (Collaborator, Author) commented May 14, 2024

I now got it to work as follows:

# Default optimizer:
litgpt finetune full \
  --checkpoint_dir checkpoints/EleutherAI/pythia-160m

# Specify optimizer and optimizer args:
litgpt finetune full \
  --checkpoint_dir checkpoints/EleutherAI/pythia-160m \
  --optimizer torch.optim.SGD \
  --optimizer.init_args.lr 1000

But the way I am passing the optimizer kwargs feels a bit hacky. Is there a built-in/better way to handle it, @carmocca? The thing is that when I pass an --optimizer argument, it also passes additional kwargs to setup:

kwargs = {
    'optimizer.class_path': 'torch.optim.SGD',
    'optimizer.init_args.dampening': 0.0,
    'optimizer.init_args.differentiable': False,
    'optimizer.init_args.foreach': None,
    'optimizer.init_args.lr': 0.001,
    'optimizer.init_args.maximize': False,
    'optimizer.init_args.momentum': 0.0,
    'optimizer.init_args.nesterov': False,
    'optimizer.init_args.weight_decay': 0.0
}

That's why I added the parsing into class_path and init_args:

    # Pull all "optimizer.*" entries out of the kwargs passed to setup and
    # split them into the class path and its init args:
    optimizer_class_path = None
    optimizer_init_args = {}
    for key, value in list(kwargs.items()):
        if key.startswith("optimizer"):
            if "class_path" in key:
                optimizer_class_path = value
            elif "init_args" in key:
                # e.g. "optimizer.init_args.lr" -> "lr"
                init_arg_key = key.split(".")[-1]
                optimizer_init_args[init_arg_key] = value
            # Remove the flattened key so it doesn't reach setup() as a kwarg
            del kwargs[key]

Everything seems to work, but I wonder if there isn't a better way to do it?
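
For illustration only (a generic sketch, not necessarily what ended up in the PR), the flattened keys could also be regrouped by splitting on the dots:

def pop_nested(kwargs: dict, prefix: str) -> dict:
    # Pop all "prefix.*" entries from the flat kwargs dict and rebuild the
    # nested structure, e.g. {"class_path": ..., "init_args": {...}}
    nested: dict = {}
    for key in [k for k in kwargs if k.startswith(prefix + ".")]:
        value = kwargs.pop(key)
        parts = key.split(".")[1:]  # drop the leading prefix segment
        target = nested
        for part in parts[:-1]:
            target = target.setdefault(part, {})
        target[parts[-1]] = value
    return nested

# pop_nested(kwargs, "optimizer") ->
# {"class_path": "torch.optim.SGD", "init_args": {"lr": 0.001, ...}}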

carmocca (Contributor) commented

@rasbt I pushed a commit with what I would suggest. The str code path could be improved if we want to expose arguments like the learning rate outside of the CLI, but that should be straightforward to implement.

Also FYI, you don't need to specify the .init_args substring on the command line.

Inline review comment on litgpt/finetune/full.py (outdated, resolved)
rasbt (Collaborator, Author) left a comment

Wow awesome, thanks so much for simplifying this. Jsonargparse is still a bit of a black box for me.

rasbt (Collaborator, Author) commented May 21, 2024

The only caveat now is that the class path still needs to be specified; i.e., only specifying the learning rate doesn't work:

litgpt finetune full  --optimizer.lr 200  --checkpoint_dir checkpoints/EleutherAI/pythia-160m

error: Parser key "optimizer":
  Not a valid subclass of Optimizer. Got value: NestedArg(key='lr', val='200')
  Subclass types expect one of:
  - a class path (str)
  - a dict with class_path entry
  - a dict without class_path but with init_args entry (class path given previously)

And the optimizer always needs to be specified explicitly:

litgpt finetune full  --optimizer AdamW --optimizer.lr 200  --checkpoint_dir checkpoints/EleutherAI/pythia-160m

Do you know if that's a jsonargparse thing, @carmocca? Since we already set a default value in the setup method, this seems a bit weird to me.

rasbt marked this pull request as ready for review May 21, 2024 22:49
carmocca (Contributor) left a comment

Added AdamW as the CLI default so that you don't have to specify it

Inline review comments on config_hub/finetune/tiny-llama/qlora.yaml (resolved), litgpt/finetune/adapter.py (outdated, resolved), and litgpt/finetune/adapter_v2.py (outdated, resolved)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
rasbt (Collaborator, Author) commented May 22, 2024

I hope this is ready now @carmocca

carmocca (Contributor) left a comment

Very nice job!

Inline review comments on extensions/thunder/pretrain.py, tests/test_config_hub.py, and tests/test_utils.py (all outdated, resolved)
rasbt and others added 2 commits May 23, 2024 10:03
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
carmocca (Contributor) commented

The Azure failure does look real:

>       fit(fabric, devices, state, train_dataloader, val_dataloader, out_dir, tokenizer_dir, train, eval, optimizer)
E       TypeError: fit() takes 9 positional arguments but 10 were given

/__w/6/s/extensions/thunder/pretrain.py:229: TypeError
----------------------------- Captured stderr call -----------------------------
Missing logger folder: /tmp/pytest-of-root/pytest-0/test_pretrain0/out/logs/tensorboard
Seed set to 42
=========================== short test summary info ============================
FAILED tests/test_thunder_pretrain.py::test_pretrain - TypeError: fit() takes 9 positional arguments but 10 were given

rasbt (Collaborator, Author) commented May 23, 2024

It does. Let me investigate ...

rasbt (Collaborator, Author) commented May 23, 2024

Should be fixed for good now, @carmocca. I can switch the link to the original tinystories now that you have seen the green checks haha 😆

carmocca merged commit 141c8bf into main on May 23, 2024
9 checks passed
carmocca deleted the optimizerargs branch on May 23, 2024 16:29