Command-line argument parsing #756

Closed
dmontagu opened this issue Aug 16, 2019 · 26 comments

Comments

@dmontagu
Contributor

dmontagu commented Aug 16, 2019

Feature Request

I would really like a lightweight CLI-argument parsing class, similar to BaseSettings.


These days I always find myself using BaseSettings whenever I need to read something from the environment. Using this class is so convenient that I have started to eschew arguments for lightweight scripts and instead just read values from the environment (using pydantic to parse them).

Taking a step back, though, it seems like it might be nice to just have a built-in class like BaseSettings for parsing command line arguments, maybe BaseCLIArguments? I always find myself struggling to remember the syntax for argparse; I personally think this is a great use case for pydantic.

I know there are lots of other libraries that handle command line argument parsing, but given how little effort I think it would take to integrate this with pydantic, I would much prefer to not need to learn (and remember) yet another tool.

I wouldn't mind if it were highly opinionated in the interest of simplicity; I just want an easy way to get parsed command line arguments using patterns I already use daily (namely, BaseModel).


(Either way I'll probably implement this for myself, but I will put more care into it if there is any interest in upstreaming it. Hence the issue.)

@samuelcolvin
Member

I'd never thought about this but it sounds like a great idea.

A few thoughts in no particular order:

  • we can start small and expand, we don't need all the things I'll suggest here or to replicate click to make this useful.
  • It would be very helpful if pydantic's CLI support could auto generate --help output as other libraries do.
  • I think it could just be a Config flag on BaseSettings, most of the scenarios where you'd want to use BaseSettings you might also want CLI support. Or perhaps a new method eg. BaseSettings.parse_cli()?
  • How do we deal with the distinction between arguments and options? Perhaps through a flag on Field() (currently aka Schema()) eg = Field(... arg=True), argument order would then be prescribed by field order. Is there a more elegant way than using Field(arg=...)?
  • I guess List[...] fields can be populated with repeated arguments, eg. --foo green --foo red, lots of CLIs do this but I can't think of one at the moment, maybe not many do actually. Probably not required for the first release of this feature?
  • I guess Dict[str, ...] fields can be populated in a similar way to docker run's -e. Probably not required for the first release of this feature?
  • I think sub-commands might be cool (perhaps using multiple models?), but definitely not required for the first release of this feature?
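
A minimal sketch of the "start small" version of the ideas above, using only the standard library. A dataclass stands in for a pydantic model so the example runs without dependencies; `build_parser`, `parse_cli`, and the `description` metadata key are hypothetical names for illustration, not pydantic API. Field types drive the options, `List[...]` fields accept repeated options, and the `--help` text comes from the field description.

```python
import argparse
import dataclasses
from typing import List, get_args, get_origin


@dataclasses.dataclass
class Settings:
    # Stand-in for a pydantic model; "description" mirrors Field(description=...).
    host: str = dataclasses.field(metadata={"description": "server hostname"})
    port: int = dataclasses.field(default=8000, metadata={"description": "server port"})
    tags: List[str] = dataclasses.field(
        default_factory=list, metadata={"description": "repeatable tag"}
    )


def build_parser(model) -> argparse.ArgumentParser:
    """Create one --option per dataclass field (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="my_app")
    for f in dataclasses.fields(model):
        flag = "--" + f.name.replace("_", "-")
        help_text = f.metadata.get("description")
        no_default = (
            f.default is dataclasses.MISSING
            and f.default_factory is dataclasses.MISSING
        )
        if get_origin(f.type) is list:
            # List fields are filled by repetition: --tags green --tags red
            (item_type,) = get_args(f.type)
            parser.add_argument(flag, dest=f.name, action="append",
                                type=item_type, default=None, help=help_text)
        else:
            parser.add_argument(flag, dest=f.name, type=f.type,
                                required=no_default, default=None, help=help_text)
    return parser


def parse_cli(model, argv):
    """Parse argv and build the model, leaving unset options to field defaults."""
    namespace = build_parser(model).parse_args(argv)
    provided = {k: v for k, v in vars(namespace).items() if v is not None}
    return model(**provided)


settings = parse_cli(
    Settings, ["--host", "example.org", "--tags", "green", "--tags", "red"]
)
print(settings)
```

With a real pydantic model, the final `model(**provided)` call is where validation would happen, so the argparse layer could stay this thin.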

Since this is just a new feature and shouldn't have any backwards compatibility problems, I think we should delay it until after v1?

@koxudaxi
Contributor

I want to use this feature. I think CLI argument parsing suits pydantic perfectly.
Pydantic already has a lot of the functionality a CLI parser needs (eg. enums, parsing, type-checking, required fields).

Here are my thoughts.

  • How do we define an alias? alias is already used as an argument of Field (eg. the alias -h for --help)
  • Where do we write descriptions for commands and arguments? (eg. Field(..., help='a help message'), Field(..., description='a help message'))
  • Should this feature support color? I know it's not required, and pydantic might need an extra dependency to support it. However, color would make the feature great.

@samuelcolvin
Member

  • How do we define an alias? alias is already used as an argument of Field (eg. the alias -h for --help)

see #721 (comment), aliases should use cli similar to how environment variables will use env

  • Where do we write descriptions for commands and arguments? (eg. Field(..., help='a help message'), Field(..., description='a help message'))

should use description on field, same as schema does now.

  • Should this feature support color? I know it's not required, and pydantic might need an extra dependency to support it. However, color would make the feature great.

You mean ansi colours? If so, I built this when developing python-devtools, it's not that hard at least for unix platforms. We could either make devtools an optional dependency or copy some of that logic. However I don't think it's required for the first release, also I'm not convinced coloured error messages or help would ever be that useful - please suggest a CLI that makes helpful use of colours in these scenarios?

@koxudaxi
Contributor

koxudaxi commented Aug 17, 2019

Sorry, I hadn't read that comment.

Yes, ANSI colors.
I agree that it's not required for the first release.
Some commands show sub-commands, options, descriptions, and defaults in their help output.
That help output fills the console screen with a lot of words.
If the CLI draws words in color, I think it's more readable.
But I don't know whether other users need my suggestion.

The serverless command looks good to me.
[Screenshot: ss2019-08-18 2 52 03]

@euri10
Contributor

euri10 commented Aug 17, 2019 via email

@dmontagu
Contributor Author

A few thoughts in no particular order:

  • we can start small and expand, we don't need all the things I'll suggest here or to replicate click to make this useful.
  • It would be very helpful if pydantic's CLI support could auto generate --help output as other libraries do.

Agreed with the above.

  • I think it could just be a Config flag on BaseSettings, most of the scenarios where you'd want to use BaseSettings you might also want CLI support. Or perhaps a new method eg. BaseSettings.parse_cli()?

After some thought, I'm leaning toward a classmethod for this, rather than config. I think it could be nice to be able to choose to read from either the environment or the command line (without needing to change the class definition), and I think a method would make it easier to do that flexibly. Also, I think it would be easier to have cli-parsing-specific settings as arguments to the method (rather than extra config settings), such as a bool fallback_to_env which could be used to allow/disallow falling back on values from the environment if not provided.
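
A rough sketch of what such a classmethod could look like, under stated assumptions: a dataclass stands in for a BaseSettings subclass, and `parse_cli`, `fallback_to_env`, and the `environ` parameter (added here only to make the example testable) are hypothetical, not existing pydantic API. Environment variable names are assumed to be the upper-cased field names.

```python
import argparse
import dataclasses
import os
from typing import Mapping, Optional, Sequence


@dataclasses.dataclass
class AppSettings:
    # Stand-in for a BaseSettings subclass.
    api_key: str
    timeout: int = 30

    @classmethod
    def parse_cli(
        cls,
        argv: Sequence[str],
        fallback_to_env: bool = True,
        environ: Optional[Mapping[str, str]] = None,
    ) -> "AppSettings":
        """Read fields from argv; optionally fill gaps from the environment."""
        environ = os.environ if environ is None else environ
        parser = argparse.ArgumentParser()
        for f in dataclasses.fields(cls):
            flag = "--" + f.name.replace("_", "-")
            parser.add_argument(flag, dest=f.name, type=f.type)
        parsed = vars(parser.parse_args(argv))
        values = {k: v for k, v in parsed.items() if v is not None}
        if fallback_to_env:
            for f in dataclasses.fields(cls):
                env_name = f.name.upper()
                # Command-line values win; the environment only fills gaps.
                if f.name not in values and env_name in environ:
                    values[f.name] = f.type(environ[env_name])
        return cls(**values)


env = {"API_KEY": "from-env", "TIMEOUT": "5"}
print(AppSettings.parse_cli(["--timeout", "10"], environ=env))
```

Because the fallback is a method argument rather than class-level config, the same class can be read strictly from the command line in one call and with environment fallback in another.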

How do we deal with the distinction between arguments and options? Perhaps through a flag on Field() (currently aka Schema()) eg = Field(... arg=True), argument order would then be prescribed by field order. Is there a more elegant way than using Field(arg=...)?

Another option would be a custom type FlagArgument that parses to a bool in the same way as StrictBool (or the ill-fated RelaxedBool), but signals to the config that it should be read from command line as a flag rather than an argument with a value. Given how ubiquitous Field is, I'm less inclined to add another argument to Field for this very-specific use case, but my opinion isn't too strong on this point.

I guess List[...] fields can be populated with repeated arguments, eg. --foo green --foo red, lots of CLIs do this but I can't think of one at the moment, maybe not many do actually. Probably not required for the first release of this feature?

I think this makes sense.

I guess Dict[str, ...] fields can be populated in a similar way to docker run's -e. Probably not required for the first release of this feature?

I'd be fine with either --dict-arg key1=value1,key2=value2, or --dict-arg key1=value1 --dict-arg key2=value2
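
Both spellings can be accepted at once. The sketch below is one possible approach using argparse's `append` action; `parse_dict_option` is a hypothetical helper name, and splitting on commas assumes that values never contain a comma themselves.

```python
import argparse
from typing import Dict, List, Optional


def parse_dict_option(values: Optional[List[str]]) -> Dict[str, str]:
    """Merge repeated and comma-separated key=value items into one dict."""
    result: Dict[str, str] = {}
    for raw in values or []:
        for item in raw.split(","):
            key, sep, value = item.partition("=")
            if not sep or not key:
                raise ValueError(f"expected key=value, got {item!r}")
            result[key] = value
    return result


parser = argparse.ArgumentParser()
parser.add_argument("--dict-arg", action="append")

combined = parser.parse_args(["--dict-arg", "key1=value1,key2=value2"])
repeated = parser.parse_args(
    ["--dict-arg", "key1=value1", "--dict-arg", "key2=value2"]
)

print(parse_dict_option(combined.dict_arg))
print(parse_dict_option(repeated.dict_arg))
```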

I think sub-commands might be cool (perhaps using multiple models?), but definitely not required for the first release of this feature?

Agreed this could be nice; I think there is a lot of room for cool APIs here. I think this could be supported by extra custom types later.

Since this is just a new feature and shouldn't have any backwards compatibility problems, I think we should delay it until after v1?

Agreed.

@enkoder

enkoder commented Aug 23, 2019

I would love something like this as well! One request I have is that the API should be able to take in a string or list of strings instead of assuming that this will be done only as a command line parser (sys.argv). My use case is for chat bots where a command handler would be wrapped by a decorator that parses and passes the given object to the handler.

Something like this !kick 123123 --silent

class KickUserArgs(BaseModel):
    id: int
    silent: FlagArgument()

@cli_parser(KickUserArgs) 
def kick_user_command(ctx, args):
    assert isinstance(args, KickUserArgs)
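
For illustration, here is one way the snippet above could be made to run with only the standard library. It is a sketch under assumptions: a dataclass stands in for the BaseModel, a plain `bool` field plays the role of the proposed `FlagArgument`, and `cli_parser` is the hypothetical decorator from the example, implemented here with `shlex` and `argparse`.

```python
import argparse
import dataclasses
import functools
import shlex


@dataclasses.dataclass
class KickUserArgs:
    # Stand-in for the BaseModel above; a bool field plays the role of FlagArgument.
    id: int
    silent: bool = False


def cli_parser(model):
    """Hypothetical decorator: parse a command string into `model`, then call the handler."""
    parser = argparse.ArgumentParser(add_help=False)
    for f in dataclasses.fields(model):
        if f.type is bool:
            parser.add_argument("--" + f.name, action="store_true")
        else:
            parser.add_argument(f.name, type=f.type)  # positional argument

    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(ctx, text):
            # Drop the command word ("!kick") and split the rest like a shell would.
            tokens = shlex.split(text)[1:]
            args = model(**vars(parser.parse_args(tokens)))
            return handler(ctx, args)

        return wrapper

    return decorator


@cli_parser(KickUserArgs)
def kick_user_command(ctx, args):
    assert isinstance(args, KickUserArgs)
    return args


print(kick_user_command(None, "!kick 123123 --silent"))
```

Note that argparse exits the process on invalid input by default, so a real bot would need to intercept that rather than let a malformed message stop it.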

@dmontagu
Contributor Author

@samuelcolvin I've spent some more time thinking about this and I think I'm less interested in

I've actually been using @tiangolo's awesome typer package (which is basically just click with much better static typing, and, in my opinion at least, an improved API) and have found it to ergonomically capture the vast majority of my practical needs working with a CLI.

It doesn't offer the full power of pydantic for parsing, but to the extent that that is even necessary (basic types seem to get you pretty far with CLIs), I think we might be better served by adding pydantic integration there rather than adding CLI integration here.

I'm going to close this issue but I'm happy to reopen it if there is still interest.

@samuelcolvin
Member

Agreed.

I hadn't seen typer, looks great.

Perhaps worth creating an issue there and linking to this to at least discuss pydantic integration? Perhaps using #1179?

@tiangolo
Member

tiangolo commented Apr 7, 2020

I'm glad you're liking Typer!

I came just to note that Typer's API is clearly inspired by Pydantic, as FastAPI was inspired (and powered) by Pydantic.

@aschreyer

First of all thanks for the two great libraries! I have been starting to use both for CLI parsing and configuration management. I think it would be great to have Pydantic integration in Typer given that the former already has all the parsing and validation functionality needed (including JSON parsing). In fact, using

settings = Settings(**context.params)  # BaseSettings instance

already works with CLI examples I have been testing. However, there is redundancy in the type definitions of course.

@mpkocher

mpkocher commented Jul 9, 2020

@dmontagu I believe I had a similar need for a thin layer that enables Pydantic data models to be loaded from command line args or from JSON files.

This resulted in pydantic-cli with several examples here and is perhaps similar in spirit to Typer.

Here's a simple example:

from pydantic import BaseModel
from pydantic_cli import run_and_exit

class Options(BaseModel):
    input_file: str
    max_records: int


def example_runner(opts: Options) -> int:
    print(f"Mock example running with {opts}")
    return 0


if __name__ == "__main__":
    run_and_exit(Options, example_runner, description=__doc__, version="0.1.0")

@gabe-microsoft

@mpkocher pydantic-cli is exactly what my group has been looking for!

We've been interested in using Pydantic, but have a common pattern of using a combination of JSON files and command line args to define settings for scripts (sometimes with the addition of environment vars). I looked into using Pydantic+Typer, but this still requires duplicating the model specification, has some bugs when using non-basic types, and it gets a bit messy to load settings from both JSON and command line (where each may be partial and only the merged results should be checked).

I really like the rich types defined in Pydantic and look forward to seeing pydantic-cli extended to include all types supported by pydantic. I think it would make the most sense for this CLI generation functionality, including JSON settings parsing to be included directly in pydantic.

@samuelcolvin is there any plan to add this feature? I was surprised that this feature request was closed since there seemed to be community interest in the feature and there still remains no complete solution...

@samuelcolvin
Member

I haven't tried pydantic-cli yet. I will.

I closed this because of typer, but having used typer I find it annoying - mostly because it's a wrapper of click and limitations caused by that.

I'd love a cli implementation based on pydantic but I'm struggling to provide support for pydantic as it is, another massive feature like CLI support doesn't seem sensible within this repo therefore.

@gabe-microsoft

@samuelcolvin Thanks for the update. I empathize with the constraints on your time and really appreciate all of the time and effort that you've been putting into pydantic.

I tried out pydantic-cli yesterday and feel that it does a good job of adding CLI support to simple pydantic models. I'll think more about how to extend it further to support more complex pydantic models and use cases, and will work with @mpkocher to improve pydantic-cli to be a complete CLI solution for pydantic models.

@xkortex

xkortex commented Oct 29, 2020

I started a thread over at mpkocher/pydantic-cli#23 but I figured I'd mention it here as well.

It's a super rough PoC just to see if the concept - leveraging a bit of metaprogramming and introspection - has any traction. Theoretically, using a custom type hint that accepts the metadata needed to populate CLI options (argparse.Action in this case) works, I just need to massage the data flow a bit more and remove the redundancy which can be inferred from inspection or the Pydantic schema.

If people think this is a cool approach, I'll keep hacking on it. I actually don't think it would take much code to get CLI functionality into pydantic.

@frederikaalund

Pydantic 1.8.0 introduced an API to set custom "settings sources" (see #2107). I used this API to create a click-based CLI settings source. It's basically a bridge between click and pydantic.

I just noticed this thread and thought that I would mention it as an alternative.

Highlights:

  • Simple (maintainable) implementation (600 lines of code) because click and pydantic do the heavy lifting underneath.
  • Supports nested models via .-separated arguments such as my_app --server.protocol.http.hostname "192.168.0.1"
  • Supports lists (and tuples, sets, etc.) via either:
    • Multi-options: my_app --protocols "http" --protocols "mqtt" --protocols "ftp"
    • JSON-encoding: my_app --protocols '["http", "mqtt", "ftp"]'
  • Supports dicts via JSON: my_app --translations '{"repeat": "repetir", "shuffle": "barajar"}'

Because it's just a "settings source", you can easily compose it with other setting sources (environment variables, secrets, JSON, TOML, etc).
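
To illustrate the dotted-argument and JSON conventions listed above, here is a small standard-library sketch. It is not cyto's implementation (which builds on click); `nest_dotted_options` is a hypothetical helper, and it assumes the arguments arrive as strict `--key value` pairs.

```python
import json
from typing import Any, Dict, List


def nest_dotted_options(argv: List[str]) -> Dict[str, Any]:
    """Turn ["--a.b", "1", "--a.c", "x"] into {"a": {"b": 1, "c": "x"}}."""
    result: Dict[str, Any] = {}
    pairs = zip(argv[0::2], argv[1::2])  # assumes strict "--key value" pairs
    for option, raw in pairs:
        if not option.startswith("--"):
            raise ValueError(f"expected an option, got {option!r}")
        try:
            value = json.loads(raw)  # lists, dicts, numbers, booleans
        except json.JSONDecodeError:
            value = raw  # plain string such as 192.168.0.1
        *parents, leaf = option[2:].split(".")
        node = result
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value
    return result


nested = nest_dotted_options(
    [
        "--server.protocol.http.hostname", "192.168.0.1",
        "--protocols", '["http", "mqtt", "ftp"]',
    ]
)
print(nested)
```

A nested dict of this shape is what a settings source would hand back for a model with nested sub-models to validate.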

Code is here: https://github.com/sbtinstruments/cyto/blob/master/cyto/settings/sources/cli/_cli.py
Tests are here (good way to get an overview): https://github.com/sbtinstruments/cyto/blob/master/tests/settings/sources/test_source_cli.py
See also the discussion at #2485

@mpkocher

@frederikaalund Looks good. I've added cyto to the list of CLI libraries that are leveraging type annotations. Best to you on cyto.

https://github.com/mpkocher/pydantic-cli#other-related-tools

@ricosaurus

Integration between pydantic and CLI would be very helpful. I've returned to the issue of the lack of coordination between config and CLI in my scientific computing work. I've created a tool to save dependency graphs of expensive nested analyses for my own work in a way that facilitates reproducibility, efficiency, and -- hopefully -- sharing, and want to make it publicly available, but one of the issues I keep having to address is the lack of coordination between a potentially massive and complex configuration space (one that contains all parameters for all dependencies) and CLI. Generally the config has not been my focus, but now I need to address it more head on.
I was using vanilla dataclasses being naive to pydantic until last week, and now am thrilled to move to pydantic.
But the integration with CLI remains clunky. I have tried tools to integrate config objects/ files with CLI -- eg https://github.com/omni-us/jsonargparse/tree/4d326b387d52d9a86bb5d611423f60f4559b6b58 -- but jsonargparse doesn't offer much support for dataclasses.
I am now looking at @frederikaalund's cyto -- appears promising -- but being new to Click (and pydantic) I will need to struggle a bit to decipher use cases from the tests...

@frederikaalund

I am now looking at @frederikaalund's cyto -- appears promising -- but being new to Click (and pydantic) I will need to struggle a bit to decipher use cases from the tests...

Glad you like it. 👍 It's on my to-do list to make some documentation for this part of cyto.

@xkortex

xkortex commented Apr 5, 2021

Apologies if this isn't totally on topic, but it's been in my liminal space and I figured I would share some of my findings. I've recently done a survey of the current python CLI frameworks out there. The viable libraries I've found are argparse, click, docopt, fire, typer, plumbum, and cleo. I think hydra/omegaconf may also have CLI parsing? Not a comprehensive list.

None seem to really be a drop-in fit for a pydantic-heavy workflow, but of those, cleo fits the best into my expectations of relatively explicit, able to issue arbitrary subcommand trees, and easy to parse the output into pydantic models, using self.argument(), self.option(), self.choice(), etc functions of cleo.Command. The default API parses the docstring to get the options, which might be a bit too magical for some, but there's an under-documented, more manual API as well. So if you aren't digging click's API (personally I'm not a fan of the deeply nested decorators and the context-passing aspect), it might be worth checking out cleo instead. It's what Poetry uses under the hood.

@ricosaurus

thanks @xkortex for the CLI thoughts as I am naive -- I've only used argparse, and only glanced at cleo due to its reliance on docstrings which won't work for my use case. I'll look for the more manual API.
Likewise, I didn't think click would work either at first glance, as the decorator method isn't applicable for me either -- need to generate the hundreds or more options from pydantic models, parse/validate with these options, and then update or recreate the models with any new options provided. It looks like @frederikaalund has a way around that however.

@frederikaalund

Btw, I use click.Command and click.Option in cyto. Not the high-level, decorator-based API.

I chose click over the alternatives because click is popular (10.6K stars on GitHub). That gives an indirect promise of maintenance and stability. Also, because typer uses click under the hood, which is a solid stamp of approval in my book.

All that being said, I welcome anyone to have a go at an, e.g., cleo-based pydantic settings source. 👍 Maybe you can even re-use some elements from cyto. :)

@xkortex

xkortex commented Apr 7, 2021

Ah, that's good to know that there's a non-decorator API for click. Similarly, cleo has a non-docstring API, but it's not super well documented (it might be there, but I aggressively skim read), I figured it out from basically looking at the poetry CLI. I've been using the following patterns:

$root/mymodule/app/opts.py:

from cleo.helpers import argument, option
flag_all = option("--all", "-a", "Do all the things (implementation dependent)")
volume = option("--volume", "-v", "Map paths based on bind mounts", flag=False, default=None, multiple=True)

...etc...

$root/mymodule/subcommand/core:

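from typing import Optional, Union

import pydantic


class BasicIoOptions(pydantic.BaseModel):
    input_uri: pydantic.FilePath  # must point to an existing file
    output_dir: Optional[str]
    output_uri: Optional[str]
    config_uri: Optional[str]
    verbose: Optional[Union[str, bool]] = False
    dryrun: Optional[bool] = False

    # pydantic v1-style config
    class Config:
        extra = "allow"  # keep options that have no matching field
        allow_mutation = False  # instances are immutable once parsed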

$root/mymodule/app/cli.py:

from cleo import Command
from mymodule.app import opts
from mymodule.util.helpers import dash_to_underscore
from mymodule.subcommand.core import entrypoint, BasicIoOptions


class RunSubCmd(Command):
    name = "subcommand"
    description = """Run subcommand"""
    options = [
        opts.input_uri,
        opts.output_dir,
        opts.output_uri,
        opts.manifest_uri,
        opts.flag_dry,
    ]

    def handle(self):
        tmp = dash_to_underscore(self.option())
        args = BasicIoOptions.parse_obj(tmp)
        result = entrypoint(args)

I could probably get fancy and figure out how to get the pydantic Field along with the menu option, and perhaps even bundle the validators as well, so you get the Option with its Field and Schema, all in one place to dole out to Commands. But even just specifying the various options ahead of time in a reusable way and composing them into commands is a huge step from the amount of copy-paste I would have to do with argparse. But this approach definitely scratches the itch of "reuse options and commands throughout various subcommands and cast them into immutable strongly-typed data structures to give to entry points".
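
The `dash_to_underscore` helper imported in cli.py is not shown in this thread. The following is only a guess at what such a helper might do, assuming it receives a dict of option names to values and needs to produce keys that match the model's field names.

```python
from typing import Any, Dict


def dash_to_underscore(options: Dict[str, Any]) -> Dict[str, Any]:
    """Rename CLI-style keys ("input-uri") to Python-style field names ("input_uri")."""
    return {
        key.lstrip("-").replace("-", "_"): value
        for key, value in options.items()
    }


print(dash_to_underscore({"input-uri": "data.csv", "--output-dir": "out"}))
```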

I'm definitely going to be looking through cyto when I have some spare time and seeing if I can apply that approach to make a pydantic-cleo plugin.

@berzi

berzi commented Aug 21, 2022

I stumbled upon this thread as I was about to open a similar feature request. Consider this a vote to reopen.

My thinking is that just like BaseSettings Fields can look for a variable in the environment to populate an attribute with the env parameter, the same BaseSettings¹ could support Fields with a cli parameter to instruct it to look into sys.argv for the corresponding option.

The syntax I have in mind for Field(cli=) is a Literal[True], str or Iterable[str] like "--option" or "-o": the corresponding option name is looked for in sys.argv and if found, its value is put into the field. If True, the name of the field is converted to kebab-case and used as name for the option.
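
The lookup rule described above can be sketched in a few lines. Everything here is hypothetical: `Field(cli=...)` is the proposed parameter, not existing pydantic API, and `cli_option_names` and `lookup_cli_value` are illustrative helper names. The value is returned as a raw string, which the model would then validate as usual.

```python
from typing import Iterable, List, Literal, Optional, Union

CliSpec = Union[Literal[True], str, Iterable[str]]


def cli_option_names(field_name: str, cli: CliSpec) -> List[str]:
    """Resolve the hypothetical Field(cli=...) value into option names."""
    if cli is True:
        # Kebab-case the field name: max_records -> --max-records
        return ["--" + field_name.replace("_", "-")]
    if isinstance(cli, str):
        return [cli]
    return list(cli)


def lookup_cli_value(field_name: str, cli: CliSpec, argv: List[str]) -> Optional[str]:
    """Return the token following the first matching option, or None if absent."""
    names = cli_option_names(field_name, cli)
    for index, token in enumerate(argv[:-1]):
        if token in names:
            return argv[index + 1]
    return None


print(lookup_cli_value("max_records", True, ["prog", "--max-records", "10"]))
print(lookup_cli_value("output", ["--output", "-o"], ["prog", "-o", "out.txt"]))
```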

Personally I don't feel the need to support positional arguments as that would be more complex to design (I have a few ideas but none is perfect or as syntactically clear as I'd like).

Thoughts?

¹ It's still settings we're talking about here: no real need to put the functionality on a separate class, is there? Centralising settings seems to be one of the main goals of BaseSettings.

@samuelcolvin
Member

For those interested, this has been resurrected in pydantic/pydantic-settings#209.
