Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: Explore design trade offs #50

Merged
merged 1 commit into from
Sep 10, 2022
Merged

docs: Explore design trade offs #50

merged 1 commit into from
Sep 10, 2022

Conversation

epage
Copy link
Contributor

@epage epage commented Sep 9, 2022

This explores CLI design trade offs by comparing the very different approaches that clap and bpaf take. This can also help serve as a deeper comparison of these two crates to help people understand the trade offs they are making when selecting one of them which is why I decided to put this here (despite rosetta projects normally avoiding deep dives) instead of a blog post.

Since I've also received feedback in another rosetta project to give guidance rather than telling the user "you figure it out", I then use this as the basis for making a recommendation.

One section that I am still considering adding is on zero-dependencies.

cc @pacak

This explores CLI design trade offs by comparing the very different
approaches that `clap` and `bpaf` take.  This can also help serve as a
deeper comparison of these two crates to help people understand the
trade offs they are making when selecting one of them which is why I
decided to put this here (despite rosetta projects normally avoiding
deep dives) instead of a blog post.

Since I've also received feedback in another rosetta project to give
guidance rather than telling the user "you figure it out", I then use
this as the basis for making a recommendation.

One section that I am still considering adding is on zero-dependencies.
Copy link
Contributor

@pacak pacak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, that is a great overview with most of the facts being fairly accurate.

But I went though your feedback on reddit and github, though parsing related tickets in clap repo so some things are about to change - mostly about the goals and about what bpaf can do. As you said both libraries are evolving.

Anyway, I think it's good to merge since it's pretty good at reflecting the current state, I'll make an update PR once the next version lands.

the `derive` API
- All other cases the answer is "it depends" as the two designs offer different
trade offs. If you want to just choose one with little research that will
cover the most use cases, that will most likely be `clap`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I'd describe clap as a huge set of user friendly set of tools with each tool solving a single problem, while bpaf is a much smaller set of tools designed to be composed in yoda-speak manner to produce the actual tools you'd use for parsing. One requires more imagination to use, one requires more documentation research to use.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, we're trying to move more towards the middle, to be user friendly in the common cases but to provide building blocks for the less common cases so that (1) there is less overload in the documentation, (2) we compile faster, and (3) people can build the solutions they need without getting every feature baked directly in to clap.

Through v3 we had done some of the initial work in that direction which will be more obvious when I release v4 (wrapping up the documentation for the release).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, similar goal, approaching from different directions 👍

keys are the argument IDs and the values are effectively `Box<dyn Any>`.
- This allows dynamically created CLIs, like with
[clap_serde](https://docs.rs/clap_serde) or [conditionally-present
flags](https://github.com/sharkdp/bat/blob/6680f65e4b25b0f18c455f7a4639a96e97519dc5/src/bin/bat/clap_app.rs#L556)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting requirement, at the moment it's a bit wordy but I'm pretty sure we can get away with just going though a single element construct! for that... I'll check it in greater details once I get access to my computer. Either way after some mostly opt-in changes this should be accessible, users will have to implement only fn sneaky

struct DynParser<T> {
    inner: Box<dyn Parser<T>>
}

impl <T> Parser<T> for DynParser<T> {
    fn eval(&self, args: &mut Args) -> Result<T, Error> {
        self.inner.eval(args)
    }
    fn meta(&self) -> Meta {
    }
}

trait Parser<T> { 
....
    fn entomb(self) -> DynParser<T> {
        DynParser { inner: self }
    }
}

fn sneaky(present: bool) -> impl Parser<bool> {
    if sneaky {
        short('f').help("a sneaky switch").switch().entomb()
    } else {
        pure(false).entomb()
    }
}

Should be possible to do this in algebraic product context and similar but with fail in algebraic sum contexts.

fn sneaky(present: bool) -> impl Parser<bool> {
    if sneaky {
        let p = short('f').help("a sneaky switch").switch()
        construct!(p)
    } else {
        let p = pure(false);
        construct!(p)
    }
}

`clap`'s builder API stores parsed values in a Map-like container where the
keys are the argument IDs and the values are effectively `Box<dyn Any>`.
- This allows dynamically created CLIs, like with
[clap_serde](https://docs.rs/clap_serde) or [conditionally-present
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to create a dynamic set of parsers (into a string keyed map with enums of bool/String/OsString), building blocks are:
1: function that maps incoming requirements into impl Parser<BtreeMap<String, Value>> (a boxed version, see below, so types are the same),
2: combining them using acc = construct!([acc, p])
3: consuming all the things into Vec<BTreeMap<String, Value>> with .many()
4: Flattening vector into a single Map (manually dealing what can be duplicated, etc)

This should mostly just work, probably even with completion support and validations (functions from step 1 and step 4 will have to implement that). That's a lot of manual work and in the end you'll get a string keyed map - having everything strongly typed and typo resistant is one of the main design goals for bpaf.

So it's probably a feature, not a bug.

conditioned on argument values (e.g. `default_value_if`). See the section on
Validation for more nuance on the "arbitrarily" part
- `clap` tries to help push as many errors to early test time with minimal
effort with the `cmd.debug_assert()` function. Value lookups still requires
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://docs.rs/bpaf/0.5.7/bpaf/struct.OptionParser.html#method.check_invariants

Tutorials mention it, maybe this can be improved further :)

And it's still work in progress, there's come upcoming changes to that so all the invariants bpaf relies on are enforced.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mostly trying to contrast with bpaf's compile-time reporting. "Its not as early but it can still be relatively early" is the intent. Later I talk about how both provide asserting mechanisms

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, maybe should be reworded to mention bpaf here, to me it reads as a separate feature. But again, I'm jetlagged :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a direct push contrasting it with compile time errors.

the different cases upfront, including some usability features to help catch
developer mistakes.

`bpaf` instead does context-free tokenization, storing all flags and values in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is correct as of released version, there's a WIP branch (not pushed) that allows you to restrict the context parser runs (a single rule - arguments needs to be adjacent to each other) but it opens a whole bunch of options: flags with multiple arguments (--foo one two three), fd --exec rm -rf, clap-rs/clap#2222 (not limited to primitive commands, subcommands, flags, etc are supported, the ticket you gave me about "argument structs" but with proper non-interleaving protection, probably a bunch more. Need some time to clean it up though (and my computer)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tokenizer still produces a list - not having it being coupled with the parser makes things easier, decision to how to consume it is made during parsing specifically. As a side effect this allows to use -o as both boolean switch and an argument for -o foo.rs/-ofoo.rs, and only when the usage is ambiguous it would complain to user to rephrase it: -ov with both -o and -v flags and -o argument present => ask user to specify either -o -v or -o=v.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, I figured most of these were more fundamental to your approach and that all of this would stay the same but it sounds like a lot will need to be thrown out :)

Good job on that branch!

`--`, not allowing cases like `fd --exec rm -rf ;`
- Short flags can only have attached values in the `-a=b` case (not even `-ab=c`)
- An argument is resolved as either a subcommand, positional value, or a flag's
value based on the order that the they are processed (as determined by their
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More or less, unless you add context restrictions - still depends on the order, but also on a context - like this can be parsed: clap-rs/clap#1210 (would probably look butt-ugly in help though)

`bpaf` uses function/macro combinators to declare argument / group of argument relationships. This
offers a lot of flexibility that, again, you only pay for what you use.

The downside to this approach is it couples together:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I agree that some of those are downsides but yea, that's a pretty accurate description.


Both libraries provide derive APIs that mask over the static and dynamic typing differences.

In `bpaf`s case, the combinators still show through in terms of requiring the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the combinators still show through

By design, I'm trying to keep derive and combinatorial APIs as close as possible together so if you know one - you can use the other and mixing them is often more convenient than either one separately.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The concern isn't with seeing the combinators but with the fact that your struct layout might not match your business logic to meet the needs of the combinators.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have cargo-hackerman and cargo-show-asm from public projects and some private ones, plus I went though how people tend to use the library - I don't think I saw any examples where things enforced by combinators don't agree with the business logic. Like if you want to parse some cargo options you probably want to parse it the same way in multiple subcommands and consume it as a whole in one place without passing the whole set of options. About 20-30 projects maybe? Would be great to see any examples where this isn't the case though. Will give some food for thoughts. Addng extra grouping to sort/validate stuff is easy: .group(Help::Foo).group(Meta::Bar).group(Validate::OneOf("baz")), etc. I just don't want to add any new primitives unless there's actual need for them - complexity, compilation time, makes design iterations harder.

varying backgrounds solving different problems and `clap` has taken input and
adapted to help meet a variety of needs.

`bpaf` is a younger project but it is able to move a lot more quickly because
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think at the moment I'd describe the requirements as parse pretty much anything created from opengroup base spec-looking things as base items and provide enough primitive building blocks to satisfy any/most of the requirements. Plus some common building blocks will go into batteries module - something that only uses public API and shows up often enough.

I went though open A-parsing tickets in clap repo and tickets go into one of three categories:

  • I'm not sure what the ticket is about (probably needs more careful reading)
  • ticket asks for non opengroup-like thing (bash +O shopt)
  • this can be done with existing combinators (plus unreleased adjecent, catch and leftovers).

Maybe I should go though closed tickets for more interesting examples, again - once I get my computer back and polish/publish current WIP stuff.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the ticket is about (probably needs more careful reading)

Feel free to ping me

ticket asks for non opengroup-like thing (bash +O shopt)

For clap, we are looking at solving this in two ways

  • We've split out the lexer and are looking to make it pluggable
  • We are opening up the argument ids (for us they are strings) so that a custom lexer could include the prefix and the user would then include that prefix in the argument id to get a match

Maybe I should go though closed tickets for more interesting examples, again - once I get my computer back and polish/publish current WIP stuff.

I'd recommend also branching out to see what other situations are.

Also, give a try emulating some more complex use cases, like more of git (diff's weird use of -- for "always last argument, stash's default subcommands, etc) or cargo with cargo run (can't think of other cases).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For non-opengroup things - I think it's going to be a user facing preprocessor - it should also cover some more strange usercases but keep the help generation in terms of processed names, so +O is accepted but help would be talking about -O unless there's some custom overrides for that which sould be already possible.

Will take a look at git, thanks for the tip. As for cargo run - I think I already implemented it as a part of unreleased https://github.com/pacak/bpaf/tree/master/bpaf_cauwugo - this gets dynamic completion available for workspace binaries/examples without having to fight cargo and mess with manually installing completions for all the things. Plus some other convenience things. But that's still some time away.


`bpaf` instead does context-free tokenization, storing all flags and values in
a list.
- Tokens that look like flags can only be considered flags, unless escaped with
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

req_flag to parse --exec, many + primitive used by leftovers (heh, need to export one more primitive) + catch to consume rm -rf and positional + guard to consume ;. Everything sitting inside adjacent, + map to focus on the vector. this would give a Vec<String> or Vec<OsString> depending if .os() is present on the leftovers's primitive or not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... Hey, leftovers is that primitive + .many, not adding any more new primitives then :)

I love rubber duck debugging.

@epage
Copy link
Contributor Author

epage commented Sep 10, 2022

@pacak If its ok, rather than filtering through what is for the future and what is a correction for now, I'm going to go ahead and merge as-is. Feel free to create any immediate PRs and then please let me know when its ready and we can post this on reddit.

@epage epage merged commit 5419d70 into rosetta-rs:main Sep 10, 2022
@epage epage deleted the trade branch September 10, 2022 01:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants