Skip to content
This repository has been archived by the owner on Jan 1, 2022. It is now read-only.

clap_derive: Vec/Option<Vec> behavior is inconsistent with other types #148

Open
epage opened this issue Dec 6, 2021 · 28 comments
Open

Comments

@epage
Copy link
Owner

epage commented Dec 6, 2021

Issue by CreepySkeleton
Tuesday Mar 31, 2020 at 12:47 GMT
Originally opened as clap-rs/clap#1772


This is actually a summary of TeXitoi/structopt#364

Current behavior

  • Vec<T> and Option<Vec<T>> are not required = true by default.
  • Vec<T> is multiple = true by default which allows not only multiple values (--foo 1 2 3) but also multiple occurrences (--foo 1 --foo 2 3).
  • Option<Vec<T>> additionally allows zero number of values (--foo).

What's wrong

  • The fact that Vec<T> is not required by default is inconsistent with all the other types in clap_derive that are required by default unless they are wrapped in Option (except bool but it's a very special case).
  • The fact that Option<Vec<T>> is different from Vec<T>, different not in "not required" sense, confuses newcomers.
  • The fact that Vec<T> allows multiple occurrences along with values is misleading.

Proposal

  • Use min_values = 1 for both Option<Vec<T>> and Vec<T> instead of multiple = true, allowing only non-zero number of values and disallow multiple occurrences (--foo 1 2 but not --foo nor --foo 1 --foo 2). If a user wants to allow zero values or multiple occurrences as well, they can explicitly specify it via min_values = 0 and multiple = true respectively.
  • Use required = true for Vec<T>.

cc @TeXitoi @Dylan-DPC @pksunkara

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by TeXitoi
Tuesday Mar 31, 2020 at 13:25 GMT


The fact that Vec is not required by default is inconsistent with all the other types in clap_derive that are required by default unless they are wrapped in Option (except bool but it's a very special case).

I strongly disagree on this. A vector can have 0 elements, and that's the most used case. If you need at least one, you can add min_values(1) yourself.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by CreepySkeleton
Tuesday Mar 31, 2020 at 13:56 GMT


If we use min_values(0), an empty vec would represent --foo <no values>, but "no --foo" would be disallowed due to required.

Option<Vec<T>> would represent just what it does currently:

command Vec<T> result Option<Vec<T>> result
app --foo 1 2 [1, 2] Some([1, 2])
app --foo [] Some([])
app ERRROR None

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by pksunkara
Sunday Apr 12, 2020 at 22:19 GMT


The min_values and occurrences issue should be fixed in #1026. Let's leave it out of the discussion here.

The fact that Option<Vec> is different from Vec, different not in "not required" sense, confuses newcomers.

Can you explain what this means? I don't understand the sentence.

I think we should have min_values = 0 and multiple = true for both but required = true for Vec<T>.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by TeXitoi
Monday Apr 13, 2020 at 08:04 GMT


I really disagree. Most of the time you're using a Vec, you just want the list of parameters, and you don't really care if an empty parameter was provided. Forcing using an option of vec in the most common case would impact a lot the ergonomics.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by cecton
Friday Apr 17, 2020 at 07:11 GMT


Here is a use case where I find structopt very confusing: paritytech/substrate@6b0eed4

When I read the struct I see that listen_addr is a required argument and port is optional. But the structopt parameters shows me that both arguments are conflicting. Before that commit, the parameters weren't declared as conflicting and I had no way to know that listen_addr was actually optional. The only way I could know that is by reading the table.

I personally do think that Vec<_> should mean required (the argument must be provided) and Option<Vec<_>> should mean optional (the argument can be omitted).

I invite you to read the lengthy discussion here.

Forcing using an option of vec in the most common case would impact a lot the ergonomics.

Maybe but it will be consistent with the rest of the API.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by CreepySkeleton
Saturday Apr 18, 2020 at 10:06 GMT


Pff, it took me some time to get back to, sorry for the delays guys.

The min_values and occurrences issue should be fixed in #1026. Let's leave it out of the discussion here.

That issue is related here indeed, but we can't leave it out of this discussion because this is about defaults in derive, not about distinguishing between the two.

Forcing using an option of vec in the most common case would impact a lot the ergonomics.

@TeXitoi I see your point about Option<Vec<T>> vs Vec<T> ergonomics. I agree with you, simple Vec is much more handy than Option<Vec<T>> to work with.

But we wouldn't be forcing anything! Users would still be very much able to do #[clap(required = false)] if they want to keep a raw Vec

struct Opts {
    // This Vec is not option so it's required BY DEFAULT
    // because it's not bool and not Option
    required_vec: Vec<u32>,

    // We can override default behavior 
    // while keeping the raw Vec 
    #[clap(required = false)]
    not_required_vec: Vec<u32>,

    // We can use Option<Vec> instead of overriding
    // We lose some of the ergonomics but acquire the 
    // ability to distinguish between <empty list> an <no option at all>
    opt_vec: Option<Vec<u32>> 
}

In other words, this is question of good defaults, and I believe I've found the answer.

I think that our willingness of changing defaults should depend of what the most widespread usage is in practice. I decided to check via grep.app and this has been eye opening for me:

(IMPORTANT: this research assumes that #[structopt] attributes are one-line which is is quite frequently not true. If somebody has an idea how to cover

#[structopt(
 method = name,
 method2 = expr, method3 = expr)

attributes with regexes, speak up!)

  • required Vec or not.


    Conclusion: current default is exactly what most of our users need (~85%). Changing it would be unwise.
  • min_values = 0 vs min_values = 1 (behavior of multiple = true)

    • I searched "contains min_values" and there are only 13 results. None of them are min_values = 0. A little bonus: almost all of them are min_values = 1 which is the default behavior, lol.

    Conclusion: nobody wants `min_values = 0`. Making it default would be pointless.
  • Multiple values vs multiple occurrences. Default behavior for now is both. This is kind of thing that is hard to grep for, so I tried to figure out how many users are aware of multiple occurrences in the first place:

    • Contains number_of_values - 11 results.
    • Contains max_values: 1 result.
    • Contains multiple = true - 11 matches.


      This is all I've managed to come up with. Looks like there are not many people out there who are aware or want multiple occurrences, but the results here are probably very inaccurate. I vote for using multiple_values = true and ask the community after the beta is out.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by pksunkara
Saturday Apr 18, 2020 at 11:00 GMT


I think that our willingness of changing defaults should depend of what the most widespread usage is in practice.

I don't agree with this. Maybe all of them tried Option<Vec<T>> first and then decided that it didn't make a difference so they optimised it back to Vec<T>.

Also, we are not technically breaking this because this a separate library clap and not structopt. I would opt for it easier to understand w.r.t semantics in how to use the types for the fields and being different with each other rather than existing usage.

min_values should always be 0. As you saw, people saw Vec and with semantics they thought min_values might have been zero and that is why they customised the behaviour to be min_values = 1. (Related to #1682)

Note: multiple means multiple occurrences not values according to clap docs

Type required multiple
Vec<T> true false
Option<Vec<T>> false false
Vec<Vec<T>> true true
Option<Vec<Vec<T>>> false true
command Vec<T> Option<Vec<T>> Vec<Vec<T>> Option<Vec<Vec<T>>>
app -f 1 2 [1, 2] Some([1, 2]) [[1, 2]] Some([[1, 2]])
app -f [] Some([]) [[]] Some([[]])
app ERRROR None ERROR None
app -f 1 2 -f 3 4 [3, 4] Some([3, 4]) [[1, 2], [3, 4]] Some([[1, 2], [3, 4]])
app -ff [] Some([]) [[], []] Some([[], []])

By default, if one wants to support a -vvv kind of flag, they would have to use Option<Vec<Vec<bool>>>. But they always have the option of customizing stuff just by describing it using methods like how you already proposed.

#[clap(short, multiple = true, required = false, max_values = 0)]
verbose: Vec<bool>

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by CreepySkeleton
Saturday Apr 18, 2020 at 12:45 GMT


Maybe all of them tried Option<Vec> first and then decided that it didn't make a difference so they optimised it back to Vec.

Sounds very unlikely, but maybe. @TeXitoi , has anybody ever told you something like "I tried Option<Vec> first, but settled with Vec because it's handy"? I remember only one such message: TeXitoi/structopt#285 .

Also, we are not technically breaking this because this a separate library clap and not structopt

I agree, absolutely. But we aren't talking just about breaking changes, we're talking about good defaults and whether it's worth to change them. If we find that the current defaults are not optimal, we shall change them, otherwise we shall leave them as is.

As you saw, people saw Vec and with semantics they thought min_values might have been zero and that is why they customised the behaviour to be min_values = 1

Eleven (min_values = 1) people out of 214 (total number of Vecs discovered). Very few people have been confused, so min_values = 1 default is preferable for Vec and min_values = 0 is preferable for Option<Vec>. In my opinion.


I like your Vec<Vec<T>> idea. We need to fix clap-rs/clap#1026 first though. Also, in the light of clap-rs/clap#1682, we may also specialize Vec<[T; N]> (arrays) and Vec<(T1, T2, ...)> (tuples) along with the Option<...> variants of them.

By default, if one wants to support a -vvv kind of flag, they would have to use Option<Vec<Vec>>

This is exactly what parse(from_occurrences) is for, isn't it?

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by CreepySkeleton
Saturday Apr 18, 2020 at 12:46 GMT


Note: multiple means multiple occurrences not values according to clap docs

This part of reply ended up being literally drenched in sarcasm, an therefore was moved to it's own comment. I'm not blaming anyone and neither I mean to offend anybody. Just stating that the state of affairs is ridiculous.

Sure thing. According to 2.x docs. Well, let's test it! Playground

extern crate clap;

use clap::{App, Arg};

fn main() {
    let m = App::new("app")
        .arg(Arg::with_name("arg").multiple(true).takes_value(true))
        .arg(Arg::with_name("opt").long("foo").multiple(true).takes_value(true))
        .get_matches_from(&["test", "val1", "val2", "--foo", "optv1", "optv2"]);
        
    println!("{:?}", m.values_of("arg").unwrap().collect::<Vec<_>>());
    println!("{:?}", m.values_of("opt").unwrap().collect::<Vec<_>>());
}

So, we expect it to error, right?

["val1", "val2"]
["optv1", "optv2"]

Wait, what?

In practice, I've seen a lot of people taking multiple as synonym for "multiple values". And this is how it does work: multiple values + multiple occurrences. This is the way it's been working for, how long? Five years? People got used to it, it's in their DNA now. We can't just change the behavior (while preserving the name of the method) and expect it to be received well. And we didn't.

Well, the master is much better! Right? Right. You'd expect it to be documented properly...

Specifies that the argument may have an unknown number of multiple values. Without any other
settings, this argument may appear only *once*.

For example, `--opt val1 val2` is allowed, but `--opt val1 val2 --opt val3` is not.

Of course. "May appear only once". Crystal clear. What the function looks like?

pub fn multiple(mut self, multi: bool) -> Self {
    if multi {
        self.setb(ArgSettings::MultipleOccurrences); // <- YOU MUST BE KIDDING
        self.setting(ArgSettings::MultipleValues)
    } else {
        self.unsetb(ArgSettings::MultipleOccurrences);
        self.unset_setting(ArgSettings::MultipleValues)
    }
}

Sarcasm aside, I very much like how it's done in master:

// multiple values + multiple occurrences
pub fn multiple(mut self, multi: bool) -> Self;

// multiple values only
pub fn multiple_values(self, multi: bool) -> Self;

// multiple occurrences only
pub fn multiple_occurrences(self, multi: bool) -> Self;

Well done. Well done indeed. No sarcasm, the API is clear and tidy. Except copy-pasting docs and examples from 2.x probably wasn't the brightest idea.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by TeXitoi
Saturday Apr 18, 2020 at 12:53 GMT


Option<Vec> is a very recent feature. No one asked anything about Option<Vec> for several years.

I've never seen any question about Option<Vec> except from @cecton as far as I remember. Just one but report saying that Option<Vec> was a breaking change because someone was using Option<Vec<u8>>.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by CreepySkeleton
Saturday Apr 18, 2020 at 13:09 GMT


Also, the last like of @pksunkara 's table is impossible unless user explicitly specifies takes_value = false. All of the variants should be takes_value = true by default (multiple_values implies it).

app -ff [] Some([]) [[], []] Some([[], []])

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by pksunkara
Saturday Apr 18, 2020 at 13:56 GMT


In that case, assume that I meant multiple_occurrences only when I said multiple in my earlier comment.

What do you mean by takes_value = true by default? Does all the types we mentioned above has that as default? Then doesn't it mean that app -f is not allowed too? Vec<T> would never be allowed 0 values.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by CreepySkeleton
Saturday Apr 18, 2020 at 15:25 GMT


Does all the types we mentioned above has that as default?

In clap 2: it does so explicitly, this is documented:

Setting multiple(true) for an option with no other details, allows multiple values and multiple occurrences because it isn't possible to have more occurrences than values for options

In clap 3: not explicitly, I haven't been able to locate the place this is handled in, but I just checked this program:

use clap::{App, Arg};

fn main() {
    let m = App::new("app")
        .arg(Arg::with_name("opt").long("foo").multiple_values(true))
        .get_matches();

    println!("{:?}", m.values_of("opt").map(|v| v.collect::<Vec<_>>()));
}

And yes, it works just like if takes_value(true) has been set:

D:\workspace\probe>cargo run --
    Finished dev [unoptimized + debuginfo] target(s) in 0.13s
     Running `target\debug\probe.exe`
None

D:\workspace\probe>cargo run -- --foo val1
    Finished dev [unoptimized + debuginfo] target(s) in 0.14s
     Running `target\debug\probe.exe --foo val1`
Some(["val1"])

D:\workspace\probe>cargo run -- --foo val1 val2
    Finished dev [unoptimized + debuginfo] target(s) in 0.13s
     Running `target\debug\probe.exe --foo val1 val2`
Some(["val1", "val2"])

app -fff from the table would have been parsed as (also checked with master)

app -f ff
    -- --
  flag value

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by pksunkara
Wednesday Apr 29, 2020 at 08:41 GMT


@kbknapp Since we have you here, please take a look at this issue.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by CreepySkeleton
Wednesday Apr 29, 2020 at 12:26 GMT


Actually, if you wrap your head around the terminology a bit, it becomes quite clear:

  • Non-special types are required by default. (For instance, String is not special)
  • bool is an example of special type, it's not required by default,
  • Vec, Option, arrays, tuples and such can be considered "adapter" for other types. Precise behavior depends on adapter and the type it wraps.

It's a problem of good documentation more than anything, about the way it conveys this setup to user. I'm amending my previous statement about required with Vec, I see no problem with it anymore. Points about number_of_values and multiple_occurrences vs multiple_values still stand.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by kbknapp
Wednesday Apr 29, 2020 at 16:22 GMT


Whew ok that was a read!

Ever time I see issues pop up around multiple I cringe a little inside. In clap 1 there was only the concept of multipe = true, and I only ever really cared about -f v v -f v v (i.e. both occurrences and values) for no other reason than having not put a lot of thought into it.

Then clap 2 came around and there was confusion when people wanted multiple values, but not multiple occurrences, or vice versa. It was also over-loaded with flags which only occurrences are possible. This spawned things like number_of_values or min_values/max_values. Again, it was super confusing for the subset of people that had these various requirements, but most just adapted and moved on.

Looking back, I wish I'd made the sane default of multiple=true and no other info being -f v -f v is allowed -f v v is not. Which is what I attempted to make the default in 3.x because this was by far the most common request, most people expected -f v -f v (minus people mostly familiar with IRC like CLIs which was a minority). It also no longer required the number_of_values=1, which in all honest looks kind of confusing if you're not already familiar with clap...."Why am I say I want multiple values, AND also saying number of values is 1?!"

I also wanted to use the 3.x breaking changes to fix this confusion once and for all. I have a much better idea of what is most common. I want the defaults to be the most common case, and allow those who need something different be able to do so.

Unfortunately before I went on hiatus I didn't finish the docs which I'm sure has lead to even more confusion around the various types of multiples, and their defaults or interactions with each other.

I see multiples as one of those things, "It sounds sooo simple at first, why can't it just work!?" Until you dig into the details and realize, "Ooooh. Yeah I can totally see how some would want X while others need Y, and you need to distinguish between them sometimes, but not always." It's kind of like Strings in that manner; such a simple concept yet so hard to do correctly. ...Or CLIs for that matter 😜

Ok, so back to the topic at hand:

I find myself agreeing with @CreepySkeleton here. We're talking about sane defaults. I can also totally respect @TeXitoi's sentiment of Option<Vec<T>> is much more of a pain to deal with than Vec<T>. However I disagree that all options should default to 0 values. In fact, I've come across very few CLIs that rely on --foo <no value>. They do exist for sure, but most I've run into either expect a value, or expect no argument to be used (with or without a default value).

So I think this comes down to balancing Rust ergonomics and semantics (Vec<T>, where all Vecs can have zero values by default) with clap ergonomics and semantics (Option<T> means optional argument, thus Option<Vec<T>> means optional option which accepts multiple values...combined with in clap speak all options by default must have at least 1 value). I think it's perfectly reasonable to impose clap defaults/semantics here. People are building a struct specifically around claps semantics, so I think that although Vec<T> can totally have zero or more values by default the fact that it does not lining up with claps 1 or more values by default is fine and people will first look to their knowledge about clap semantics when building these structs. This is in part because there is no intrinsic Rust type that is one or more values.

I would suggest simply calling it out in the docs directly, that although the default is Vec<T> means required = true, if you want the ergonomics of Vec<T> just override the required attribute with false

Also, perhaps I'm out of touch, but if I wanted to allow multiple values with occurrences via derive in clap I'd instinctivily try to do Option<Vec<Vec<T>>>

command type
-f v -f v Some([v, v])
-f v v Some([v, v])
-f v v -f v v Some([[v, v], [v, v]])

app -fff from the table would have been parsed as [-f=ff]

Correct. That is intended behaviour due to how parsing is structured. I don't think there is anything we can, or would want to do about this.

It's a problem of good documentation more than anything, about the way it conveys this setup to user.

100% agree.


Having said all that, if we settled on Vec<T> defaulting to not-required because Rust ergonomics are more important, it's not a huge issue to me so long as it's documented very clearly.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by sourcefrog
Thursday Jun 04, 2020 at 18:49 GMT


I just came here from TeXitoi/structopt#396 and wanted to say, I really hope in a v3 breaking change, Clap can address the multiple / number_of_values confusion, both for the API and for the new derive macros.

Looking back, I wish I'd made the sane default of multiple=true and no other info being -f v -f v is allowed -f v v is not. Which is what I attempted to make the default in 3.x because this was by far the most common request, most people expected -f v -f v (minus people mostly familiar with IRC like CLIs which was a minority). It also no longer required the number_of_values=1, which in all honest looks kind of confusing if you're not already familiar with clap...."Why am I say I want multiple values, AND also saying number of values is 1?!"

+100 this bit. It really sticks out as a trap in what's generally a friendly API, and is exacerbated by the fact that you can easily not notice the bug is there.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by epage
Friday Jul 09, 2021 at 20:04 GMT


Catching up on this issue. Sorry if I missed something or am just repeating others.

Multiple values: From a code author and user perspective, I have found variable number of values (-f 1 and -f 1 2) to be confusing and brittle. In the spirit of "pit of success" (make the right things easy), this should require extra work to opt-in to. I'd go as far as saying that the user should always be required to specify an attribute for this

Delimited values: When I want to support multiple values, I tend to do it with a delimiter (-f 1,2 or Rust's --features "foo bar"). If Rust support &'static str in const-generics, we could easily provide a DelimitedValues container for users that implements FromStr, IntoIterator, etc.

When I'm writing a CLI, I tend to prefer multiple occurrences with delimited values all flattened, There tends to be some friction in this process, so I tend to only do it when I really need it. When I do skip on one, I skip on delimited values, preferring multiple occurrences. It tends to be more composable. If I need require at least one value, it is across both occurrences and delimited values.

The fact that Vec is not required by default is inconsistent with all the other types in clap_derive that are required by default unless they are wrapped in Option (except bool but it's a very special case).

So far, the talk seems to be about Option being the way of specifying required and anything else is a deviation provided for ergonomics (Vec, bool).

Something being left out is alternative mental models, like focusing on the type's semantics, for which Option and Vecs behavior could be made consistent.

So let's break down some type semantics:

  • i32: single value (N)
  • Option<i32>: 0 or 1 values (?)
  • Vec<i32>: 0 or more values (*)
  • (i32, i32): exactly two values (N) (never seen this documented but we'll talk about it later)

(we don't have a native type for "1 or more" (+), though there is the vec1 crate and ranged number of values ({M,N}) is too specialized to worry about)

And with the notes at the top, our considerations are:

  • Variable number of values is brittle and we should at least not make too easy, maybe even make it difficult.
  • It takes extra work to customize the default parsing logic (parse attribute) for natively supported types
  • I'm not aware of a way to plug in a type that behaves like Option, Vec, or bool

My proposal would be to match the type semantics, focusing on occurrences:

  • Option<i32>: 0 or 1 occurrence
  • Vec<i32>: 0 or more occurrences
  • i32: 1 occurrence
  • (i32, i32): multiple values (fixed number), this is a weird one because fixed number of occurrences is rare enough not to provide but fixed number of values seems common enough. Python's argparse goes as far as automatically handling this with type=(str, str) and metavar=("NAME", "VALUE") when you say nargs=2.
  • attribute: multiple values (variable number)

Note: one downside to implementing any intuitive behavior is people have to predict both what was intuitive to you and how complete your implementation is. No-intuitive behavior is fully explicit and predictable. A theoretical fully-intuitive system would be, by definition, predictable. When you land in the middle, people feel some uncertainty. How quickly that uncertainty is dispelled is proportionate to the quality of your library(1) in framing (API design, docs explaining) (2) making it quick to answer questions (reference). As an aside, I think structopt has let me down on both of these fronts and hopefully we'll iterate and improve on this. This doesn't mean we should never try to make things intuitive, we should just avoid cleverness and recognize two reasonable people will have different opinions on where things land with this.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by pksunkara
Saturday Jul 10, 2021 at 22:30 GMT


Tuples and other adapters can be handled/discussed in clap-rs/clap#1717. What we need to finalize in this issue are the Option and Vec combination adapters.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by epage
Saturday Jul 10, 2021 at 23:24 GMT


To clarify, my intent was not to pull #1717 into this but to look at the problem holistically to see how it can affect decisions made here.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by pksunkara
Saturday Jul 10, 2021 at 23:26 GMT


I think we need to first decide on a table like https://docs.rs/structopt/0.3.22/structopt/#type-magic

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by epage
Thursday Jul 22, 2021 at 15:29 GMT


My hope is to look at the problem holistically, defining principles for us to
apply to clap_derive, and then the work will be split up among the respective Issues.

Background

  • #1772: Confusion over Option vs Vec
  • #2599: Confusion over multiple_occurrences with bool
    • Current solution is to use #[clap(parse(from_occurrences)]
    • Issue Suggestion: Option<bool> / Vec<bool>
  • #1717: Support tuples in derive

structopt:

Type Added method call to clap::Arg
bool .takes_value(false).multiple(false)
Option<T: FromStr> .takes_value(true) .multiple(false)
Option<Option<T: FromStr>> .takes_value(true) .multiple(false).min_values(0).max_values(1)
Option<Vec<T: FromStr> .takes_value(true) .multiple(true).min_values(0)
Vec<T: FromStr> .takes_value(true) .multiple(true)
T: FromStr .takes_value(true) .multiple(false).required(!has_default)

Definitions:

  • multiple: Specifies that the argument may appear more than once.
    • Setting multiple(true) for an option with no other details, allows multiple values and multiple occurrences because it isn't possible to have more occurrences than values for options.

clap_derive post-3.0.0-beta.2

Type Added method call to clap::Arg
bool (default)
Option<T: FromStr> .takes_value(true)
Option<Option<T: FromStr>> .takes_value(true).multiple_values(false).min_values(0).max_values(1)
Option<Vec<T: FromStr> .takes_value(true).multiple_values(true) .min_values(0)
Vec<T: FromStr> .takes_value(true).multiple_values(true)
T: FromStr .takes_value(true) .required(!has_default)

Definitions

  • multiple_values: Specifies that the argument may have an unknown number of multiple values.
    • Without any other settings, this argument may appear only once.
  • multiple_occurrences: Specifies that the argument may appear more than once.
    • For example, --opt val1 --opt val2 is allowed, but --opt val1 val2 is not.

History

  • multiple_occurrences was introduced in 6fc70d8
  • multiple() was removed in Removed Arg::multiple clap-rs/clap#2538
    • It set both multiple_occurrences and multiple_values
    • WARNING: Examples went to a mixture of multiple_values vs multiple_occurrences, are they correct / best practices?
  • 48f83e0 got rid of the extra builder methods when mapping types to arg settings

Potential Considerations

Brevity: what gets the shortest code

Obvious code: is the intent clear?

Obvious API: can I predict how the API will behave (i.e. how much do I need docs?)?

  • Orthogonality / Composability: are there a few key principles that other cases are derived from?
  • Predictability: Can I look at one case and know how to apply it to others?

Pit of Success: is the path of least resistance following best practices? Are problematic practices discouraged through resistance?

Features: Are all supported features possible to use?

Toolbox

Types

  • bool
  • Option (0 or 1 values)
  • Vec (0.. values)
  • vec1 crate (1.. values)
  • Tuple (N values, various types)

Features

  • Grouped values
  • Values
  • Occurrences

clap_derive (proposed)

Type Effect Added method call to clap::Arg
bool true if the flag is present (default)
Option<T: FromStr> 0 or 1 occurrences .takes_value(true)
Vec<T: FromStr> 0+ occurrences .takes_value(true).multiple_occurrences(true)
T: FromStr 1 occurrence .takes_value(true) .required(!has_default)
(A: FromStr, B: FromStr, ...) N values .takes_value(true).number_of_values(N).required(!has_default)

Notes

  • Tuples: technically usable with Option and Vec, just not listed

Motivation:

  • multiple_occurrences is assumed to be what people expect by default
  • multiple_values without number_of_values or require_delimiter is brittle and should only be
    used with intention, knowing of risks
    • Clap's documentation warns about
      • unintended side effects with sub-commands
      • unintended side effects with positional arguments
    • Additionally, there can be ambiguity with values that start with -
  • Vec<Vec<_>>, Option<Vec<_>>, and Option<Option<_>>> don't compose (you can't use them separately to get the same individual meanings)
  • Vec<_> already conveys optional to rust developers
    • We can define consistency by "this ArgSetting maps directly to this flag"
      or by "this Rust type implies this behavior". One is focused on
      implementation which is more opaque to users while the other is focused on
      the user's view
  • While there has been some confusion over structopt and clap_derives behavior, I'd posit that most of it stems from involving multiple_values, and with that gone, there will be less.

Potential future improvement: I have some hand-wayy ideas for how we can improve the
experience by allowing derives to define built-in or user-provided "actions",
see clap-rs/clap#2615

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by pksunkara
Friday Aug 13, 2021 at 00:06 GMT


Ignoring the tuples part of the discussion for now.

Vec<Vec<>>, Option<Vec<>>, and Option<Option<_>>> don't compose (you can't use them separately to get the same individual meanings)

I am afraid I don't understand what you meant here.

Vec<_> already conveys optional to rust developers

I don't agree. While vec does convey that it can be empty, that's not exactly same semantics here. Here, when you mean optional, the semantic is whether vec should be present in the first place.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by epage
Friday Aug 13, 2021 at 16:55 GMT


Vec<Vec<>>, Option<Vec<>>, and Option<Option<_>>> don't compose (you can't use them separately to get the same individual meanings)

I am afraid I don't understand what you meant here.

Vec is taking on two meanings, multiple values and multiple occurrences, depending on the level of embedding. A user cannot choose one over the other. Instead of the API providing basic building blocks, each with their own meaning, that the use can put together as they wish that makes on-boarding easy, it instead has contextual meanings based on how the building blocks are used that we have to explain and the user has to grok, whether they use our docs or not.

Vec<_> already conveys optional to rust developers

I don't agree. While vec does convey that it can be empty, that's not exactly same semantics here. Here, when you mean optional, the semantic is whether vec should be present in the first place.

I'm not quite following this logic.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by pksunkara
Saturday Aug 14, 2021 at 01:13 GMT


Vec<_> already conveys optional to rust developers

I'm not quite following this logic.

Yes, it conveys that it can be empty. But that's the semantics for the values of the arg. It doesn't say anything about whether the arg is required or optional. (app vs app -f vs app -f val). Which is why we need Option<Vec<_>>

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by pksunkara
Saturday Aug 14, 2021 at 01:14 GMT


Vec is taking on two meanings

I agree, but that is what I want us to resolve here. Provide easy way to do Vec<Vec<_>> with good defaults.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by epage
Wednesday Oct 06, 2021 at 17:17 GMT


Since I found some of the original structopt PRs, thought I'd note them here in case we can mine useful historical information

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by epage
Friday Nov 05, 2021 at 14:24 GMT


Before:

  • bool: a flag
  • Option<_>: not required
  • Option<Option<_>> is not required and when it is present, the value
    is not required
  • Vec<_>: multiple values, optional
  • Option<Vec<_>>: multiple values, min values of 0, optional

After:

  • bool: a flag
  • Option<_>: not required
  • Option<Option<_>> is not required and when it is present, the value
    is not required
  • Vec<_>: multiple occurrences, optional
    • optional: Vec implies 0 or more, so should not imply required
  • Option<Vec<_>>: multiple occurrences, optional
    • optional: Use over Vec to detect when no option being present when
      using multiple values

Motivations:

My priorities were:

  1. Are we getting in the users way?
  2. Does the API make sense?
  3. Does the API encourage best practices?

I was originally concerned about the lack of composability with
Option<Option<_>> and Option<Vec<_>> (and eventually Vec<Vec<_>>).
It prescribes special meaning to each type depending on where it shows
up, rather than providing a single meaning for a type generally. You
then can't do things like have Option<_> mean "required argument with
optional value" without hand constructing it. However, in practice the
outer type correlates with the argument occurrence and the inner type with the
value. It is rare to want the value behavior without also the
occurrence behavior. So I figure it is probably fine as long as people
can set the flags to manually get the behavior they want.

Vec<_> implies multiple occurrences, rather than multiple values.
Anecdotally, whenever I've used the old Arg::multiple, I thought I was
getting Arg::multiple_occurrences only. Arg::multiple_values,
without any bounds or delimiter requirement, can lead to a confusing
user experience and isn't a good default for these. On top of that, if
someone does have an unbounded or a delimiter multiple values, they are
probably also using multiple occurrences.

Vec<_> is optional because a Vec implies 0 or more, so we stick to
the meaning of the rust type. At least for me, I also rarely need a
required with multiple occurrences argument but more often need optional
with multiple occurrences.

Option<Vec<_>> ends up matching Vec<_> which can raise the question
of why have it. Some users might prefer the type. Otherwise, this is
so users can detect whether the argument is present or not when using
min_values(0). Rather than defining an entire policy around this and
having users customize it, or setting min_values(0) without the rest
of a default policy, this gives people a blank slate to work from.

Another design option would have been to not infer any special-type
settings if someone sets a handful of settings manually, which would
have avoided the confusion in Issue #2599 but I see that being confusing
(for someone who knows the default, they will be expecting it to be
additive; which flags disable inferred settings?) and brittle (as flags
are added or changed, how do we ensure we keep this up?).

Tests were added to ensure we support people customizing the behavior to
match their needs.

For now, I am leaving off:

  • Vec<Vec<_>>, see #2924
  • (T1, T2), Vec<(T1, T2)>, etc, see #1717
  • Vec<Option<_>> and many other potential combinations

To see this in action, see clap-rs/clap#2993

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

1 participant