Skip to content
This repository has been archived by the owner on Jan 1, 2022. It is now read-only.

Iterating over ArgMatches #90

Open
epage opened this issue Dec 6, 2021 · 30 comments
Open

Iterating over ArgMatches #90

epage opened this issue Dec 6, 2021 · 30 comments

Comments

@epage
Copy link
Owner

epage commented Dec 6, 2021

Issue by neysofu
Friday Mar 09, 2018 at 12:04 GMT
Originally opened as clap-rs/clap#1206


Hi, I am using Clap in a little side project and I would like to suggest a minor feature. I am ready to implement it myself and send a PR if needed, I just need a few pointers. The feature at hand is an ordered iterator over ArgMatches: as far as my understanding goes, Clap doesn't allow to freely iterate over the parsed arguments. What if my application APIs are dependent upon the order of the arguments? GNU find comes to my mind. Calling index_of for every argument can be expensive if you have a lot of them.

If this feature were to be implemented, what would the return type and name of the method be?

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by kbknapp
Monday Mar 19, 2018 at 17:58 GMT


Sorry for the wait, I've been on vacation with my kids this past week :)

Do you mean iterating over only positional values, or over all args that have been used? Also, do you envision this iterating over non-matched args as well?

If you'd like to take a stab at implementing I'd be happy to mentor you! This is how I'd attack it:

  • Draw out how you see this working, including the questions I listed above (including any edge cases you can think of)
  • I start with some usage code (such as in a test) to make sure the ergonomics are correct/comfortable
  • Write some tests in tests/matches.rs for the various uses you've drawn up in step one
  • I then just test that file alone with cargo test --test matches or can further test just a single function in that file with cargo test --test matches -- name_of_function
  • Start the implementation (this will vary greatly depending on how you've designed it)

For the implementation once we have a good design I can give you some pointers as to where the changes you'll want to make are located / any particulars associated with those changes.

Once all your new tests pass, I'd run a full test suite cargo test --features "yaml vec_map wrap_help unstable" and make sure nothing broke.

If you need to see the debug output while you're building testing a single function use cargo test --test matches --features debug -- name_of_function

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by neysofu
Monday Mar 19, 2018 at 19:54 GMT


First of all, thanks for taking the time to address this! I hope you enjoyed your vacation.

The API I had in mind wouldn't be constrained only to positional arguments, but to all the arguments that have been used. Currently ArgMatches allows you to know which arguments have been used and their index in respect to all other arguments, yet it lacks a way to actually iterate over them.

I mean adding an API to access the order in which options are given. Ex.:

cmd --abc --xyz
cmd --xyz --abc

Let's say my command's behavior is specific to the order in which the options abc and xyz are given. As long as it's only two arguments I can get away with calling index_of for one of the two and figuring out which comes first. But what if I provide many another options? My idea is offer a method over ArgMatches that gives an iterator over the arguments in an ordered fashion:

arg_matches_1.iter_raw() == vec!["abc", "xyz"];
arg_matches_2.iter_raw() == vec!["xyz", "abc"];

This example is just off the top of my head and I am sure it could be improved.

Please correct me if I talked nonsense at any point because I am not very familiar with the API.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by kbknapp
Monday Mar 19, 2018 at 20:25 GMT


To make sure it's not an XY problem, how does your CLI work with the order of args? I've always seen conflicts/overrides as the main driving factor for when order matters. Both of which handled in the parsing logic.

The only other form I've seen is for things like include/exclude logic where there aren't really hard conflicts/overrides at the parsing level, but further up the execution stack there are. I've only seen this between a handlful of args at the most, and hence iterating over the indices isn't an issue.

This is of course not to say my list is exhaustive, I'm sure there are other use cases out there!

This actually wouldn't be a huge change for clap to implement this, but it would introduce an additional dep (such as indexmap ) and thus would need to be behind a non-default cargo feature.

The change you'd make is using cfg statements to change the hashmap to an indexmap and provide an ArgMatches method to simply return an Iterator of the requested info, perhaps as a tuple? (Such as (name, &[vals])?) This could be similar to how Arg::values_of works.

For the iterator itself, it'd fine to use the Values impl as a guide.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by kbknapp
Monday Mar 19, 2018 at 20:28 GMT


Another more lengthy option (but arguably more correct) is instead of returning an Iterator of some custom data type tuple, returning the MatchedArg itself. I say more lengthy because you'd have to document MatchedArg and provide the appropriate methods to get at it's internals such as values, occurrences, and indices. However, this is more flexible in the long run as you wouldn't need to add more ArgMatches methods in order to deal with different types of iteration...you'd just let the calling code deal with it has it wants.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by neysofu
Tuesday Mar 20, 2018 at 15:22 GMT


My application has a custom version subcommand that works this way:

// Prints human readable version if no options are given
$ cmd version
Cmd 2.42.1-rc1 (fj23o83b)

// Switches to line-by-line format if options are given, to facilitate machine readability. The requested data is printed in the specified order.
$ cmd version --tags --major
rc1
2

However there are many other use cases. Tasks execution order in a Gulp-like application, apply filters in a specific order in GNU find, ecc.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by neysofu
Tuesday Mar 20, 2018 at 15:24 GMT


This actually wouldn't be a huge change for clap to implement this, but it would introduce an additional dep (such as indexmap ) and thus would need to be behind a non-default cargo feature.

That's a pity. I thought it would be as simple as adding a method to ArgMatches, with no fundamental changes to the actual program.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by kbknapp
Wednesday Mar 21, 2018 at 00:33 GMT


Right now, clap uses a HashMap to hold the matches, which is inherently not in order. Switching to an IndexMap would be basically a drop in replacement, but because it's another dep I'd want people to have to opt-in to it.

So it basically would be just a cfg statement which adds the method on the matches struct and uses an indexmap instead of hashmap. And I'd create a struct to act as the iterator, and impl Iterator for it.

Something like (details left out for brevity):

#[cfg(not(feature = "indexmap"))]
struct ArgMatches {
    args: HashMap
}

#[cfg(feature = "indexmap")]
struct ArgMatches {
    args: IndexMap
}

impl ArgMatches {
    #[cfg(feature = "indexmap")]
    fn iter(&self) -> MatchIter {
        MatchIter { iter: self.args.iter() }
    }
}

#[cfg(feature = "indexmap")]
struct MatchIter {
    iter: // whatever IndexMap::iter returns
}

#[cfg(feature = "indexmap")]
impl Iterator for MatchIter {
    type Item = // whatever args.iter().next() returns

    fn next(&mut self) -> Self::Item {
        self.iter().next()
    }
}

So it's a pretty straight forward change.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by neysofu
Wednesday Mar 21, 2018 at 04:44 GMT


Thanks for the tips, I'll start working on a prototype then. 👍

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by neysofu
Saturday Mar 24, 2018 at 21:49 GMT


I feel hesitant adding an iter method to ArgMatches; it feels like a duplicate of indexmap's functionality. ArgMatches.args: IndexMap is already public, so what's wrong with the user doing matches.args.iter()?

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by neysofu
Saturday Mar 24, 2018 at 23:19 GMT


You can take a look at the prototype here. There is some more work to do, specifically:

  • write documentation for both the feature and the iter method;
  • remove the feature from the default set (it was a quick hack to make it build every time without passing arguments);
  • add some more tests.

Please let me know how you like the API and if you want any changes. Congrats on the codebase BTW, I know this is a small change but still, it is a pleasure to dive in.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by kbknapp
Sunday Mar 25, 2018 at 03:17 GMT


I feel hesitant adding an iter method to ArgMatches; it feels like a duplicate of indexmap's functionality. ArgMatches.args: IndexMap is already public, so what's wrong with the user doing matches.args.iter()?

My current policy is public, but undocumented internals are effectively private and should not be relied on. Sure, I can't stop anyone from using them directly, but there's the caveat that if I change them without warning it's considered a breaking change. Hence the #[doc(hidden)]

Most of the time public but undocumented is because it needs to be public because other mods (or eventually crates) need to touch it, but those mods (or crates) are maintained by myself or other clap members and thus it's essentially just a private API.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by kbknapp
Sunday Mar 25, 2018 at 03:22 GMT


Looks like a good start! It's a small nit, but let's change the feature name to iter_matches because iter is somewhat ambiguous and may conflict with other types of iteration we may do in the future.

Let me know if you have any other questions 👍

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by neysofu
Sunday Mar 25, 2018 at 05:18 GMT


I'll create a PR for discussing further details, sounds good?

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by kbknapp
Sunday Mar 25, 2018 at 19:21 GMT


Yep, I'd actually prefer it that way because it's easier to comment and review 😄 👍

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by XX
Saturday Apr 14, 2018 at 20:07 GMT


I also need this functionality

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by bradwood
Sunday Apr 26, 2020 at 19:57 GMT


Hi all, I'm also looking to use something like this. I'd like to process my args in a functional fashion. The PR, however, seems to have died a death. Given that 3.0 is on the horizon how much of a bad idea is it for me to just use matches.args.iter() in 2.x? Also, will this make it into 3.0?

PS, I don't need ordering.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by CreepySkeleton
Monday Apr 27, 2020 at 07:59 GMT


@bradwood Not as bad as you'd think: 2.x branch is frozen. I'd be really surprised if we change something regarding publicity of those fields because there is plenty of code out there relying on those, ugh. So, if you aren't planning on switching to 3.0 and you're OK with the fact that every time you use private API god kills a kitten, you can do that, yes.

Beware: those will be truly private in 3.0.

I suspect it won't make it into 3.0. The main question is: what are you going to iterate through? You can't have &Arg from ArgMatches because it stores Ids, basically a hash of those. On doesn't simply restore the Arg from the hash.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by bradwood
Monday Apr 27, 2020 at 12:32 GMT


i am trying to call map() on each arg I find to invoke the appropriate handler function based on the argument passed...

And, as regards the kittens, I'm cool with that! :)

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by CreepySkeleton
Monday Apr 27, 2020 at 12:39 GMT


clap has Arg::validator that is called on every value passed in. And of course, you can always iterate over std::env::args() and map it as you please; you don't really need clap for that.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by pksunkara
Monday Apr 27, 2020 at 14:48 GMT


I think I can see a certain use case for this where people want to do dependency injection pattern for evaluating their args.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by kbknapp
Monday Apr 27, 2020 at 15:37 GMT


[I] am trying to call map() on each arg I find to invoke the appropriate handler function based on the argument passed...

For a while I was floating around the idea of adding filter/map capabilities to arguments themselves, in a similar manner to the current validator. In essence it's just a chance for people to call a function when a particular argument/value is used and do something with its value persistently. Some use cases were, "I want my CLI to accept options yes|no but I want the value held my context struct to be some concrete type."

But others also wanted more a "handler" type functionality (which is something some python libraries offer). This can be done already by (hackily) using the validator functionality.

Currently, the mapping of values can be done pretty easily after the parsing in user code manually, but not during parsing.

I'm not 100% sure where I fall on the thinking of this one though, as is the added complexity and public API space worth the small benefit of doing something at parse time instead of manually afterwards?

Also perhaps more importantly, have users thought through what happens if a particular map/filter/handler whatever we'd call it has side affects, but parsing ultimately fails? Maybe that's just a, "hey probably best you don't have side affects, but if you do be aware it can have XYZ consequences." I mean you can already (mis)use the validators for handlers, so I'd be less inclined to include that functionality.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by CreepySkeleton
Monday Apr 27, 2020 at 17:02 GMT


I want my CLI to accept options yes|no but I want the value held my context struct to be some concrete type

Resolved by derive (and partially by value_of_t* methods).

I'm not sure I get your "handlers" idea, I imagine it's something akin clap-rs/clap#1185:

a single Arg::custom_parse which accepts a function that takes the arg/value currently being parsed, and returns an Option where None is skip this value (i.e. parse it as a different arg), or Some(T) is "save this to the matches" it could be the value as passed in, or it could be something different entirely (i.e. a map).

Some thoughts:

  • When is it ran, exactly? Before default values/environment variables were populated or after? What about required/conflicts stuff? How does it interact with old-style validators? With groups? Possible values?

    I'm not sure I'm ready to answer these questions.

  • We've already got the flush-related bug. The problem here that clap exits with sys::exit() which implies that no destructors are being accounted for, no finallizers are being run, the text still awaiting to be processed by IO facilities is lost, etc... When I'm reading the description above, I'm given impression that "heavy" side-effect functions are OK, so will be users.

In nutshell: I think this kind of tool is too vague and too powerful for a command line parsing library to provide. I don't think clap should be trying to accommodate every single CLI in existence, the number of well-supported designs is quite enough already.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by bradwood
Monday Apr 27, 2020 at 17:38 GMT


FWIW, I'm not needing these map()s to run during arg parsing, I'm happy to pass an iterator of matched arguments into a function pipeline. I'll take a look at the validator, as I've not really looked properly at that. It might work for my use case.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by CreepySkeleton
Monday Apr 27, 2020 at 17:47 GMT


@bradwood Out of morbid curiosity, what is the core problem you're trying to solve? Why do you think map is a good way to approach this kind of problem?

I suspect you're just "doing it wrong".

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by kbknapp
Monday Apr 27, 2020 at 18:48 GMT


Resolved by derive (and partially by value_of_t* methods).

Exactly. I'm describing a problem/issue that existed prior to derive macros. The "handlers" was just a common theme that popped up every so often, especially when people were moving from Python based CLIs which offer a "when this argument is used, run this function." Although that's a totally different way to build/design a CLI and thus I don't think it's really applicable to how clap runs.

I don't think clap should be trying to accommodate every single CLI in existence, the number of well-supported designs is quite enough already.

100% agree. I'm just providing some historic context 😉


@bradwood I agree with @CreepySkeleton it sounds like you may be approaching the problem from a particular design angle and we may be seeing it as an XY problem.

Some of the principal issues with a feature mapping over arguments (and their values) is:

  • For us (clap) to provide it, it needs to be general enough to fit all (or most) use cases, and with feature in particular there are a lot of things to consider and edge cases
  • If we're talking about true map (i.e. T->U), that can essentially already be done easily as @CreepySkeleton mentioned
  • If we're talking about handlers (i.e. perform some action for each argument); unless these actions are only blind side affects they:
    • need some sort of context passed to/from them (other args, maybe application context, etc.)
    • cannot rely on deterministic order or hierarchy
    • most likely need the concept of fallibility (what happens on failure)

Those final three bullets are some of what making this feature a general clap feature difficult. If you don't have any of those requirements and are truly only wanting blind side affects, I'd question if there may be a better way to represent the CLI.

If you can give some examples of what you're trying to do, we may be able to either point you in a direction or give you a way to accomplish that task with existing features.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by bradwood
Monday Apr 27, 2020 at 19:02 GMT


Heh. Given that I'm only 3 weeks in and coming from Python, I'm almost certainly doing it wrong! 😄

Here's the situation:

  • I have a subcommand with 1 positional and about 20 non-positional arguments (some are just flags, and some take 1 or more values).
  • I want to drive another builder API with arguments passed in via clap. There is (almost) a 1-to-1 relationship between clap argument and the other API's builder methods
  • I could do a massive chain of if statement (although maybe a match would be better) that goes something like the below pseudo-code
if arg_n passed then:
     builder = builder.fn_n(arg's value[s]);
 else if..
(repeat 30 times)

Instead, more as a academic exercise than anything else, I'm trying to figure out if there's a more elegant way of doing in in a functional style... something like:

    // NOTE: args.args is a public, but _undocumented_ field in clap.rs
    args.args.iter()
        .map(
            |(k,v)| println!("{} {:?}",*k ,v) //replace this with a mechanism that I can use to call the appropriate builder method
            ).count(); // I'm using this to drive the function pipeline, purely for the side effects, rather than for the resulting value (I accept that this is probably not idiomatic)

What the aforementioned mechanism is, I'm not quite sure (yet)... but I was hoping to figure out a way to look up the other API's builder method that I need to call using the name of the clap argument passed...

I wait with bated breath for you to give me some beautiful idiomatic pattern that'll make this into anything other than 30 if statements!

Thanks!

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by kbknapp
Monday Apr 27, 2020 at 19:17 GMT


Yours is not a super common case. Essentially, your CLI is a facade for a builder API? A few things about what you mention stood out to me (emphasis mine):

There is (almost) a 1-to-1 relationship between clap argument and the other API's builder methods

Unless you have a way of mapping those non 1:1 arguments, you'll end up with errors. So dealing with those errors is the fallible requirement I mentioned above.

  1. In your example you use println! which kind of hides the problem, but in your pythonesque example you have builder = builder.fn_n which means each of these maps/handlers/whatever need to receive context, mutate it, and send it to the next fn (in this case its the builder). This is the context sending I mentioned above.

Although it's not the 2 line map you're hoping for, the most idiomatic way to deal with this type of situation is via the clap derive macros. Your CLI would evaluate to some type, lets call it MyRawOpts, and your other builder would then impl From<MyRawOpts>. And by using some combination of macros you could decrease the verbosity of the From<..> impl to essentially a loop and map.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by kbknapp
Monday Apr 27, 2020 at 19:21 GMT


Although it's a more advanced topic, and probably defeats the purpose of what you're trying to do, you could combine your own Custom Derive macro, and do something like #[derive(Clap, IntoMyBuilderApi)] which would instead impl Into<MyBuilderApi> for MyRawOpts automatically, and using things like the syn crate loop through the struct fields of MyRawOpts simply applying them to your builder.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by bradwood
Monday Apr 27, 2020 at 19:27 GMT


I'm still reeling from everything you just wrote below the horizontal rule 2 comments above... 🤔. I'm going to have to do a whole lot of research on this, which is no bad thing... Thanks for your help (and patience) here!

As a final question, if I may, where would I look for examples of Clap and derive macros. Is this documented somewhere? I'm digging through the docs but not finding much on this topic (but I might not be looking in the right place).

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by kbknapp
Monday Apr 27, 2020 at 19:39 GMT


The clap derive macros are the new big feature of clap 3.x which is still in beta and thus not formally released yet. So the documentation is still lacking. The source is under the clap_derive crate in the root of this repository. You can look at the examples directory of clap_derive for some ideas.

If you'd prefer, you can use structopt which is well documented, until clap 3x is released. structopt uses clap 2.x internally, and all the clap 3.x macros are based on structopt (i.e. for clap 3.x we combined efforts between the two crates 😉).

For info on writing Custom Derive macros:

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

1 participant