Improve how we infer behavior from field types #4626

tmccombs · 2023-01-11T08:24:37Z

As brought up in #4600 and #3661, there are cases where it would be nice to support additional types with special semantics for clap_derive fields. However, doing so for any existing isn't backwards compatible, because users might have custom value parsers that produce those types. In addition, currently in order to opt out of special behavior for field types of Opt or Vec, you need to fully qualify the paths for those types (ex. ::std::vec::Vec), which is both undocumented, and non-obvious. It would be nice to have a consistent, easy to understand way to specify how 0 or more values for an option are aggregated into a single type.

In particular here are some cases that are currently difficult or impossible to include in a Parser derived using clap_derive:

Vec<Vec<T>> containing the values passed in, grouped by occurrence
[T; N], where num_args is set to N
Vec<[T; N]> would be a combination of the two above (with num_args set to N, but action set to Append)
Result<T, SomeError> for an optional argument (like Option<T>, but return some default error if missing)
Probably others

The current types with special behavior are described here. It should also be noted, that if you specify an action of Count with an integral field type, also sort of fits in this category.

Requirements

Derive code continues to be easy to write and read
Can both configure the Arg and extract the value from ArgMatches for our specialized types, including
- Option<T>
- Option<Option<T>>
- Vec<T>
- Option<Vec<T>>
- Vec<Vec<T>> (v5)
- Option<Vec<Vec<T>> (v5)
- bool
- ()
- [T; N] (clap_derive: support fixed arrays as shortcut for number_of_values = N #1682)

Options

Add semantics for additional types in a breaking change

This perhaps the most straightforward way to do it. But it has several downsides:

It requires breaking changes, which might not be caught by the compiler
It adds more "magic" to how types communicate information to the derive macro
It increases the importance of being able to opt

Add an additional argument to the arg attribute

We could add one or more additional arguments to the arg attribute on fields.

There are a few ways this could be done:

Have a different argument for each kind of semantics, such as optional, all_values, optional_values, occurrences, optional_occurrences, etc.
Add a single new argument that is given an enumeration value that is effectively the same as the current clap_derive::utils::ty::Ty enum (although we might want to use more descriptive names)
Add a single new argument that is passed an instance of a trait that specifies how to extract the value from the ArgMatches, see User-extensible trait
Add a single new argument that opts in to treating all supported standard types specially.

1-3 have the downside that either the attribute must agree with the type of the field, or there has to be some magic conversion of the field type.

4 would make it difficult to express sitiutations Option<Vec<MyType>> where the value parser parses a Vec<MyType>, but the Option is because the option is optional. It also still has the backwards
compatibility problem if we add anything new in the future.

User-extensible trait

For 3 above, we can define a trait that encapsulates the wanted behavior, and give the derive macro an implementation of that trait. While it can be defined for standard types in clap_derive, it would
also be possible to allow users to create their own implementations, opening up additional ways to expand this functionality in other crates.

Note: bikeshed on the actual names

trait AggregateParser {
  type Aggregate;

  /// Make any necessary changes to the `Arg` before it is used in a `Command`
  fn modify_arg(&self, arg: clap::Arg) -> clap::Arg {
    arg
  }

  fn get(&self, matches: &mut clap::ArgMatches, name: &str) -> Self::Aggregate;
}

And one possible implementation would be:

struct OptionalValues<T>(PhantomData<T>); // goes with `Option<Vec<T>>`
impl<T> AggregateParser for OptionalValues<T> {
  type Aggregate = Option<Vec<T>>;

  fn modify_arg(arg: clap::Arg) -> clap::Arg {
    if arg.is_positional() {
      arg.num_args(1..)
    } else {
      arg
    }
  }

  fn get(matches: &mut clap::ArgMatches, name: &str) -> Self::Aggregate {
    matches.remove_many::<T>(name).map(|v| v.map(Iterator::collect))
  }
}

This could potentially use GATs, so that it isn't necessary to include type parameters to the type itself.

Expected Example Code

// 1. different arguments
#[arg(long, optional)]
a: Option<u32>,
#[arg(long, flag)]
b: bool,
#[arg(long, occurrences)]
c: Vec<Vec<String>>,
#[arg(long, optional_option)]
d: Option<Option<String>>,
// etc.

// 2. enumeration
#[arg(long, aggregation = AggregationType::Optional)]
a: Option<u32>,
#[arg(long, aggregation = AggregationType::Flag)]
b: bool,
#[arg(long, aggregation = AggregationType::Occurrences)]
c: Vec<Vec<String>>,
// etc.

// 3. trait
#[arg(long, aggregation = Optional<u32>)]
a: Option<u32>,
#[arg(long, aggregation = Flag)]
b: bool,
#[arg(long, aggregation = Occurrences<String>)]
c: Vec<Vec<String>>,

// 4. single arg
#[arg(long, magic_types)]
a: Vec<Vec<String>>

Support parsing a macro invocation inside the struct definition

This might look like:

optional!(T) is equivalent to the current Option<T>
values!(T) is equivalent to the current Vec<T>
flag!() is equivalent to the current bool
occurrences!(T) would expand to Vec<Vec<T>> and opt in to getting a vec of values for each occurrence.
either have optional_vec!(T) and optional_occurrences!(T) or allow optional!(values!(T)) and optional!(occurrences!(T)).
Maybe for completeness also have required!(T) which is the same as T but would make the option required?

These all use the Value type T, but it could also use the final type instead (for example: occurrences!(Vec<Vec<T>>))

Custom macros

This method could potentially support custom behavior by delegating to actual macro invocations in the generated impl. If the macro used looked like example!(T) then the generated code would have the following:

When implementing CommandFactory invoking the macro like: example!(T, arg: arg_value) where arg_value is the Arg for the field, and it returns a new Arg to actually use, so that it can make modifications to the argument.
The type of the field in the struct itself would be example!(T)
When implementing FromArgMatches, to get the value of the field, invoke example!(T, matches: matches_val, name) where matches_val is an ArgMatches and name is the name of the option to extract.

Note the similarity between this and the custom trait in the section on using an attribute

Expected Example Code

#[arg(long)]
a: optional!(u32),
#[arg(long)]
b: flag!(),
#[arg(long)]
c:  occurrences!(String),
#[arg(long, optional_option)]
d: optional_option!(String),

Use newtypes

Wrapper types might be the least confusing, since they would be clearly intended for the purpose of communicating how the field should be parsed. However, they are probably the least ergonomic to use, since you would often have to unwrap the value. Although... I'm not sure how to determine for sure that the path actually points to the right type, and not a user-defined type with the same name, without require the type to be fully specified.

From @epage:

One option would be to use deref specialization. We could move some (or all?) of the type handling to a macro that takes in a field's type and uses deref specialization to find the right Arg policy type which would have a function to apply that policy to an Arg

Another possibility is we could have the newtypes implement a trait similar to the one defined above in User-extensible trait (but with Self instead of Self::Aggregate) for the newtypes. Tha could also potentially make it user-extensible, but we still have the question of how to identify if a type is one of these types and extract the type for the value parser. Perhaps it could be combined witht he autoref specialized macro @epage mentioned above? Or use an attribute. Probably if it is user-extensible we would require that it has 0 or 1 type parameters, and if it is 1, use that to infer the type for th value parser. Or maybe we could use an associated type?

Expected Example Code

#[arg(long)]
a: Optional<u32>,
#[arg(long)]
b: Flag,
#[arg(long, occurrences)]
c: Occurrences<String>,
#[arg(long)]
d: OptionalOption<String>,

Footnote:
I haven't put a lot of thought into the specific names for types, macros, attribute arguments etc. We can hash out better names for those once we decide on the general direction.

The text was updated successfully, but these errors were encountered:

epage · 2023-01-11T13:22:11Z

I've added a requirements section and changed the headers so it was easier to see what the top-level options are.

epage · 2023-01-11T13:22:38Z

Could you add "expected example code" to each section for what the common case and the exception cases would be for user code?

This let's you get an arguments values, grouped by the occurrence of the argument. Note: this does not stablize derive support. That requires a blocking change and can be enabled via `unstable-v5` flag. See clap-rs#4626 for an exploration of how we can make this easier in the future. Fixes clap-rs#2924

valkum · 2023-02-13T18:24:00Z

I was pointed here from #3114. I guess I came up with a rough idea that is basically approach 3 here but in broader aspect. Instead of passing a custom trait, I proposed adding a new argument to disable the generation of the FromArgMatches impl, allowing one to implement FromArgMatches for the specific collection in user code.
Is this something you thought of @tmccombs ? If not, as you thought about different approaches, what are your thoughts on this solution?

tmccombs · 2023-02-14T07:46:44Z

Are you suggesting having a way to derive CommandFactory without deriving FromArgMatches? I could see there being cases where that might be useful, but that removes a lot of the usefulness of clap_derive IMO. And I don't really see how it solves the original problem. Namely that there are additional types, such as Vec<Vec<T>> and [T; N] that we would like clap_derive to treat specially.

valkum · 2023-02-14T14:44:01Z

That was the initial idea. I guess it more related to custom containers than to support Vec<Vec<T>>. Currently, you need to implement CommandFactory, Subcommand, or Args yourself if you need your own FromArgMatches implementation.
I guess both problems can best benefit from approach 3.
For the types above, clap provides default aggregators and for custom types you can impl a trait yourself. Similar to serde's (de)serialize_with attribute.

The other options seem limiting regarding custom container support.

tgockel · 2023-06-15T20:25:09Z

Just adding a somewhat minimal real-world example of where this is useful, imagine I want to use a letter to represent a list (this came up in the context of logging):

use clap::Parser;

#[derive(Debug)]
struct MyError;

impl std::fmt::Display for MyError {
    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
        write!(f, "MyError")
    }
}

impl std::error::Error for MyError { }

#[derive(Clone, Copy, Debug)]
enum Level {
    Debug,
    Info,
    Warning,
    Error,
}

impl Level {
    pub fn from_str(src: &str) -> Result<Vec<Level>, MyError> {
        let mut out = Vec::with_capacity(src.len());
        for c in src.chars() {
            match Self::try_from(c) {
                Ok(x) => out.push(x),
                Err(e) => return Err(e),
            }
        }
        Ok(out)
    }
}

impl TryFrom<char> for Level {
    type Error = MyError;

    fn try_from(value: char) -> Result<Self, MyError> {
        match value {
            'd' => Ok(Self::Debug),
            'i' => Ok(Self::Info),
            'w' => Ok(Self::Warning),
            'e' => Ok(Self::Error),
            _ => Err(MyError),
        }
    }
}


#[derive(Debug, Parser)]
#[command()]
struct Command {
    #[arg(long, value_parser = Level::from_str)]
    pub levels: Vec<Level>,
}

fn main() {
    let cmd = Command::parse();
    println!("Hello --> {:?}", cmd);
}

Running with --levels we fails at run time with the error #4830, since value_parser = Level::from_str applies to the elements, not the whole Vec<Level> as intended. Workaround is to denote Command::levels as the long-form std::vec::Vec<Level>. Functional, but probably not the ideal user experience.

Different way to workaround clap-rs/clap#4626.

epage added C-enhancement Category: Raise on the bar on expectations A-derive Area: #[derive]` macro API S-waiting-on-design Status: Waiting on user-facing design to be resolved before implementing labels Jan 11, 2023

This was referenced Jan 13, 2023

feat: Stablize ArgMatches::get_occurrences #4635

Merged

Could not downcast to type alias #4636

Closed

epage mentioned this issue Jan 30, 2023

value_parser that should return Vec<u8> panics with Mismatch between definition and access of Payload. Could not downcast to u8, need to downcast to alloc::vec::Vec<u8> #4481

Closed

2 tasks

epage mentioned this issue Feb 6, 2023

Ranged values for Vec<usize> or any number Vec type like 1-10 or 1..10 #4679

Closed

2 tasks

epage changed the title ~~Proposal: more extensible pattern for clap derive field types~~ Improve how we infer behavior from field types Feb 10, 2023

epage mentioned this issue Feb 13, 2023

Custom collections #3114

Open

epage mentioned this issue Mar 29, 2023

TypedValueParser cannot return single Vec<String> #4808

Closed

2 tasks

epage mentioned this issue Jun 27, 2023

Support parsing smart pointers #4989

Open

2 tasks

sveitser mentioned this issue Jul 4, 2023

Basic Discord Bot EspressoSystems/espresso-polygon-zkevm-demo#53

Merged

4 tasks

epage mentioned this issue Jul 19, 2023

derive(Parser) fails when using type alias #5019

Closed

2 tasks

This comment was marked as off-topic.

Sign in to view

epage mentioned this issue Aug 18, 2023

Option arg cannot have None as default value #5078

Closed

2 tasks

epage mentioned this issue Oct 9, 2023

RFC: Adding Support for Flag Arguments Without Values #5166

Closed

2 tasks

koxu1996 added a commit to cspr-rad/kairos that referenced this issue Feb 13, 2024

Replace type alias with absolute path import.

5480acf

Different way to workaround clap-rs/clap#4626.

koxu1996 mentioned this issue Feb 13, 2024

Base CLI parsing cspr-rad/kairos#17

Merged

koxu1996 added a commit to cspr-rad/kairos that referenced this issue Feb 26, 2024

Replace type alias with absolute path import.

0bc00c6

Different way to workaround clap-rs/clap#4626.

epage mentioned this issue Apr 16, 2024

clap_derive: support tuples as shortcut for number_of_values = N, different validators #1717

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve how we infer behavior from field types #4626

Improve how we infer behavior from field types #4626

tmccombs commented Jan 11, 2023 •

edited

epage commented Jan 11, 2023

epage commented Jan 11, 2023

valkum commented Feb 13, 2023

tmccombs commented Feb 14, 2023

valkum commented Feb 14, 2023

tgockel commented Jun 15, 2023

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

Improve how we infer behavior from field types #4626

Improve how we infer behavior from field types #4626

Comments

tmccombs commented Jan 11, 2023 • edited

Requirements

Options

Add semantics for additional types in a breaking change

Add an additional argument to the arg attribute

User-extensible trait

Expected Example Code

Support parsing a macro invocation inside the struct definition

Custom macros

Expected Example Code

Use newtypes

Expected Example Code

epage commented Jan 11, 2023

epage commented Jan 11, 2023

valkum commented Feb 13, 2023

tmccombs commented Feb 14, 2023

valkum commented Feb 14, 2023

tgockel commented Jun 15, 2023

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

tmccombs commented Jan 11, 2023 •

edited