Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve how we infer behavior from field types #4626

Open
tmccombs opened this issue Jan 11, 2023 · 13 comments
Open

Improve how we infer behavior from field types #4626

tmccombs opened this issue Jan 11, 2023 · 13 comments
Labels
A-derive Area: #[derive]` macro API C-enhancement Category: Raise on the bar on expectations S-waiting-on-design Status: Waiting on user-facing design to be resolved before implementing

Comments

@tmccombs
Copy link
Contributor

tmccombs commented Jan 11, 2023

As brought up in #4600 and #3661, there are cases where it would be nice to support additional types with special semantics for clap_derive fields. However, doing so for any existing isn't backwards compatible, because users might have custom value parsers that produce those types. In addition, currently in order to opt out of special behavior for field types of Opt or Vec, you need to fully qualify the paths for those types (ex. ::std::vec::Vec), which is both undocumented, and non-obvious. It would be nice to have a consistent, easy to understand way to specify how 0 or more values for an option are aggregated into a single type.

In particular here are some cases that are currently difficult or impossible to include in a Parser derived using clap_derive:

  • Vec<Vec<T>> containing the values passed in, grouped by occurrence
  • [T; N], where num_args is set to N
  • Vec<[T; N]> would be a combination of the two above (with num_args set to N, but action set to Append)
  • Result<T, SomeError> for an optional argument (like Option<T>, but return some default error if missing)
  • Probably others

The current types with special behavior are described here. It should also be noted, that if you specify an action of Count with an integral field type, also sort of fits in this category.

Requirements

Options

Add semantics for additional types in a breaking change

This perhaps the most straightforward way to do it. But it has several downsides:

  1. It requires breaking changes, which might not be caught by the compiler
  2. It adds more "magic" to how types communicate information to the derive macro
  3. It increases the importance of being able to opt

Add an additional argument to the arg attribute

We could add one or more additional arguments to the arg attribute on fields.

There are a few ways this could be done:

  1. Have a different argument for each kind of semantics, such as optional, all_values, optional_values, occurrences, optional_occurrences, etc.
  2. Add a single new argument that is given an enumeration value that is effectively the same as the current clap_derive::utils::ty::Ty enum (although we might want to use more descriptive names)
  3. Add a single new argument that is passed an instance of a trait that specifies how to extract the value from the ArgMatches, see User-extensible trait
  4. Add a single new argument that opts in to treating all supported standard types specially.

1-3 have the downside that either the attribute must agree with the type of the field, or there has to be some magic conversion of the field type.

4 would make it difficult to express sitiutations Option<Vec<MyType>> where the value parser parses a Vec<MyType>, but the Option is because the option is optional. It also still has the backwards
compatibility problem if we add anything new in the future.

User-extensible trait

For 3 above, we can define a trait that encapsulates the wanted behavior, and give the derive macro an implementation of that trait. While it can be defined for standard types in clap_derive, it would
also be possible to allow users to create their own implementations, opening up additional ways to expand this functionality in other crates.

Note: bikeshed on the actual names

trait AggregateParser {
  type Aggregate;

  /// Make any necessary changes to the `Arg` before it is used in a `Command`
  fn modify_arg(&self, arg: clap::Arg) -> clap::Arg {
    arg
  }

  fn get(&self, matches: &mut clap::ArgMatches, name: &str) -> Self::Aggregate;
}

And one possible implementation would be:

struct OptionalValues<T>(PhantomData<T>); // goes with `Option<Vec<T>>`
impl<T> AggregateParser for OptionalValues<T> {
  type Aggregate = Option<Vec<T>>;

  fn modify_arg(arg: clap::Arg) -> clap::Arg {
    if arg.is_positional() {
      arg.num_args(1..)
    } else {
      arg
    }
  }

  fn get(matches: &mut clap::ArgMatches, name: &str) -> Self::Aggregate {
    matches.remove_many::<T>(name).map(|v| v.map(Iterator::collect))
  }
}

This could potentially use GATs, so that it isn't necessary to include type parameters to the type itself.

Expected Example Code

// 1. different arguments
#[arg(long, optional)]
a: Option<u32>,
#[arg(long, flag)]
b: bool,
#[arg(long, occurrences)]
c: Vec<Vec<String>>,
#[arg(long, optional_option)]
d: Option<Option<String>>,
// etc.

// 2. enumeration
#[arg(long, aggregation = AggregationType::Optional)]
a: Option<u32>,
#[arg(long, aggregation = AggregationType::Flag)]
b: bool,
#[arg(long, aggregation = AggregationType::Occurrences)]
c: Vec<Vec<String>>,
// etc.

// 3. trait
#[arg(long, aggregation = Optional<u32>)]
a: Option<u32>,
#[arg(long, aggregation = Flag)]
b: bool,
#[arg(long, aggregation = Occurrences<String>)]
c: Vec<Vec<String>>,

// 4. single arg
#[arg(long, magic_types)]
a: Vec<Vec<String>>

Support parsing a macro invocation inside the struct definition

This might look like:

  • optional!(T) is equivalent to the current Option<T>
  • values!(T) is equivalent to the current Vec<T>
  • flag!() is equivalent to the current bool
  • occurrences!(T) would expand to Vec<Vec<T>> and opt in to getting a vec of values for each occurrence.
  • either have optional_vec!(T) and optional_occurrences!(T) or allow optional!(values!(T)) and optional!(occurrences!(T)).
  • Maybe for completeness also have required!(T) which is the same as T but would make the option required?

These all use the Value type T, but it could also use the final type instead (for example: occurrences!(Vec<Vec<T>>))

Custom macros

This method could potentially support custom behavior by delegating to actual macro invocations in the generated impl. If the macro used looked like example!(T) then the generated code would have the following:

  • When implementing CommandFactory invoking the macro like: example!(T, arg: arg_value) where arg_value is the Arg for the field, and it returns a new Arg to actually use, so that it can make modifications to the argument.
  • The type of the field in the struct itself would be example!(T)
  • When implementing FromArgMatches, to get the value of the field, invoke example!(T, matches: matches_val, name) where matches_val is an ArgMatches and name is the name of the option to extract.

Note the similarity between this and the custom trait in the section on using an attribute

Expected Example Code

#[arg(long)]
a: optional!(u32),
#[arg(long)]
b: flag!(),
#[arg(long)]
c:  occurrences!(String),
#[arg(long, optional_option)]
d: optional_option!(String),

Use newtypes

Wrapper types might be the least confusing, since they would be clearly intended for the purpose of communicating how the field should be parsed. However, they are probably the least ergonomic to use, since you would often have to unwrap the value. Although... I'm not sure how to determine for sure that the path actually points to the right type, and not a user-defined type with the same name, without require the type to be fully specified.

From @epage:

One option would be to use deref specialization. We could move some (or all?) of the type handling to a macro that takes in a field's type and uses deref specialization to find the right Arg policy type which would have a function to apply that policy to an Arg

Another possibility is we could have the newtypes implement a trait similar to the one defined above in User-extensible trait (but with Self instead of Self::Aggregate) for the newtypes. Tha could also potentially make it user-extensible, but we still have the question of how to identify if a type is one of these types and extract the type for the value parser. Perhaps it could be combined witht he autoref specialized macro @epage mentioned above? Or use an attribute. Probably if it is user-extensible we would require that it has 0 or 1 type parameters, and if it is 1, use that to infer the type for th value parser. Or maybe we could use an associated type?

Expected Example Code

#[arg(long)]
a: Optional<u32>,
#[arg(long)]
b: Flag,
#[arg(long, occurrences)]
c: Occurrences<String>,
#[arg(long)]
d: OptionalOption<String>,

Footnote:
I haven't put a lot of thought into the specific names for types, macros, attribute arguments etc. We can hash out better names for those once we decide on the general direction.

@epage epage added C-enhancement Category: Raise on the bar on expectations A-derive Area: #[derive]` macro API S-waiting-on-design Status: Waiting on user-facing design to be resolved before implementing labels Jan 11, 2023
@epage
Copy link
Member

epage commented Jan 11, 2023

I've added a requirements section and changed the headers so it was easier to see what the top-level options are.

@epage
Copy link
Member

epage commented Jan 11, 2023

Could you add "expected example code" to each section for what the common case and the exception cases would be for user code?

epage added a commit to epage/clap that referenced this issue Jan 13, 2023
This let's you get an arguments values, grouped by the occurrence of the
argument.

Note: this does not stablize derive support.  That requires a blocking
change and can be enabled via `unstable-v5` flag.  See clap-rs#4626 for an
exploration of how we can make this easier in the future.

Fixes clap-rs#2924
@epage epage changed the title Proposal: more extensible pattern for clap derive field types Improve how we infer behavior from field types Feb 10, 2023
@valkum
Copy link

valkum commented Feb 13, 2023

I was pointed here from #3114. I guess I came up with a rough idea that is basically approach 3 here but in broader aspect. Instead of passing a custom trait, I proposed adding a new argument to disable the generation of the FromArgMatches impl, allowing one to implement FromArgMatches for the specific collection in user code.
Is this something you thought of @tmccombs ? If not, as you thought about different approaches, what are your thoughts on this solution?

@tmccombs
Copy link
Contributor Author

Are you suggesting having a way to derive CommandFactory without deriving FromArgMatches? I could see there being cases where that might be useful, but that removes a lot of the usefulness of clap_derive IMO. And I don't really see how it solves the original problem. Namely that there are additional types, such as Vec<Vec<T>> and [T; N] that we would like clap_derive to treat specially.

@valkum
Copy link

valkum commented Feb 14, 2023

That was the initial idea. I guess it more related to custom containers than to support Vec<Vec<T>>. Currently, you need to implement CommandFactory, Subcommand, or Args yourself if you need your own FromArgMatches implementation.
I guess both problems can best benefit from approach 3.
For the types above, clap provides default aggregators and for custom types you can impl a trait yourself. Similar to serde's (de)serialize_with attribute.

The other options seem limiting regarding custom container support.

@tgockel
Copy link

tgockel commented Jun 15, 2023

Just adding a somewhat minimal real-world example of where this is useful, imagine I want to use a letter to represent a list (this came up in the context of logging):

use clap::Parser;

#[derive(Debug)]
struct MyError;

impl std::fmt::Display for MyError {
    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
        write!(f, "MyError")
    }
}

impl std::error::Error for MyError { }

#[derive(Clone, Copy, Debug)]
enum Level {
    Debug,
    Info,
    Warning,
    Error,
}

impl Level {
    pub fn from_str(src: &str) -> Result<Vec<Level>, MyError> {
        let mut out = Vec::with_capacity(src.len());
        for c in src.chars() {
            match Self::try_from(c) {
                Ok(x) => out.push(x),
                Err(e) => return Err(e),
            }
        }
        Ok(out)
    }
}

impl TryFrom<char> for Level {
    type Error = MyError;

    fn try_from(value: char) -> Result<Self, MyError> {
        match value {
            'd' => Ok(Self::Debug),
            'i' => Ok(Self::Info),
            'w' => Ok(Self::Warning),
            'e' => Ok(Self::Error),
            _ => Err(MyError),
        }
    }
}


#[derive(Debug, Parser)]
#[command()]
struct Command {
    #[arg(long, value_parser = Level::from_str)]
    pub levels: Vec<Level>,
}

fn main() {
    let cmd = Command::parse();
    println!("Hello --> {:?}", cmd);
}

Running with --levels we fails at run time with the error #4830, since value_parser = Level::from_str applies to the elements, not the whole Vec<Level> as intended. Workaround is to denote Command::levels as the long-form std::vec::Vec<Level>. Functional, but probably not the ideal user experience.

@dzmitry-lahoda

This comment was marked as off-topic.

@epage

This comment was marked as off-topic.

@dzmitry-lahoda

This comment was marked as off-topic.

@dzmitry-lahoda

This comment was marked as off-topic.

@epage

This comment was marked as off-topic.

@dzmitry-lahoda

This comment was marked as off-topic.

@epage

This comment was marked as off-topic.

koxu1996 added a commit to cspr-rad/kairos that referenced this issue Feb 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-derive Area: #[derive]` macro API C-enhancement Category: Raise on the bar on expectations S-waiting-on-design Status: Waiting on user-facing design to be resolved before implementing
Projects
None yet
Development

No branches or pull requests

5 participants