Skip to content
This repository has been archived by the owner on Jan 1, 2022. It is now read-only.

Provide data in the application domain rather than the CLI domain #196

Open
4 tasks
epage opened this issue Dec 6, 2021 · 6 comments
Open
4 tasks

Provide data in the application domain rather than the CLI domain #196

epage opened this issue Dec 6, 2021 · 6 comments

Comments

@epage
Copy link
Owner

epage commented Dec 6, 2021

Issue by pksunkara
Friday Aug 13, 2021 at 01:05 GMT
Originally opened as clap-rs/clap#2683


Currently, clap's API is focused on the CLI domain, with data being tracked as OsString with specialized helpers for String. At the tale-end of the builder API, we provide the value_of_t API. We do gloss over this in the derive API.

Problems with the current approach

  • Bloat of functions in ArgMatches to handle every case
  • It is easy to use _os functions but disallow non-UTF8 during validation (fix: Provide path to avoid UTF-8 panics clap-rs/clap#2677)
  • Derive API parses data twice, once as part of validation and then when giving it to the user (#2206)
  • Most builder users won't bother with the extra development effort of validating like the derive API and instead will not get first class clap errors for their parse errors

See also clap-rs/clap#2832

Roll out

  1. Change the data structure for ArgMatches to store Box<Any + Send + Sync + 'static>
  2. Add a generic getter, gated behind a feature flag
  3. Add a builder function for setting a custom value, gated behind a feature flag
  4. Change the default parser from OsString to String, deprecating AllowInvalidUtf8, gated behind a feature flag
  5. Mark old API deprecated, gated behind a feature flag
  6. On next breaking release, remove deprecated APIs and remove feature gates

Other considerations / open questions

  • For multiple_values, users also have to set default_values, value_names, and value_hints, should we follow possible_values and avoid users passing in parallel arrays?
  • How do we handle default values?
    • Users will want to set their defaults according to their application data type
    • To show a default in help, we need it to be Display
    • clap_derive currently punts on this by providing both a default_value and default_value_t where the latter has to support Display and it gets forwarded to default_value, see feat(derive): Specify defaults by native expressions clap-rs/clap#2635
  • How many feature gates and when do we remove them?
    • The sooner its public, the better, for collecting feedback
    • The more people using it, the more impacted by subtle behavior changes (default parser)
      • We could put in hacks to make the old API work along side the new API and set the default parser according to AllowInvalidUtf8
  • How do we handle optional vs required, especially in light of #2505?
    • We have to deal with present / not-present, invalid cast of Any, etc.
  • For the parser, how do we balance flexibility and performance
    • We could make the parser a trait ValueParser and provide a struct BoxedValueParser that contains an enum for common function-pointer signatures (wrapped with converters for the error types) and people can explicitly construct a BoxedValueParser for their custom type
  • In designing the API, we should keep in mind that people are also wanting access to what authored a value (env, default, CLI) (TODO Find Issue)
    • Unsure how this feature would even fit into clap_derive
  • Long term, we will want to morph the ArgMatches API to be more map-like
    • #1206
    • Having a remove function would allow clap_derive to move values, instead of clone them.
  • Long term, we need to consider other type-related features besides default_value:
    • ValueHint: if we can detect and set a default for Path that would provide a better default experience
    • value_name for named arguments: the default (of --name <name>) is redundant. If we could provide type related defaults (e.g. PATH for PathBuf, NUM for numbers), it would also provide a better default experience
    • See Inferring more meaning from types for derives clap-rs/clap#2637 for prior discussion

Original

Basically, there are two things that need to be solved here.

  • We should be keeping the parsed values during validation stage of parsing instead of throwing them away
  • We should be storing the types directly in ArgMatches

I have to dig into the code to write a longer description of issues with the current behaviour and design.

  • #2505
  • #2206
  • #2298
  • #2437

See also clap-rs/clap#2832

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by epage
Saturday Oct 16, 2021 at 20:11 GMT


Sounds like this is related to clap-rs/clap#2832

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by epage
Monday Oct 18, 2021 at 22:25 GMT


@pksunkara just captured in the Issue my thoughts from not being able to sleep last night :) I'm sure you also have some and would love to hear and see how we can synthesize the two.

The only linked issue I'm seeing this directly help with is #2206. I thought it would help with #2505 but it sounds like we just repeat the same problem after exploring the idea.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by pksunkara
Monday Oct 18, 2021 at 22:55 GMT


Looks good to me. I would actually make this the main product objective for 4.0. I will add more if needed once I get into this context later (which I still have in my todo)

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by epage
Friday Nov 05, 2021 at 16:48 GMT


From clap-rs/clap#2832

@ssokolow

Hmm. My main concern there is that this line looks like it could easily become a footgun if not handled carefully:

Change the default parser from OsString to String, deprecating AllowInvalidUtf8, gated behind a feature flag.

Windows may make it difficult to get paths with un-paired surrogates, but I've found mojibake'd filenames to be easy on POSIX platforms... to the point where it led me to report a bug in Serde because a mojibake'd Japanese ID3 tag had put mojibake'd characters in media file names during auto-renaming and Serde doesn't complain if you want to put a PathBuf into a JSON file without explicitly specifying how to handle un-UTF8-able contents. (Was it ID3v1 that left the tag's encoding to be communicated out-of-band? I think Audacious Media Player actually has a browser-style encoding detector for that.)

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by epage
Friday Nov 05, 2021 at 16:50 GMT


Hmm. My main concern there is that this line looks like it could easily become a footgun if not handled carefully:

With the proposed API, if you want a path, you'd be specifying it as a PathBuf and we would parse from OsStr. Just if you didn't specify any custom type, we'd be choosing String by default which seems to be what people want in most cases.

@epage
Copy link
Owner Author

epage commented Dec 6, 2021

Comment by ssokolow
Friday Nov 05, 2021 at 17:05 GMT


Ahh. By "if not handled carefully", I meant "this could turn into a footgun for end users if clap doesn't handle it carefully", and that should fit the definition of "clap handling it carefully" well enough.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

1 participant