Lexopt

Lexopt is an argument parser for Rust. It tries to have the simplest possible design that's still correct. It's so simple that it's a bit tedious to use.

Lexopt is:

  • Small: one file, no dependencies, no macros. Easy to audit or vendor.
  • Correct: standard conventions are supported and ambiguity is avoided. Tested and fuzzed.
  • Pedantic: arguments are returned as OsStrings, forcing you to convert them explicitly. This lets you handle badly-encoded filenames.
  • Imperative: options are returned as they are found, nothing is declared ahead of time.
  • Minimalist: only basic functionality is provided.
  • Unhelpful: there is no help generation and error messages often lack context.

Example

struct Args {
    thing: String,
    number: u32,
    shout: bool,
}

fn parse_args() -> Result<Args, lexopt::Error> {
    use lexopt::prelude::*;

    let mut thing = None;
    let mut number = 1;
    let mut shout = false;
    let mut parser = lexopt::Parser::from_env();
    while let Some(arg) = parser.next()? {
        match arg {
            Short('n') | Long("number") => {
                number = parser.value()?.parse()?;
            }
            Long("shout") => {
                shout = true;
            }
            Value(val) if thing.is_none() => {
                thing = Some(val.string()?);
            }
            Long("help") => {
                println!("Usage: hello [-n|--number=NUM] [--shout] THING");
                std::process::exit(0);
            }
            _ => return Err(arg.unexpected()),
        }
    }

    Ok(Args {
        thing: thing.ok_or("missing argument THING")?,
        number,
        shout,
    })
}

fn main() -> Result<(), lexopt::Error> {
    let args = parse_args()?;
    let mut message = format!("Hello {}", args.thing);
    if args.shout {
        message = message.to_uppercase();
    }
    for _ in 0..args.number {
        println!("{}", message);
    }
    Ok(())
}

Let's walk through this:

  • We start parsing with Parser::from_env().
  • We call parser.next() in a loop to get all the arguments until they run out.
  • We match on arguments. Short and Long indicate an option.
  • To get the value that belongs to an option (like 10 in -n 10) we call parser.value().
    • This returns a standard OsString.
      • For convenience, importing lexopt::prelude::* adds a .parse() method, analogous to the one on &str.
    • Calling parser.value() is how we tell Parser that -n takes a value at all.
  • Value indicates a free-standing argument.
    • The guard if thing.is_none() is a useful pattern for positional arguments. If we already found thing, the argument falls through to another case.
    • Value also contains an OsString.
      • The .string() method decodes it into a plain String.
  • If we don't know what to do with an argument we use return Err(arg.unexpected()) to turn it into an error message.
  • Strings can be promoted to errors for custom error messages, as in thing.ok_or("missing argument THING")?.
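
To make the explicit conversions more concrete, here is a rough sketch of what decoding and parsing an OsString involves, using only the standard library. The helpers os_to_string and os_parse are hypothetical names invented for this illustration. They are not lexopt's implementation of .string() and .parse(), and they use a plain String as the error type to keep the sketch short.

```rust
use std::ffi::OsString;
use std::str::FromStr;

/// Hypothetical stand-in for `.string()`: decode an OsString into a String,
/// handing the original value back as the error if it is not valid Unicode.
fn os_to_string(value: OsString) -> Result<String, OsString> {
    value.into_string()
}

/// Hypothetical stand-in for `.parse()`: decode first, then parse via FromStr.
/// The error is a plain String here, purely to keep the sketch small.
fn os_parse<T>(value: OsString) -> Result<T, String>
where
    T: FromStr,
    T::Err: std::fmt::Display,
{
    let text = value
        .into_string()
        .map_err(|raw| format!("invalid unicode in {:?}", raw))?;
    text.parse::<T>()
        .map_err(|err| format!("cannot parse {:?}: {}", text, err))
}
```

Calling os_parse::<u32>(OsString::from("10")) would give Ok(10), while a value such as "ten" produces an error. Both failure modes (bad encoding, bad number) have to be handled somewhere, and keeping arguments as OsStrings makes that step visible.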

This covers most of the functionality in the library. Lexopt does very little for you.

For a larger example with useful patterns, see examples/cargo.rs.

Command line syntax

The following conventions are supported:

  • Short options (-q)
  • Long options (--verbose)
  • -- to mark the end of options
  • = to separate options from values (--option=value, -o=value)
  • Spaces to separate options from values (--option value, -o value)
  • Unseparated short options (-ovalue)
  • Combined short options (-abc to mean -a -b -c)
  • Options with optional arguments (like GNU sed's -i, which can be used standalone or as -iSUFFIX) (Parser::optional_value())
  • Options with multiple arguments (Parser::values())

These are not supported out of the box:

  • Single-dash long options (like find's -name)
  • Abbreviated long options (GNU's getopt lets you write --num instead of --number if it can be expanded unambiguously)

Parser::raw_args() and Parser::try_raw_args() provide an escape hatch for consuming the original command line. This can be used for custom syntax, like treating -123 as a number instead of a string of options. See examples/nonstandard.rs for an example of this.
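
The general idea behind such custom syntax is to peek at the next raw argument and consume it only if it matches. A minimal sketch of that peek-then-consume pattern, written against a plain standard-library iterator rather than lexopt's raw-argument types, might look like this. The helper name take_dash_number is hypothetical, and examples/nonstandard.rs remains the reference for how to do this with the real API.

```rust
use std::ffi::OsString;
use std::iter::Peekable;

/// Consume the next raw argument only if it has the form -NUM (for example -123).
/// Hypothetical helper over a plain iterator; lexopt's raw-argument API is not used.
fn take_dash_number<I>(args: &mut Peekable<I>) -> Option<u64>
where
    I: Iterator<Item = OsString>,
{
    // Look at the next argument without consuming it.
    let text = args.peek()?.to_str()?;
    let digits = text.strip_prefix('-')?;
    // Reject anything but plain ASCII digits (this also rules out "-+5").
    if !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let number = digits.parse::<u64>().ok()?;
    args.next(); // consume only after the argument parsed successfully
    Some(number)
}
```

If the next argument is not a dash followed by digits, nothing is consumed and it can still be handled as an ordinary option or value.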

Unicode

This library supports unicode while tolerating non-unicode arguments.

Short options may be unicode, but only a single codepoint (a char).

Options can be combined with non-unicode arguments. That is, --option=��� will not cause an error or mangle the value.

Options themselves are patched as by String::from_utf8_lossy if they're not valid unicode. That typically means you'll raise an error later when they're not recognized.
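
As a small illustration of that patching, the sketch below calls String::from_utf8_lossy directly on raw bytes. The wrapper name patch_option_name is hypothetical and exists only for this example. It shows the standard library behavior that the sentence above refers to, not lexopt's internal code.

```rust
use std::borrow::Cow;

/// Decode raw option bytes the way String::from_utf8_lossy does:
/// valid input is borrowed unchanged, invalid sequences become U+FFFD.
/// (Hypothetical wrapper, named only for this example.)
fn patch_option_name(raw: &[u8]) -> Cow<'_, str> {
    String::from_utf8_lossy(raw)
}
```

A valid name such as b"--verbose" comes back unchanged, while b"--opt\xFFion" comes back as "--opt\u{FFFD}ion". The patched name no longer equals any option the program knows, so it ends up in the catch-all branch and is reported as unexpected.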

Why?

For a particular application I was looking for a small parser that's pedantically correct. There are other compact argument parsing libraries, but I couldn't find one that handled OsStrings and implemented all the fiddly details of the argument syntax faithfully.

This library may also be useful if a lot of control is desired, like when the exact argument order matters or not all options are known ahead of time. It could be considered more of a lexer than a parser.

Why not?

This library may not be worth using if:

  • You don't care about non-unicode arguments
  • You don't care about exact compliance and correctness
  • You don't care about code size
  • You do care about great error messages
  • You hate boilerplate

See also