Improve Requirement/Marker parser with context-sensitive tokenisation #624

Merged: 23 commits, Dec 7, 2022

Commits on Dec 5, 2022

  1. Convert Token into a dataclass

    This makes it a fully-fleshed-out class for holding data.
    pradyunsg committed Dec 5, 2022
    2ceccfc
  2. Convert parser exception into a rich exception class

    This also pulls out the error message formatting logic into the error
    itself.
    pradyunsg committed Dec 5, 2022
    a25e85f
  3. Use a richer type for Tokenizer.rules

    This helps pyright better understand what's happening.
    pradyunsg committed Dec 5, 2022
    09f31ff
  4. Provide dedicated parse_{requirement,marker}(str) functions

    These provide consistent call signatures into the parser. This also
    decouples the tokenizer from the `Marker` class.
    pradyunsg committed Dec 5, 2022
    1c930f1
  5. Rename req to parsed in Requirement.__init__

    This makes it easier to read through the function, with a clearer name.
    pradyunsg committed Dec 5, 2022
    650c7c6
  6. Rename parser's Requirement to ParsedRequirement

    This draws a clear distinction between this and the user-visible
    `Requirement` object.
    pradyunsg committed Dec 5, 2022
    282b4e1
  7. Rework the parser with context-sensitive tokenisation

    This reduces how many regex patterns would be matched against the input
    while also enabling the parser to resolve ambiguity in-place.
    pradyunsg committed Dec 5, 2022
    07bf6f4
  8. Parse markers inline when parsing requirements

    This allows for nicer error messages, which show the entire requirement
    string and highlight the marker in particular.
    pradyunsg committed Dec 5, 2022
    c6baf52
  9. Factor out parsing semicolon-marker for requirements

    This eliminates a point of duplication and ensures that the error
    messaging is consistent.
    pradyunsg committed Dec 5, 2022
    6b2f3de
  10. Tweak the presentation of ParserSyntaxError spans

    This makes it easier for the reader to identify which position the
    parser was checking, relative to the surrounding context.
    pradyunsg committed Dec 5, 2022
    177e9ff
  11. Make URLs match "not whitespace"

    This is more permissive and better handles tabs used as whitespace.
    pradyunsg committed Dec 5, 2022
    1c3f900
  12. Update IDENTIFIER to match PEP 508's stipulated syntax

    This follows what PEP 508's grammar says is a valid identifier.
    pradyunsg committed Dec 5, 2022
    4a4d835
  13. Make arbitrary version matching accept what LegacySpecifier did

    This makes it possible for the arbitrary matches to be used within
    requirement specifiers without special constraints.
    pradyunsg committed Dec 5, 2022
    39ae524
  14. Better reflect what is optional within specifier/version_many

    This makes it clearer in the docstring grammars that a name without
    any specifier is valid.
    pradyunsg committed Dec 5, 2022
    97e7649
  15. Flatten nested ifs into if-elif

    This makes the control flow slightly easier to understand.
    pradyunsg committed Dec 5, 2022
    92b9545
  16. Rewrite test suite for requirements parsing

    This now exercises more edge cases and validates that the error messages
    are well-formed.
    pradyunsg committed Dec 5, 2022
    3a7cdb6
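
The ideas behind commits 1, 2 and 7 in the list above (a dataclass token, an exception that formats its own message, and a tokenizer that only tries the rule the parser asks for) can be sketched roughly as follows. This is a simplified illustration, not the code from this pull request: `SketchSyntaxError`, `RULES`, `parse_pin` and the regex patterns are hypothetical stand-ins, and the grammar only covers `name OP version`.

```python
import re
from dataclasses import dataclass
from typing import Dict, Optional, Pattern, Tuple


@dataclass
class Token:
    # A plain data holder: which rule matched, what text, and where.
    name: str
    text: str
    position: int


class SketchSyntaxError(Exception):
    """Hypothetical error that owns its message formatting."""

    def __init__(self, message: str, source: str, span: Tuple[int, int]) -> None:
        self.message = message
        self.source = source
        self.span = span
        super().__init__()

    def __str__(self) -> str:
        # Show the whole source and point at the position being checked.
        start, end = self.span
        marker = " " * start + "~" * (end - start) + "^"
        return "\n    ".join([self.message, self.source, marker])


# Hypothetical, deliberately tiny rule set. Note that "2.8" matches both
# IDENTIFIER and VERSION; the parser's context decides which one is tried.
RULES: Dict[str, Pattern[str]] = {
    "IDENTIFIER": re.compile(r"[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?"),
    "OP": re.compile(r"(===|==|~=|!=|<=|>=|<|>)"),
    "VERSION": re.compile(r"[0-9]+(\.[0-9]+)*"),
    "WS": re.compile(r"[ \t]+"),
    "END": re.compile(r"$"),
}


class Tokenizer:
    def __init__(self, source: str) -> None:
        self.source = source
        self.position = 0

    def check(self, name: str) -> Optional[Token]:
        # Only the requested rule is matched at the current position.
        match = RULES[name].match(self.source, self.position)
        if match is None:
            return None
        return Token(name, match.group(0), self.position)

    def expect(self, name: str, expected: str) -> Token:
        # Consume the requested token or fail with a positioned error.
        token = self.check(name)
        if token is None:
            span = (self.position, self.position)
            raise SketchSyntaxError(f"Expected {expected}", self.source, span)
        self.position += len(token.text)
        return token

    def skip_whitespace(self) -> None:
        token = self.check("WS")
        if token is not None:
            self.position += len(token.text)


def parse_pin(source: str) -> Tuple[str, str, str]:
    """Parse `name OP version`, for example "requests >= 2.8"."""
    tokenizer = Tokenizer(source)
    name = tokenizer.expect("IDENTIFIER", "a package name")
    tokenizer.skip_whitespace()
    op = tokenizer.expect("OP", "a comparison operator")
    tokenizer.skip_whitespace()
    version = tokenizer.expect("VERSION", "a version number")
    tokenizer.skip_whitespace()
    tokenizer.expect("END", "end of input")
    return name.text, op.text, version.text
```

Because the parser names the rule it wants at each step, a string such as `2.8` is never tested against `IDENTIFIER` once an operator has been read, which is one way to read the "resolve ambiguity in-place" remark in commit 7.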

Commits on Dec 6, 2022

  1. Improve error message for bad version specifiers in Requirement

    This makes it easier to understand what the state of the parser is and
    what is expected at that point.
    pradyunsg committed Dec 6, 2022
    0399eaf
  2. Add ParserSyntaxError as the cause of Invalid{Requirement/Marker}

    This ensures that these error tracebacks correctly describe the
    causality between the two errors.
    pradyunsg committed Dec 6, 2022
    83aae66
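
The causality described in the second Dec 6 commit corresponds to Python's `raise ... from ...` syntax, which sets `__cause__` on the outer exception. A minimal sketch, assuming stand-in class bodies and a hypothetical `_parse` helper and `make_requirement` wrapper (only the exception names come from the commit titles):

```python
class ParserSyntaxError(Exception):
    """Stand-in for the parser-level error named in the commit title."""


class InvalidRequirement(ValueError):
    """Stand-in for the user-facing error named in the commit title."""


def _parse(source: str) -> str:
    # Hypothetical parser: rejects empty input, returns the name otherwise.
    if not source.strip():
        raise ParserSyntaxError("Expected package name at the start of input")
    return source.strip()


def make_requirement(source: str) -> str:
    try:
        return _parse(source)
    except ParserSyntaxError as error:
        # `from error` records the parser error as the explicit cause, so the
        # traceback reads "The above exception was the direct cause of ...".
        raise InvalidRequirement(str(error)) from error
```

Callers that catch `InvalidRequirement` can then inspect `__cause__` to get at the underlying parser error.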

Commits on Dec 7, 2022

  1. Permit whitespace around marker_atom

    This ensures that a marker with whitespace around it is parsed
    correctly.
    pradyunsg committed Dec 7, 2022
    163993a
  2. Rename marker_expr to marker

    This is better aligned with the naming from PEP 508.
    pradyunsg committed Dec 7, 2022
    fa4b69d
  3. Enforce word boundaries in operators and names

    This ensures that these are only parsed when they're independent words.
    pradyunsg committed Dec 7, 2022
    ff75da7
  4. Fix a typo in an error message

    The listed operators were incorrect.
    pradyunsg committed Dec 7, 2022
    4945856
  5. Permit arbitrary whitespace around version specifiers in parentheses

    This is more consistent with the rest of the format, which is largely
    whitespace-agnostic.
    pradyunsg committed Dec 7, 2022
    7869a1a
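
The word-boundary change in the third Dec 7 commit can be illustrated with the regex `\b` anchor. The patterns below are hypothetical examples, not the ones used in this pull request; they only show why an operator or name pattern without boundaries can match the prefix of a longer word.

```python
import re

# Hypothetical patterns for a boolean operator and a marker variable name.
LOOSE_BOOLOP = re.compile(r"(or|and)")
STRICT_BOOLOP = re.compile(r"\b(or|and)\b")
LOOSE_NAME = re.compile(r"(os_name|python_version)")
STRICT_NAME = re.compile(r"\b(os_name|python_version)\b")


def matches(pattern: "re.Pattern[str]", text: str) -> bool:
    """Return True if `pattern` matches at the start of `text`."""
    return pattern.match(text) is not None


# Without boundaries, "or" is found at the start of "orange".
loose_hit = matches(LOOSE_BOOLOP, "orange")
# With boundaries, "or" must be an independent word.
strict_hit = matches(STRICT_BOOLOP, "orange")
strict_word = matches(STRICT_BOOLOP, "or os_name == 'nt'")
```

The same reasoning applies to names: `LOOSE_NAME` accepts the start of `os_name_extra`, while `STRICT_NAME` rejects it because `_` is a word character and no boundary follows `os_name`.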