Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handwritten parser for parsing requirements #484

Merged
merged 6 commits into from Jul 31, 2022
Merged

Conversation

hrnciar
Copy link
Contributor

@hrnciar hrnciar commented Nov 3, 2021

Hello,

I wrote a parser for packaging that should allow us to remove the dependency on pyparsing.
I am opening this as a draft as this is my first PR into packaging and I would love to hear feedback from you on what to improve in my PR.

Thank you.

Fixes: #468

@brettcannon
Copy link
Member

Two quick things (I haven't really reviewed any of the code).

One, looks like there are some CI failures.

Two, do we want packaging.parser and packaging.tokenizer to be public? They strike me as implementation details, so we would probably prefer to keep them as private (i.e. leading underscore in the module names).

@pradyunsg
Copy link
Member

+1 to both the things that Brett said.

@pradyunsg
Copy link
Member

That CI failures seem to be because of the use of := (sorry, you can't have nice things like that here) and because a few branches of the parser aren't covered in tests.

packaging/parser.py Outdated Show resolved Hide resolved
packaging/requirements.py Outdated Show resolved Hide resolved
packaging/requirements.py Outdated Show resolved Hide resolved
@pradyunsg
Copy link
Member

All that said, this is super cool and I'm very excited to see this progressing! Thanks for picking this up @hrnciar! ^>^

@hrnciar hrnciar force-pushed the parser branch 3 times, most recently from fe754f3 to 3062470 Compare November 5, 2021 15:22
packaging/_parser.py Outdated Show resolved Hide resolved
packaging/_parser.py Outdated Show resolved Hide resolved
@pradyunsg
Copy link
Member

Pulling this out of draft status, given there is a review request.

@pradyunsg pradyunsg marked this pull request as ready for review November 13, 2021 09:25
Copy link
Member

@pradyunsg pradyunsg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty much a desk review -- I haven't cloned and poked at this yet; I'll do that once the linters are happy with this PR.

I don't think I have anything to say about the parsing logic other than that it looks correct, and that CI being happy is a good sign. :)

I think we can/should bundle this change with the legacy version removal, which... I'll pick back up once this lands.

packaging/_parser.py Outdated Show resolved Hide resolved
packaging/_parser.py Outdated Show resolved Hide resolved
packaging/_tokenizer.py Outdated Show resolved Hide resolved
packaging/_tokenizer.py Outdated Show resolved Hide resolved
packaging/_tokenizer.py Outdated Show resolved Hide resolved
packaging/_tokenizer.py Outdated Show resolved Hide resolved
packaging/_tokenizer.py Outdated Show resolved Hide resolved
packaging/_parser.py Outdated Show resolved Hide resolved
@pradyunsg
Copy link
Member

A gentle ping.

Comment on lines 35 to 36
MarkerType = Tuple[Union[Variable, Value], Op, Union[Variable, Value]]
# MarkerAtom = Union[MarkerType, List["MarkerAtom"]]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These names are a bit confusing. Perhaps use MarkerAtom, MarkerItem, MarkerVar and MarkerList, corresponding with their parse rules?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized that I forget to rename the names just after I pushed it. I took into account your latest comment and pushed a new change, PTAL.

packaging/_parser.py Outdated Show resolved Hide resolved
@pradyunsg
Copy link
Member

(FYI: The CI isn't running due to merge conflicts with main)

Comment on lines 124 to 141
def raise_syntax_error(self, *, message: str) -> NoReturn:
"""Raise SyntaxError at the given position in the marker"""
at = f"at position {self.position}:"
marker = " " * self.position + "^"
raise ParseExceptionError(
f"{message}\n{at}\n {self.source}\n {marker}",
self.position,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like this! :)

@pradyunsg pradyunsg mentioned this pull request Jul 8, 2022
Copy link
Member

@brettcannon brettcannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the few things that Black can't check syntactically, the docstrings unfortunately don't match the style in the rest of the project (both the newline structure and the sentences need to end with proper punctuation).

@hrnciar hrnciar force-pushed the parser branch 3 times, most recently from 5a961ae to 4f52bac Compare July 11, 2022 15:25
@hrnciar
Copy link
Contributor Author

hrnciar commented Jul 11, 2022

I've addressed some remarks, but there is still some stuff to do. I am heading to EuroPython tomorrow so I will continue next week when I am back. If anyone would like to meet in Dublin and have a chat IRL feel free to ping me :).


def parse_named_requirement(requirement: str) -> Requirement:
"""
named_requirement: IDENTIFIER extras (URL_SPEC | specifier) (END | SEMICOLON marker_expr END)?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
named_requirement: IDENTIFIER extras (URL_SPEC | specifier) (END | SEMICOLON marker_expr END)?
named_requirement: IDENTIFIER extras (URL_SPEC | specifier) (SEMICOLON marker_expr )? END

Copy link
Member

@pradyunsg pradyunsg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, assuming @encukou's suggestion is applied!

And... you pacify flake8 -- in whatever manner you deem appropriate.

@hrnciar hrnciar force-pushed the parser branch 3 times, most recently from 6c8ed42 to a29816f Compare July 19, 2022 10:38
@hrnciar
Copy link
Contributor Author

hrnciar commented Jul 19, 2022

I've improved the docstrings and fixed a couple of nitpicks. Unless there is something else to improve, I think, we are done.

@hrnciar
Copy link
Contributor Author

hrnciar commented Jul 28, 2022

@brettcannon could you please do a review? Is there something specific I should improve? I'd like to finish this PR. Thank you.

@brettcannon
Copy link
Member

@hrnciar it's in my review queue, but if @pradyunsg wants to merge w/o a review for me then I am totally fine with that.

@pradyunsg pradyunsg merged commit 2e5593c into pypa:main Jul 31, 2022
@hroncok
Copy link
Contributor

hroncok commented Aug 4, 2022

Changelog entry: #581

@ofek
Copy link
Sponsor Contributor

ofek commented Sep 3, 2022

This is great for CLIs too 🙂 pypa/hatch#449 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Hand written parser for PEP 508 to drop dependency on pyparsing
8 participants