Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow access to the parser during the read phase #12361

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

chrisjsewell
Copy link
Member

@chrisjsewell chrisjsewell commented May 7, 2024

The primary use case here is for within directives (also to a lesser extent roles).

Context:

To be fair to docutils, it never actually exposes directives as an extension point to users, this is done by sphinx (mainly in Sphinx.add_directive and Domain.directives)

The "problem" with this is that

  1. Directives are currently a pretty leaky abstraction; exposing a lot of the "internals" of the docutils rST parser, particularly when it comes to nested parsing or parsing from other sources (like docstrings).
  2. The docutils rst parser itself is not the greatest when it comes to things such as nested parsing, and also source mapped logging

This has started to lead to a proliferation of "hacks" into the rst parser, to try to overcom these shortcomings.
To name a few:

These are exceptionally difficult to reconcile withinin parser like Myst, or even alternative rST parsers

Allowing access to the parser would allow the possibility for this sort of code to be centralised and abstracted better on the parser implementation


Also somewhat related #12238 by @n-peugnet

Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle, I'm fine but does the publisher.parser object have a lot of references (meaning, is there any chance that we have circular references?)

@@ -485,6 +485,7 @@ def read_doc(self, docname: str, *, _cache: bool = True) -> None:
filename = self.env.doc2path(docname)
filetype = get_filetype(self.app.config.source_suffix, filename)
publisher = self.app.registry.get_publisher(self.app, filetype)
self.env.temp_data['parser'] = publisher.parser
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one might conflict an existing temp_data specified by the user somewhere else (unless temp_data is only meant for us). In particular, this requires a CHANGES entry.

I think we should improve the way we store temporary data and use namespaces to avoid conflicts.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the key to _parser, which should minimise the risk of conflict, and exposed it as a property of the BuildEnvironment, similar to docname

Also added a CHANGES entry

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should improve the way we store temporary data and use namespaces to avoid conflicts.

well thats a topic for another day 😅

@chrisjsewell chrisjsewell requested a review from picnixz May 10, 2024 10:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants