Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace napoleon.iterators by simpler stack implementation #9856

Merged
merged 1 commit into from Jun 26, 2022

Conversation

anntzer
Copy link
Contributor

@anntzer anntzer commented Nov 16, 2021

Subject: Replace sphinx.ext.napoleon.iterators by simpler stack implementation.

Feature or Bugfix

  • Refactoring

Purpose

The "peekable iterator" API is not actually needed by napoleon, as all
elements are known from the beginning, so _line_iter can be readily
replaced by a stack-like object.

This tightens the public API and makes it easier to extract napoleon for
vendoring independently of sphinx (in third-party packages such as
https://pypi.org/project/defopt/).

@@ -54,6 +53,32 @@
_SINGLETONS = ("None", "True", "False", "Ellipsis")


class _Stack:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure the "Stack" is the best name for this purpose. AFAIK, it's usually used for the LIFO collection. But the lines are not so.

Copy link
Contributor Author

@anntzer anntzer Nov 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's LIFO if you see peek()ing as taking the last element and then putting it back. I don't have a strong opinion about the name; the main point is that the data structure needed here is basically just a list (using python's ability to easily pop from the list's end). Alternatively, I could have used a collections.deque to implement the data structure (keeping the lines in logical order and using popleft()), but that doesn't buy us much, requires using a less well-known class, and direct indexing would be a bit less efficient. But I don't have a strong opinion here...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd vote for collections.deque.

A

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, done.

@@ -161,7 +186,7 @@ def __init__(self, docstring: Union[str, List[str]], config: SphinxConfig = None
lines = docstring.splitlines()
else:
lines = docstring
self._line_iter = modify_iter(lines, modifier=lambda s: s.rstrip())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agreed that no reason to evaluate the modifier lazily. So +1 for using peek_iter instead of modify_iter.

@anntzer
Copy link
Contributor Author

anntzer commented Dec 20, 2021

Kindly bumping. I can resolve #9856 (comment) by switching to collections.deque if you prefer; no strong opinion here.

@anntzer
Copy link
Contributor Author

anntzer commented Apr 8, 2022

Kindly bumping again. Or if this has no chance of ever going in (as a "refactoring" PR), I'd rather know it now and close this; either way is fine, just please let me know.

@anntzer anntzer force-pushed the unit branch 2 times, most recently from edb89ff to 1dfc18e Compare June 22, 2022 12:08
@AA-Turner AA-Turner changed the base branch from master to 5.x June 22, 2022 18:37
@AA-Turner
Copy link
Member

@anntzer I re-targeted to 5.x, please rebase your PR on the 5.x branch.

A

@anntzer
Copy link
Contributor Author

anntzer commented Jun 22, 2022

Done.

Copy link
Member

@AA-Turner AA-Turner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, two more things needed:

  • An entry in the list of deprecations
  • A test for the deprecation of the old iterator implementation.

A

@anntzer
Copy link
Contributor Author

anntzer commented Jun 25, 2022

Done :)

@AA-Turner
Copy link
Member

Thanks! Two of the lint jobs are failing it seems.

A

* - ``sphinx.ext.napoleon.iterators``
- 5.1
- 7.0
- ``pockets.iterators``
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is pockets.iterators? I think the chance of someone actually being impacted by this deprecation is vanishingly small, but wouldn't we recommend using the simpler method from this PR instead?

A

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the original upstream version (https://github.com/RobRuana/pockets/blob/master/pockets/iterators.py) that got vendored into sphinx. It should be a drop-in replacement, so it seems to be easier to point any (likely rare) third party users to that module.

The "peekable iterator" API is not actually needed by napoleon, as all
elements are known from the beginning, so `_line_iter` can be readily
replaced by a stack-like object (actually implemented with a deque here,
to keep the lines in their physical order).

This tightens the public API and makes it easier to extract napoleon for
vendoring independently of sphinx.
@anntzer
Copy link
Contributor Author

anntzer commented Jun 25, 2022

Hopefully fixed now.

@AA-Turner
Copy link
Member

Thanks Antony!

A

@AA-Turner AA-Turner changed the title Replace sphinx.ext.napoleon.iterators by simpler stack implementation. Replace napoleon.iterators by simpler stack implementation Jun 26, 2022
@AA-Turner AA-Turner merged commit 03c1e1b into sphinx-doc:5.x Jun 26, 2022
@anntzer anntzer deleted the unit branch June 26, 2022 15:41
@AA-Turner
Copy link
Member

@anntzer it seems the bug reports about this implementation boil down to that something is causing the deque to exhaust, and then we try and pop another item from it.

Is this behaviour that happened in the previous implementation but was silenced? I want to release 5.1.1 with a fix fairly soon -- reverting this PR is an option, but I'd prefer to fix in place if possible.

A

@AA-Turner
Copy link
Member

back references to #10613; #10701

@anntzer
Copy link
Contributor Author

anntzer commented Jul 24, 2022

Apologies for the breakage, I can repro the issue at #10701. A simple fix is to mimic more closely the old modify_iter API, having a method that raises StopIteration instead of ValueError:

diff --git i/sphinx/ext/napoleon/docstring.py w/sphinx/ext/napoleon/docstring.py
index 21523ffb4..68c3d4f54 100644
--- i/sphinx/ext/napoleon/docstring.py
+++ w/sphinx/ext/napoleon/docstring.py
@@ -46,7 +46,11 @@ _SINGLETONS = ("None", "True", "False", "Ellipsis")
 
 
 class Deque(collections.deque):
-    """A subclass of deque with an additional `.Deque.get` method."""
+    """
+    A subclass of deque that mimics ``pockets.iterators.modify_iter``.
+
+    The `.Deque.get` and `.Deque.next` methods are added.
+    """
 
     sentinel = object()
 
@@ -57,6 +61,12 @@ class Deque(collections.deque):
         """
         return self[n] if n < len(self) else self.sentinel
 
+    def next(self) -> Any:
+        if self:
+            return super().popleft()
+        else:
+            raise StopIteration
+
 
 def _convert_type_spec(_type: str, translations: Dict[str, str] = {}) -> str:
     """Convert type specification to reference in reST."""
@@ -240,7 +250,7 @@ class GoogleDocstring:
         line = self._lines.get(0)
         while(not self._is_section_break() and
               (not line or self._is_indented(line, indent))):
-            lines.append(self._lines.popleft())
+            lines.append(self._lines.next())
             line = self._lines.get(0)
         return lines
 
@@ -249,20 +259,20 @@ class GoogleDocstring:
         while (self._lines and
                self._lines.get(0) and
                not self._is_section_header()):
-            lines.append(self._lines.popleft())
+            lines.append(self._lines.next())
         return lines
 
     def _consume_empty(self) -> List[str]:
         lines = []
         line = self._lines.get(0)
         while self._lines and not line:
-            lines.append(self._lines.popleft())
+            lines.append(self._lines.next())
             line = self._lines.get(0)
         return lines
 
     def _consume_field(self, parse_type: bool = True, prefer_type: bool = False
                        ) -> Tuple[str, str, List[str]]:
-        line = self._lines.popleft()
+        line = self._lines.next()
 
         before, colon, after = self._partition_field_on_colon(line)
         _name, _type, _desc = before, '', after
@@ -300,7 +310,7 @@ class GoogleDocstring:
         return fields
 
     def _consume_inline_attribute(self) -> Tuple[str, List[str]]:
-        line = self._lines.popleft()
+        line = self._lines.next()
         _type, colon, _desc = self._partition_field_on_colon(line)
         if not colon or not _desc:
             _type, _desc = _desc, _type
@@ -338,7 +348,7 @@ class GoogleDocstring:
         return lines
 
     def _consume_section_header(self) -> str:
-        section = self._lines.popleft()
+        section = self._lines.next()
         stripped_section = section.strip(':')
         if stripped_section.lower() in self._sections:
             section = stripped_section
@@ -347,14 +357,14 @@ class GoogleDocstring:
     def _consume_to_end(self) -> List[str]:
         lines = []
         while self._lines:
-            lines.append(self._lines.popleft())
+            lines.append(self._lines.next())
         return lines
 
     def _consume_to_next_section(self) -> List[str]:
         self._consume_empty()
         lines = []
         while not self._is_section_break():
-            lines.append(self._lines.popleft())
+            lines.append(self._lines.next())
         return lines + self._consume_empty()
 
     def _dedent(self, lines: List[str], full: bool = False) -> List[str]:
@@ -884,7 +894,7 @@ def _recombine_set_tokens(tokens: List[str]) -> List[str]:
         previous_token = None
         while True:
             try:
-                token = tokens.popleft()
+                token = tokens.next()
             except IndexError:
                 break
 
@@ -918,7 +928,7 @@ def _recombine_set_tokens(tokens: List[str]) -> List[str]:
     def combine_set(tokens):
         while True:
             try:
-                token = tokens.popleft()
+                token = tokens.next()
             except IndexError:
                 break
 
@@ -1170,7 +1180,7 @@ class NumpyDocstring(GoogleDocstring):
 
     def _consume_field(self, parse_type: bool = True, prefer_type: bool = False
                        ) -> Tuple[str, str, List[str]]:
-        line = self._lines.popleft()
+        line = self._lines.next()
         if parse_type:
             _name, _, _type = self._partition_field_on_colon(line)
         else:
@@ -1201,10 +1211,10 @@ class NumpyDocstring(GoogleDocstring):
         return self._consume_fields(prefer_type=True)
 
     def _consume_section_header(self) -> str:
-        section = self._lines.popleft()
+        section = self._lines.next()
         if not _directive_regex.match(section):
             # Consume the header underline
-            self._lines.popleft()
+            self._lines.next()
         return section
 
     def _is_section_break(self) -> bool:

It's a bit annoying because it'd be nice to know what's the place that's implicitly catching a StopIteration (likely in a for loop), but it's not so urgent to figure that out either.
Locally, this fixes #10701. How does that look to you? I can make a PR out of this patch if that looks good to you (or you can even take up the patch yourself if you want to go even faster).

@vfazio
Copy link

vfazio commented Jul 25, 2022

similar to the above a quick hack to Deque seemed to address the build for me:

    def popleft(self) -> Any:
        try:
            return super().popleft()
        except IndexError:
            raise StopIteration

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants