Some Sphinx inventories cannot be parsed #496

Closed
lmichaelis opened this issue Nov 28, 2022 · 0 comments · Fixed by #497
Labels
unconfirmed This bug was not reproduced yet

lmichaelis commented Nov 28, 2022

Describe the bug

Some Sphinx inventories (see below) fail to parse, producing the following traceback:

Traceback
Traceback (most recent call last):
  File "C:\Users\lmichaelis\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\lmichaelis\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\lmichaelis\xxx\.venv\Scripts\mkdocs.exe\__main__.py", line 7, in <module>
  File "c:\users\lmichaelis\xxx\.venv\lib\site-packages\click\core.py", line 1126, in __call__
    return self.main(*args, **kwargs)
  File "c:\users\lmichaelis\xxx\.venv\lib\site-packages\click\core.py", line 1051, in main
    rv = self.invoke(ctx)
  File "c:\users\lmichaelis\xxx\.venv\lib\site-packages\click\core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "c:\users\lmichaelis\xxx\.venv\lib\site-packages\click\core.py", line 1393, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "c:\users\lmichaelis\xxx\.venv\lib\site-packages\click\core.py", line 752, in invoke
    return __callback(*args, **kwargs)
  File "c:\users\lmichaelis\xxx\.venv\lib\site-packages\mkdocs\__main__.py", line 250, in build_command
    build.build(cfg, dirty=not clean)
  File "c:\users\lmichaelis\xxx\.venv\lib\site-packages\mkdocs\commands\build.py", line 311, in build
    env = config.plugins.run_event('env', env, config=config, files=files)
  File "c:\users\lmichaelis\xxx\.venv\lib\site-packages\mkdocs\plugins.py", line 520, in run_event
    result = method(item, **kwargs)
  File "c:\users\lmichaelis\xxx\.venv\lib\site-packages\mkdocstrings\plugin.py", line 242, in on_env
    for page, identifier in collections.ChainMap(*(fut.result() for fut in self._inv_futures)).items():
  File "C:\Users\lmichaelis\AppData\Local\Programs\Python\Python39\lib\concurrent\futures\_base.py", line 439, in result
  File "C:\Users\lmichaelis\AppData\Local\Programs\Python\Python39\lib\concurrent\futures\_base.py", line 391, in __get_result
    raise self._exception
  File "C:\Users\lmichaelis\AppData\Local\Programs\Python\Python39\lib\concurrent\futures\thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "c:\users\lmichaelis\xxx\.venv\lib\site-packages\mkdocstrings\plugin.py", line 299, in _load_inventory
    result = dict(loader(content, url=url, **kwargs))
  File "c:\users\lmichaelis\xxx\.venv\lib\site-packages\mkdocstrings_handlers\python\handler.py", line 189, in load_inventory
    for item in Inventory.parse_sphinx(in_file, domain_filter=("py",)).values():  # noqa: WPS526
  File "c:\users\lmichaelis\xxx\.venv\lib\site-packages\mkdocstrings\inventory.py", line 129, in parse_sphinx
    items = [InventoryItem.parse_sphinx(line.decode("utf8")) for line in lines]
  File "c:\users\lmichaelis\xxx\.venv\lib\site-packages\mkdocstrings\inventory.py", line 129, in <listcomp>
    items = [InventoryItem.parse_sphinx(line.decode("utf8")) for line in lines]
  File "c:\users\lmichaelis\xxx\.venv\lib\site-packages\mkdocstrings\inventory.py", line 56, in parse_sphinx
    raise ValueError(line)
ValueError: index std:doc -1  Werkzeug

Although I encountered this error while using the new Python handler, the traceback shows that it originates in mkdocstrings itself.

To Reproduce

Steps to reproduce the behavior:

  1. Set up a basic mkdocstrings configuration
  2. Add an affected inventory (such as the one in the configuration below) as an import to the Python handler
  3. Run mkdocs build or mkdocs serve
My Configuration
  - mkdocstrings:
     default_handler: python
     handlers:
       python:
         paths: ['.']
         import:
           - 'https://werkzeug.palletsprojects.com/en/2.0.x/objects.inv'
         options:
           show_root_heading: false
           docstring_style: 'google'
           show_source: true
           show_root_toc_entry: false
           heading_level: 3

Expected behavior

The inventory is downloaded, parsed and used to link to external documentation.

Information (please complete the following information):

  • OS: Microsoft Windows, Ubuntu Linux
  • Browser: n/a
  • mkdocstrings version: 0.19.0

Additional context

I have narrowed it down to the regex used in inventory.py:

sphinx_item_regex = re.compile(r"^(.+?)\s+(\S+):(\S+)\s+(-?\d+)\s+(\S+)\s+(.*)$")

It is used in parse_sphinx() to parse each line of the inventory, and it fails to match the line shown at the end of the traceback (index std:doc -1  Werkzeug):

@classmethod
def parse_sphinx(cls, line: str) -> "InventoryItem":
    """Parse a line from a Sphinx v2 inventory file and return an `InventoryItem` from it."""
    match = cls.sphinx_item_regex.search(line)
    if not match:
        raise ValueError(line)
    name, domain, role, priority, uri, dispname = match.groups()
    if uri.endswith("$"):
        uri = uri[:-1] + name
    if dispname == "-":
        dispname = name
    return cls(name, domain, role, uri, priority, dispname)
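
To see which lines of an inventory are affected, something like the following rough sketch could help. It assumes the usual Sphinx v2 layout (four plain-text header lines followed by a zlib-compressed body) and uses the regex quoted above. The helper name `unmatched_lines` and the in-memory sample inventory are made up for illustration and are not part of mkdocstrings; only the first item line is taken from the traceback.

```python
import re
import zlib

# Regex copied from the issue text (mkdocstrings 0.19.0, inventory.py).
SPHINX_ITEM_REGEX = re.compile(r"^(.+?)\s+(\S+):(\S+)\s+(-?\d+)\s+(\S+)\s+(.*)$")


def unmatched_lines(raw: bytes) -> list[str]:
    """Return the inventory lines that the regex above does not match.

    Hypothetical helper for debugging, not part of mkdocstrings.
    """
    # Assumed layout: four plain-text header lines, then a zlib-compressed
    # body with one inventory item per line.
    *_header, body = raw.split(b"\n", 4)
    lines = zlib.decompress(body).decode("utf8").splitlines()
    return [line for line in lines if line and not SPHINX_ITEM_REGEX.search(line)]


# Hypothetical in-memory inventory so the sketch runs without network access.
# The first item is the line from the traceback; the second is invented.
header = (
    b"# Sphinx inventory version 2\n"
    b"# Project: Example\n"
    b"# Version: 0.0\n"
    b"# The remainder of this file is compressed using zlib.\n"
)
items = "index std:doc -1  Werkzeug\nexample.func py:function 1 api/#$ -\n"
sample = header + zlib.compress(items.encode("utf8"))

print(unmatched_lines(sample))
```

For a real inventory, the bytes of a downloaded objects.inv file would take the place of `sample`.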

Altering one character of the regex seems to fix it, though I am not sure what other effects the change might have:

Before: ^(.+?)\s+(\S+):(\S+)\s+(-?\d+)\s+(\S+)\s+(.*)$
After:  ^(.+?)\s+(\S+):(\S+)\s+(-?\d+)\s+(\S+)\s*(.*)$
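
As a quick check of the two patterns, here is a minimal sketch that runs both against the failing line from the traceback and against one made-up, well-formed line (the `example.func` entry is hypothetical). The names `OLD`, `NEW`, `failing` and `regular` exist only in this example.

```python
import re

# "Before" and "After" patterns exactly as quoted above.
OLD = re.compile(r"^(.+?)\s+(\S+):(\S+)\s+(-?\d+)\s+(\S+)\s+(.*)$")
NEW = re.compile(r"^(.+?)\s+(\S+):(\S+)\s+(-?\d+)\s+(\S+)\s*(.*)$")

failing = "index std:doc -1  Werkzeug"  # line from the traceback
regular = "example.func py:function 1 api/#$ -"  # hypothetical well-formed line

old_match = OLD.search(failing)
new_match = NEW.search(failing)

print(old_match)           # None: the old pattern rejects the line
print(new_match.groups())  # the new pattern accepts it

# The well-formed line is parsed identically by both patterns.
print(OLD.search(regular).groups() == NEW.search(regular).groups())
```

One side effect worth noting: with the altered pattern the failing line yields the groups ('index', 'std', 'doc', '-1', 'Werkzeug', ''), so "Werkzeug" ends up in the URI group and the display name is empty. If the original line really has an empty URI followed by the display name, the resulting item would parse without error but carry a wrong URI.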
pawamoy pushed a commit that referenced this issue Dec 13, 2022
Some Sphinx inventories don't match
the `sphinx_item_regex` defined in `InventoryItem`.
Allowing any number of whitespace characters at the end
instead of requiring at least one fixes this issue.

Co-authored-by: Luis Michaelis <luis.michaelis@iee.fraunhofer.de>
Issue #496: #496
PR #497: #497