Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v2.12 Regression] Asterisk argument mising-param-doc vs anomalous-backslash-in-string #5406

Closed
kevinoid opened this issue Nov 26, 2021 · 24 comments · Fixed by #5464 or #6212
Closed
Assignees
Labels
False Positive 🦟 A message is emitted but nothing is wrong with the code Regression
Milestone

Comments

@kevinoid
Copy link

kevinoid commented Nov 26, 2021

Bug description

Take the following function:

def echo(*args):
    """
    Print the given arguments.
    """
    print(*args)

The args parameter can be added to the docstring (in Sphinx style) in 3 ways:

  1. :param args: Arguments to print.
    This works with sphinx and pylint 2.11.1 and earlier (before Fix asterisks parsing of mising-param-doc #5175). With pylint 2.12.0 and later it causes the following errors

    echo.py:4:0: W9015: "*args" missing in parameter documentation (missing-param-doc)
    echo.py:4:0: W9017: "args" differing in parameter documentation (differing-param-doc)

  2. :param *args: Arguments to print.
    This works with pylint, but causes the following error with sphinx:

    echo.py:docstring of echo.echo:3:Inline emphasis start-string without end-string.

  3. :param \*args: Arguments to print.
    This works with sphinx, but causes the following error with pylint:

    W1401: Anomalous backslash in string: '*'. String constant might be missing an r prefix. (anomalous-backslash-in-string)

Configuration

[MASTER]
load-plugins=pylint.extensions.docparams

Command used

pylint echo.py

Pylint output

For method 1 (with pylint 2.12.1):

echo.py:4:0: W9015: "*args" missing in parameter documentation (missing-param-doc)
echo.py:4:0: W9017: "args" differing in parameter documentation (differing-param-doc)

For method 2:

No errors.

For method 3:

echo.py:8:11: W1401: Anomalous backslash in string: '\*'. String constant might be missing an r prefix. (anomalous-backslash-in-string)

Expected behavior

I would expect pylint to accept the syntaxes that Sphinx does (1 and 3 from above).

I'd prefer if pylint rejected syntaxes which Sphinx rejects (2 from above). However, since Sphinx accepts asterisk-without-backslash for Google- and NumPy-style docstrings (as requested in #3733) if these can't be easily distinguished by pylint, it'd probably be preferable to accept them all.

Note: This issue has affected other linters and it was noted that sphinx.ext.napoleon (which adds support for NumPy and Google style docstrings) auto-escapes * at the beginning of arguments: peterjc/flake8-rst-docstrings#18 (comment) This is likely why those styles differ from Sphinx style in what is accepted.

Pylint version

pylint 2.12.1
Sphinx 4.3.0

OS / Environment

Debian GNU/Linux bookworm/sid

Additional dependencies

No response

@kevinoid kevinoid added Bug 🪲 Needs triage 📥 Just created, needs acknowledgment, triage, and proper labelling labels Nov 26, 2021
@DanielNoord
Copy link
Collaborator

Thanks for the detailed report! I think this should be fixable as all styles have different checkers.

Do you have any good links on Sphinx style documentation I could link to in the docstring of the new test cases? I think the main cause of this regression was that I did not know what style required what and wasn't able what Sphinx requires. I know I found it quite difficult to find authoritative style guides for some of these edge cases for all three styles.

@DanielNoord DanielNoord self-assigned this Nov 26, 2021
@DanielNoord DanielNoord added False Positive 🦟 A message is emitted but nothing is wrong with the code Regression and removed Bug 🪲 Needs triage 📥 Just created, needs acknowledgment, triage, and proper labelling labels Nov 26, 2021
@kevinoid
Copy link
Author

Thanks for the fast response @DanielNoord!

Do you have any good links on Sphinx style documentation [...] I know I found it quite difficult to find authoritative style guides for some of these edge cases for all three styles.

You and me both! Unfortunately, I don't. I actually changed my docstrings to the second syntax before I realized sphinx-build would complain. I'll investigate a bit and open a docs issue with Sphinx if I can't find any.

@kevinoid
Copy link
Author

I'll investigate a bit and open a docs issue with Sphinx if I can't find any.

No luck. I've opened sphinx-doc/sphinx#9893.

@DanielNoord
Copy link
Collaborator

Thanks! Let's await their response before applying a fix here. Perhaps they will also start allowing it, that would make sense to me.

@mwchase
Copy link

mwchase commented Nov 30, 2021

I think I encountered this. The codebase in question has other checkers, so I had to add r to the string and use \**kwargs. This raises just W9015, and not W9017, so there's some kind of underlying asymmetry there.

@avylove
Copy link
Contributor

avylove commented Dec 1, 2021

To expand on what @mwchase said

We originally had kwargs without the prepended **

def wrap(self, text, width=None, **kwargs):
  """
  :arg kwargs: See :py:class:`textwrap.TextWrapper`
  """

With 2.12.1 this raises

blessed/terminal.py:1155: [W9015(missing-param-doc), Terminal.wrap] "**kwargs" missing in parameter documentation
blessed/terminal.py:1155: [W9017(differing-param-doc), Terminal.wrap] "kwargs" differing in parameter documentation

I think this is a style choice as to if the asterisks should be included and that probably needs a Boolean option. But I don't have a problem including them.

The correct way to fix this is a single backslash to tell Sphinx the asterisks are literal and not formatting and the r prefix because pydocstyle requires it, which is appropriate.

def wrap(self, text, width=None, **kwargs):
  r"""
  :arg \**kwargs: See :py:class:`textwrap.TextWrapper`
  """

The problem is Pylint still raises

blessed/terminal.py:1155: [W9015(missing-param-doc), Terminal.wrap] "**kwargs" missing in parameter documentation

It looks like there's some inconsistency in how this is being checked and the backslash isn't being handled correctly in the parameter names.

@DanielNoord
Copy link
Collaborator

I think we should be able to make pylint recognize the \ as being an "escape character". pylint would then allow either **kwargs and \**kwargs.
Personally it makes sense to me to require the asterisks as otherwise the docstring doesn't provide that information (thus kwargs would still be disallowed).

Would this be an okay solution for all involved?

@avylove
Copy link
Contributor

avylove commented Dec 1, 2021

That works for me

@kevinoid
Copy link
Author

kevinoid commented Dec 1, 2021

I think we should be able to make pylint recognize the \ as being an "escape character". pylint would then allow either **kwargs and \**kwargs.

Since sphinx doesn't currently accept sphinx-style :param *args: without a backslash (it causes "Inline emphasis start-string without end-string.") I'd prefer if pylint did not accept it either, but it's not a big deal to me. Is the rationale for accepting it based on ease of implementation, or in hopes that sphinx will eventually support it as they do in Google and NumPy styles, or something else?

Requiring asterisks makes sense to me.

@mwchase
Copy link

mwchase commented Dec 1, 2021

I assume it works in Google and NumPy style because Napoleon can handle the escaping during pre-processing. Like, in https://github.com/sphinx-contrib/napoleon/blob/master/sphinxcontrib/napoleon/docstring.py#L333. If some extension could support the same rewrites in :param : and :type : context, then backslashes would be unnecessary and that would be a potential workaround.

@DanielNoord
Copy link
Collaborator

I think this should do the trick:

diff --git a/pylint/extensions/_check_docs_utils.py b/pylint/extensions/_check_docs_utils.py
index 18c2ec0b..8edde4f7 100644
--- a/pylint/extensions/_check_docs_utils.py
+++ b/pylint/extensions/_check_docs_utils.py
@@ -280,9 +280,9 @@ class SphinxDocstring(Docstring):
         \s+
         )?
 
-        (\*{{0,2}}\w+)          # Parameter name with potential asterisks
-        \s*                     # whitespace
-        :                       # final colon
+        (\\\*{1,2}\w+|(?<=\s)\w+)   # Parameter name with potential asterisks
+        \s*                         # whitespace
+        :                           # final colon
         """
     re_param_in_docstring = re.compile(re_param_raw, re.X | re.S)

I'm waiting on #5450 to be merged to make use of the functional test framework to test this. I think we should be able to still fit this in 2.12.2.

@fmigneault
Copy link

This does not look like it should be resolved yet.

Although \* and \** seem to be handled now, having the args and kwargs as is without asterisk has been and still is a very common and often employed method to document those arguments.
Because there is not a different warning than W9017 specifically for the asterisk handling, the only option to move beyond version 2.12 is to disable W9017 entirely, which could cause some real cases to be missed.

Most developers, including myself, will not want to update entire code bases to add * in docstrings just to please pylint.

@DanielNoord
Copy link
Collaborator

@fmigneault Although I agree that this might be a bit of work for any existing codebase, I do think that requiring ** or * before args or kwargs makes sense.

There is no real authority on this (and there has been no response to sphinx-doc/sphinx#9893), but there is an example within the Sphinx ecosystem which says that they should be there, see this example. The same goes for Numpy, which style guide states that they should be included here.

Therefore, the main docstyle guides (we support) seem to lean towards actually requiring them. It is a shame that pylint did not require them previously but I consider that a false negative: we should have warned about it and are doing so now.
Note that #5815 still needs to be resolved.

Obviously I am open to any counter-arguments as to why they shouldn't be required, but reducing the amount of false negatives (or introducing a new checker) will sadly always require some work from users that relied on pylint not correctly flagging those errors before.

@fmigneault
Copy link

this might be a bit of work

It is not just a little work, it is massive work. Every code base use *args and **kwags profusely. This is a lot of modifications for something that effectively changes nothing in the final result when tools are able to parse them regardless if * are present or not.

Given that * and ** are always handled as special cases across different tools/extensions and across the various documentation styles, I don't think it is a good idea to enforce a specific format when both were historically permitted interchangeably. The fact that some styles needs to be escaped with \ (eg: :param *args: ...) while other like Google style don't because they are nested under Arguments section makes enforcing * even more inconsistent.

At the very least, *args and **kwags should be handled separately than W9015/W9017 and raise a new warning. Something like indicating "argument should be prefixed by asterisk(s)". This would allow code bases to add this warning to ignore cases if we don't care, but currently we have to disable missing parameter warnings altogether, possibly causing more false negatives than before.

@DanielNoord
Copy link
Collaborator

It is not just a little work, it is massive work. Every code base use *args and **kwags profusely. This is a lot of modifications for something that effectively changes nothing in the final result when tools are able to parse them regardless if * are present or not.

Although I can understand that you would be hesitant to do so I think with some regular expressions and replace all you should be able to fix most cases (and then let pylint notify you of any you missed). For example, in the commit and project in which you mentioned this issue searching for param: kwargs and param: args should help automate some of this process.

Given that * and ** are always handled as special cases across different tools/extensions and across the various documentation styles, I don't think it is a good idea to enforce a specific format when both were historically permitted interchangeably. The fact that some styles needs to be escaped with \ (eg: :param *args: ...) while other like Google style don't because they are nested under Arguments section makes enforcing * even more inconsistent.

The necessity of escaping ** comes from Sphinx docstring extensions that interpret * and ** as .rst "keywords" (I believe).
That is the only special casing I see in the style guides I mentioned: both require them to be present, however for certain Sphinx extensions they need to be escaped.

At the very least, *args and **kwags should be handled separately than W9015/W9017 and raise a new warning. Something like indicating "argument should be prefixed by asterisk(s)". This would allow code bases to add this warning to ignore cases if we don't care, but currently we have to disable missing parameter warnings altogether, possibly causing more false negatives than before.

I think there is a reason why both previously mentioned style guides requires the * to be present: it tells you something about the type/form of the parameter. Perhaps somebody wants to take the time to see if they can separate the functionality into its own checker and we can review that PR, but I must say I'm hesitant to approve it. For me the * is part of the parameter name/description and should therefore be included in the docstring.
Just another thought, if the goal of the docstring is to document the function and its signature you would need the asterisks to be able to roundtrip to the actual signature.

@fmigneault
Copy link

For me the * is part of the parameter name/description and should therefore be included in the docstring.

I disagree with that. When the parameter is used in the code, only the parameter name itself is used without *.
The * and ** only indicates how to aggregate remaining parameters provided during the call, and store them in that named parameter.
So the parameter name can be seen from two contexts, how it is used in the call and how it is used in the code, but the * is a modifier operation on the parameter, not part of the name itself.
This subjectivity is perfectly reflected with how both conventions with/without * can be seen valid by different tools.

Many IDE (eg: PyCharm) explicitly warns about missing parameters in docstrings when adding the *. This demonstrates that was expected is entirely a personal choice by different tools, sphinx extensions, etc., and given that, like it was mentioned "There is no real authority on this", pylint shouldn't take a specific stance on which approach to chose, leaving the option to use either naming conventions.

@DanielNoord
Copy link
Collaborator

Many IDE (eg: PyCharm) explicitly warns about missing parameters in docstrings when adding the *. This demonstrates that was expected is entirely a personal choice by different tools, sphinx extensions, etc., and given that, like it was mentioned "There is no real authority on this", pylint shouldn't take a specific stance on which approach to chose, leaving the option to use either naming conventions.

This would be a reason for me to allow no-asterisks: incompatibility with IDEs is indeed a problem.

@Pierre-Sassoulas also uses PyCharm. Do you see the same problem?

@Pierre-Sassoulas
Copy link
Member

Pierre-Sassoulas commented Mar 25, 2022

Pycharm raises

  • nothing for args
  • Unexpected parameter *args in docstring for *args
  • Unexpected parameter \*args in docstring and PEP 8: W605 invalid escape sequence '\*' (embedded pycodestyle?) for \*args.

To be honest I'm not using this style of docstring myself which is why I did not realize this before.

@vincentschut
Copy link

vincentschut commented Mar 28, 2022

Chiming in as a regular pylint user: we here (@ satelligence) would also appreciate it if the possibility to omit the */** from the docstrings would return. Either optionally, or by default. Until this is resolved, we'll probably stick to pylint 2.11.x

@DanielNoord DanielNoord reopened this Mar 28, 2022
@DanielNoord
Copy link
Collaborator

Sorry @vincentschut and @fmigneault I should have indeed reopened this.

I'm adding this to 2.14 to make sure we get to it before releasing a new minor version.

I'm quite busy myself, so I won't get to this soon (and won't be able to get it in a patch version). But if someone is willing to open a PR I think we can review this quickly and get it in 2.13.3.

@DanielNoord DanielNoord modified the milestones: 2.12.2, 2.14.0 Mar 28, 2022
@vincentschut
Copy link

Thanks for reopening.
If it is as simple as replacing line 272 in _check_doc_utils.py from ((\\\*{{1,2}}\w+)|(\w+)) # Parameter name with potential asterisks to ((\\\*{{0,2}}\w+)|(\w+)) # Parameter name with potential asterisks, I can make a PR. If there's more to it, I'll probably rather wait for 2.14...

@DanielNoord
Copy link
Collaborator

I think that should do the trick. We just need a test as well, which can probably be based on some of the other tests. Searching for *args in the test directory should point you to the current test files I think.

@Pierre-Sassoulas Pierre-Sassoulas modified the milestones: 2.14.0, 2.13.3, 2.13.4, 2.13.5 Mar 28, 2022
@adamcunnington-mlg
Copy link

Hi there, please can you confirm what the usage will be following this issue - I've struggled to follow the conversation exactly.

How should we document *args & **kwargs following this release?

@DanielNoord
Copy link
Collaborator

DanielNoord commented Apr 7, 2022

Hi there, please can you confirm what the usage will be following this issue - I've struggled to follow the conversation exactly.

How should we document *args & **kwargs following this release?

I have just opened a PR with a fix for the issues discussed here.
After that has been merged and released in a patch release we should now accept:
\**kwargs, **kwargs and kwargs.

@Pierre-Sassoulas Pierre-Sassoulas added this to the 2.13.6 milestone Apr 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
False Positive 🦟 A message is emitted but nothing is wrong with the code Regression
Projects
None yet
8 participants