Fix bad formatting of error messages about EOF in multi-line statements #2343

tanvimoharir · 2021-06-20T14:57:09Z

tanvimoharir · 2021-06-20T14:58:56Z

src/black/parsing.py

+        except TokenError as te:
+            lineno, column = te.args[1]
+            lines = src_txt.splitlines()
+            try:
+                faulty_line = lines[lineno - 1]
+            except IndexError:
+                faulty_line = "<line number missing in source>"
+            exc = InvalidInput(f"Cannot parse: {lineno}:{column}: {faulty_line}")
+


I have tried to keep the format consistent with ParseError above, but I still need to correct the values for lineno, column, and faulty_line (which are currently wrong).

I think we can do without the duplication of code by conditionalizing (made that up right now :p) the lineno. and col. information extraction code. So in the end it would look something like this (warning: untested):

try:
    result = drv.parse_string(src_txt, True)
    break
except (ParseError, TokenError) as err:
    if isinstance(err, ParseError):
        lineno, column = err.context[1]
    else:
        lineno, column = err.args[1]
    lines = src_txt.splitlines()
    try:
        faulty_line = lines[lineno - 1]
    except IndexError:
        faulty_line = "<line number missing in source>"
    exc = InvalidInput(f"Cannot parse: {lineno}:{column}: {faulty_line}")
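The suggestion above can be tried out without Black installed. The sketch below uses hypothetical stand-in classes that only mimic where each exception keeps its position (`context[1]` for `ParseError`, `args[1]` for `TokenError`, as in the snippet above). `locate_error` and `format_parse_failure` are illustrative names, not part of Black, and the positions passed in are made up for the demonstration.

```python
class ParseError(Exception):
    """Stand-in (assumption): keeps (prefix, (lineno, column)) in .context."""

    def __init__(self, msg, context):
        super().__init__(msg)
        self.context = context


class TokenError(Exception):
    """Stand-in (assumption): raised as TokenError(msg, (lineno, column))."""


def locate_error(err):
    # Pick the position out of whichever attribute the exception type uses.
    if isinstance(err, ParseError):
        return err.context[1]
    return err.args[1]


def format_parse_failure(err, src_txt):
    lineno, column = locate_error(err)
    lines = src_txt.splitlines()
    try:
        faulty_line = lines[lineno - 1]
    except IndexError:
        faulty_line = "<line number missing in source>"
    return f"Cannot parse: {lineno}:{column}: {faulty_line}"


# Made-up positions, chosen only to show both branches of locate_error.
print(format_parse_failure(ParseError("bad input", ("", (1, 7))), "print([)"))
print(format_parse_failure(TokenError("EOF in multi-line statement", (1, 0)), "print("))
```

Keeping the type check in one small helper means the message formatting is written once, which is the duplication the comment is trying to avoid.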

Thanks I will try this. I had initially thought of making changes here https://github.com/psf/black/blob/main/src/blib2to3/pgen2/tokenize.py#L177

ichard26

Hi there,

Thanks for the PR! It's great to see this getting closer to being fixed. I looked over your work and I have a few suggestions and requests to make, so let's get started:

It would be extra fantastic to catch and print out a prettier error message for IndentationError as well. The main difficulty would probably be getting the line no. and col. information out of the exception object since this exception is a built-in exception and has a different argument setup than TokenError (or ParseError) which are custom. Although I wouldn't mind deferring IndentationError because the current error message isn't that bad actually:

~ via Python v3.8.5 took 1m10s232ms 
123❯ cat test.py
def t():
    hello = "world"

  a = 123

~ via Python v3.8.5 
1❯ black test.py --check
error: cannot format test.py: unindent does not match any outer indentation level (<tokenize>, line 4)
Oh no! 💥 💔 💥
1 file would fail to reformat.

Here's the only place where IndentationError is raised in blib2to3:

black/src/blib2to3/pgen2/tokenize.py

Line 522 in 1bedc17

raise IndentationError(
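On getting the position out of the exception object: `IndentationError` subclasses the built-in `SyntaxError`, which exposes `lineno`, `offset`, and `text` attributes when it is constructed with a `(filename, lineno, offset, text)` details tuple. A minimal sketch, assuming the exception is built that way (the excerpt above truncates the actual `raise`, so the arguments below are illustrative, and reusing the "Cannot parse" format for this error is only a suggestion):

```python
def describe_indentation_error(err: IndentationError) -> str:
    # SyntaxError subclasses carry the position as attributes; args[1] is the
    # whole details tuple, not a (lineno, column) pair as with TokenError.
    faulty_line = (err.text or "").rstrip("\n")
    return f"Cannot parse: {err.lineno}:{err.offset}: {faulty_line}"


# Illustrative values only; they mirror the test.py session above.
err = IndentationError(
    "unindent does not match any outer indentation level",
    ("<tokenize>", 4, 2, "  a = 123\n"),
)
print(describe_indentation_error(err))
```

Because the attributes are part of the built-in exception, this path would not need to know how the tokenizer orders its arguments, only that it follows the standard details-tuple convention.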

Second thing, it would be awesome if a test ensuring TokenError doesn't hit the top-level exception handler would be added, although I wouldn't block on this.

Finally, you'll need a changelog entry in CHANGES.md that describes what your patch does. Don't forget to add your PR number to the end. While you're doing that, you can also add yourself to the AUTHORS.md file!

Oh, by the way, you can use special keywords in the PR description so GitHub will automatically close an issue upon merge. You can read more here: https://docs.github.com/en/github/managing-your-work-on-github/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword. This makes sure any issues that should be closed are in fact closed (and I won't have to go proactively hunt for such issues!), keeping our very chaotic issue tracker ever so slightly less chaotic and scary :) Although, if this PR doesn't end up covering IndentationError, closing the issue might not be the right call.

Thanks for your work! We appreciate it!

ichard26 · 2021-06-21T22:19:12Z


tanvimoharir · 2021-06-22T16:37:29Z


Thanks @ichard26, I will check the changes that might be needed for IndentationError, and I will also be adding tests.
Also, regarding your point about updating CHANGES.md and AUTHORS.md, I thought I could do that once I'm done with all the other changes :)

tanvimoharir · 2021-06-25T09:36:20Z

src/black/parsing.py

+        except (TokenError, ParseError) as err:
+            if isinstance(err, ParseError):
+                lineno, column = err.context[1]
+            else:
+                lineno, column = err.args[1]


I'm trying to understand why I'm getting a line number of 2 for the token error:

(black) bash-3.2$ python -m src.black --check --code "print("
error: cannot format <string>: Cannot parse: 2:0: <line number missing in source>
(black) bash-3.2$ python -m src.black --check --code "print([)"
error: cannot format <string>: Cannot parse: 1:7: print([)

Yeahhh that seems to be a bug with the blib2to3 library. We can just ignore that for this PR, it's better if we can just get this landed so while the error message won't be perfect, it should be better than error: cannot format <string>: ('EOF in multi-line statement', (2, 0)) :)

Although a problem is that the source isn't printed at all which is unfortunate since it would be quite useful. Perhaps it would be good to also include that it was an "unexpected EOF" error in the message?

Good catch tho!
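Why the first command falls back to the placeholder: the reported line number (2) is one past the end of the one-line source, so the `lines[lineno - 1]` lookup raises `IndexError`. A small check, using only the values shown in the output above:

```python
src_txt = "print("
lineno, column = 2, 0  # position reported for the TokenError above

lines = src_txt.splitlines()
print(len(lines))  # the source has a single line
try:
    faulty_line = lines[lineno - 1]
except IndexError:
    # Index 1 does not exist, hence the placeholder in the message.
    faulty_line = "<line number missing in source>"
print(faulty_line)
```

This is why the source line cannot be printed for this error without first correcting the line number that the tokenizer reports.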

tanvimoharir · 2021-07-21T15:47:53Z

@ichard26
Changed the error message for TokenError:

(black) bash-3.2$ python -m src.black --check --code "print("
error: cannot format <string>: Cannot parse: 2:0: Unexpected EOF
(black) bash-3.2$ python -m src.black --check --code "print([)"
error: cannot format <string>: Cannot parse: 1:7: print([)

tanvimoharir · 2021-07-21T15:49:22Z

src/black/parsing.py

+
+        except TokenError as te:
+            lineno, column = te.args[1]
+            lines = src_txt.splitlines()
+            exc = InvalidInput(f"Cannot parse: {lineno}:{column}: Unexpected EOF")
+


I included it as a separate block since that seemed more readable to me than adding multiple isinstance(err, TokenError) checks to the block above where we handle ParseError.

Fair enough, I agree the duplication isn't a problem here.

tanvimoharir · 2021-07-21T15:51:07Z

@ichard26 Per your comment above about adding a test to ensure TokenError doesn't hit the top-level exception handler
Could you please explain this a little more?

ichard26 · 2021-08-12T02:17:03Z

@ichard26 Per your comment above about adding a test to ensure TokenError doesn't hit the top-level exception handler

Sorry I took so long to get back to you on this. Basically, I want a test that throws this input at Black and asserts that TokenError isn't the error thrown. How you check for that depends on the level you do it at: running black at the command line (via BlackRunner) or calling black.format_str directly. At the command line level, checking the error output is your only option, but with black.format_str you can use something fancier like pytest.raises(black.parsing.InvalidInput). There are some good examples of both options in tests/test_black.py. If you need more help or have questions, lemme know!
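A rough shape for the `black.format_str`-level check, sketched with hypothetical stand-ins so it runs without Black installed: `fake_format_str` and both exception classes below are placeholders, and a real test would call `black.format_str` and expect `black.parsing.InvalidInput` instead. It uses `unittest`'s `assertRaises`, the stdlib counterpart to `pytest.raises`.

```python
import unittest


class TokenError(Exception):
    """Placeholder for the tokenizer's exception."""


class InvalidInput(ValueError):
    """Placeholder for the user-facing parse error."""


def fake_format_str(src: str) -> str:
    # Placeholder: pretend the tokenizer hit EOF inside an open bracket,
    # then wrap the low-level error the way the parser is expected to.
    try:
        if src.count("(") > src.count(")"):
            raise TokenError("EOF in multi-line statement", (2, 0))
    except TokenError as te:
        lineno, column = te.args[1]
        raise InvalidInput(f"Cannot parse: {lineno}:{column}: Unexpected EOF") from None
    return src


class UnexpectedEOFTest(unittest.TestCase):
    def test_token_error_is_wrapped(self) -> None:
        # TokenError is not a subclass of InvalidInput, so a leaked
        # TokenError would propagate and fail this test.
        with self.assertRaises(InvalidInput) as ctx:
            fake_format_str("print(")
        self.assertIn("Cannot parse", str(ctx.exception))
```

Saved to a file, this can be run with `python -m unittest`. Asserting on the exception type keeps the test independent of how the command line reports the error.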


tanvimoharir · 2021-08-30T16:40:28Z


Checking tests/test_black.py.

tanvimoharir · 2021-09-08T08:42:15Z

@ichard26 Added two tests at

black/tests/test_black.py

Lines 2201 to 2219 in 813c8dd

    
    def test_code_with_unexpected_eof_error(self) -> None:
        """
        Test that Unexpected EOF error is raised with invalid code
        """
        code = "print("
        args = ["--check", "--code", code]
        error_msg = "error: cannot format <string>: Cannot parse: 2:0: Unexpected EOF\n"
        result = CliRunner().invoke(black.main, args)
        self.compare_results(result, error_msg, 123)

    def test_invalid_input_parsing_error(self) -> None:
        """
        Test with invalid code which throws parsing error
        """
        code = "print([)"
        args = ["--check", "--code", code]
        error_msg = f"error: cannot format <string>: Cannot parse: 1:7: {code}\n"
        result = CliRunner().invoke(black.main, args)
        self.compare_results(result, error_msg, 123)

Please let me know if these are okay or if you were expecting something different.
Thanks!

ichard26

@tanvimoharir hey, sorry for such a long wait. The tests you added were good, but the second one wasn't necessary as we already have tests that cover that area. I made the first (still relevant) test a bit more of a unit test so changes to --code don't affect it down the line.

Finally I tweaked the error message opting to reuse the message already stored in the TokenError, and fixed the merge conflict. Thank you so much and my apologies once again!

Adding TokenError to lib2to3 function

05e50d9

tanvimoharir commented Jun 20, 2021

ichard26 requested changes Jun 21, 2021

Merge branch 'psf:main' into bad-format-error-message

07734f0

tanvimoharir changed the title ~~Adding TokenError to lib2to3 function~~ Fix bad formatting of error messages about EOF in multi-line statements Jun 25, 2021

Removing redundent exception handling for TokenError

179f1ec

tanvimoharir commented Jun 25, 2021

tanvimoharir and others added 2 commits July 21, 2021 20:29

Merge branch 'psf:main' into bad-format-error-message

705e027

Changing message for TokenError

fc40194

tanvimoharir commented Jul 21, 2021

tanvimoharir requested a review from ichard26 July 21, 2021 15:51

ichard26 reviewed Aug 12, 2021

Removing splitlines

b5a3a7b

Adding tests

813c8dd

tanvimoharir requested a review from ichard26 September 8, 2021 08:42

ichard26 added 3 commits December 4, 2021 14:59

Cleanup tests and improve error message

8ded011

Merge branch 'main' into bad-format-error-message

0b3d6db

Move changelog to the right spot

339a173

ichard26 marked this pull request as ready for review December 4, 2021 20:01

ichard26 approved these changes Dec 4, 2021

ichard26 merged commit f52cb0f into psf:main Dec 4, 2021
