Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bad formatting of error messages about EOF in multi-line statements #2343

Merged
merged 10 commits into from Dec 4, 2021
Merged

Fix bad formatting of error messages about EOF in multi-line statements #2343

merged 10 commits into from Dec 4, 2021

Conversation

tanvimoharir
Copy link
Contributor

@tanvimoharir tanvimoharir commented Jun 20, 2021

Closes #2317

Comment on lines 97 to 105
except TokenError as te:
lineno, column = te.args[1]
lines = src_txt.splitlines()
try:
faulty_line = lines[lineno - 1]
except IndexError:
faulty_line = "<line number missing in source>"
exc = InvalidInput(f"Cannot parse: {lineno}:{column}: {faulty_line}")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tried to keep the format consistent with ParseError above but still need to correct the values for lineno,column and faulty_line (which are currently wrong)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can do without the duplication of code by conditionalizing (made that up right now :p) the lineno. and col. information extraction code. So in the end it would look something like this (warning: untested):

        try:
            result = drv.parse_string(src_txt, True)
            break

        except (ParseError, TokenError) as err:
            if isinstance(err, ParseError):
                lineno, column = err.context[1]
            else:
                lineno, column = err.args[1]

            lines = src_txt.splitlines()
            try:
                faulty_line = lines[lineno - 1]
            except IndexError:
                faulty_line = "<line number missing in source>"
            exc = InvalidInput(f"Cannot parse: {lineno}:{column}: {faulty_line}")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks I will try this. I had initially thought of making changes here https://github.com/psf/black/blob/main/src/blib2to3/pgen2/tokenize.py#L177

Copy link
Collaborator

@ichard26 ichard26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi there,

Thanks for the PR! It's great to see this getting closer to being fixed. I looked over your work and I have a few suggestions and requests to make, so let's get started:

It would be extra fantastic to catch and print out a prettier error message for IndentationError as well. The main difficulty would probably be getting the line no. and col. information out of the exception object since this exception is a built-in exception and has a different argument setup than TokenError (or ParseError) which are custom. Although I wouldn't mind deferring IndentationError because the current error message isn't that bad actually:

~ via Python v3.8.5 took 1m10s232ms 
123❯ cat test.py
def t():
    hello = "world"

  a = 123

~ via Python v3.8.5 
1❯ black test.py --check
error: cannot format test.py: unindent does not match any outer indentation level (<tokenize>, line 4)
Oh no! 💥 💔 💥
1 file would fail to reformat.

Here's the only place where IndentationError is raised in blib2to3:

raise IndentationError(

Second thing, it would be awesome if a test ensuring TokenError doesn't hit the top-level exception handler would be added, although I wouldn't block on this.

Finally, you'll need a changelog entry in CHANGES.md that describes what your patch does. Don't forget to add your PR number to the end. While you're doing that, you can also add yourself to the AUTHORS.md file!

Oh, by the way, you can use special keywords in the PR description so GitHub will automatically close an issue upon merge. You can read more here: https://docs.github.com/en/github/managing-your-work-on-github/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword. This makes sure any issue that should be closed are in fact closed (and I won't have to go proactively hunt for such issues!), keeping our very chaotic issue tracker ever so slightly less chaotic and scary :) Although, if this PR doesn't end up covering IndentationError, closing the issue might not be the right call.

Thanks for your work! We appreciate it!

Comment on lines 97 to 105
except TokenError as te:
lineno, column = te.args[1]
lines = src_txt.splitlines()
try:
faulty_line = lines[lineno - 1]
except IndexError:
faulty_line = "<line number missing in source>"
exc = InvalidInput(f"Cannot parse: {lineno}:{column}: {faulty_line}")

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can do without the duplication of code by conditionalizing (made that up right now :p) the lineno. and col. information extraction code. So in the end it would look something like this (warning: untested):

        try:
            result = drv.parse_string(src_txt, True)
            break

        except (ParseError, TokenError) as err:
            if isinstance(err, ParseError):
                lineno, column = err.context[1]
            else:
                lineno, column = err.args[1]

            lines = src_txt.splitlines()
            try:
                faulty_line = lines[lineno - 1]
            except IndexError:
                faulty_line = "<line number missing in source>"
            exc = InvalidInput(f"Cannot parse: {lineno}:{column}: {faulty_line}")

@tanvimoharir
Copy link
Contributor Author

Hi there,

Thanks for the PR! It's great to see this getting closer to being fixed. I looked over your work and I have a few suggestions and requests to make, so let's get started:

It would be extra fantastic to catch and print out a prettier error message for IndentationError as well. The main difficulty would probably be getting the line no. and col. information out of the exception object since this exception is a built-in exception and has a different argument setup than TokenError (or ParseError) which are custom. Although I wouldn't mind deferring IndentationError because the current error message isn't that bad actually:

~ via Python v3.8.5 took 1m10s232ms 
123❯ cat test.py
def t():
    hello = "world"

  a = 123

~ via Python v3.8.5 
1❯ black test.py --check
error: cannot format test.py: unindent does not match any outer indentation level (<tokenize>, line 4)
Oh no! 💥 💔 💥
1 file would fail to reformat.

Here's the only place where IndentationError is raised in blib2to3:

raise IndentationError(

Second thing, it would be awesome if a test ensuring TokenError doesn't hit the top-level exception handler would be added, although I wouldn't block on this.

Finally, you'll need a changelog entry in CHANGES.md that describes what your patch does. Don't forget to add your PR number to the end. While you're doing that, you can also add yourself to the AUTHORS.md file!

Oh, by the way, you can use special keywords in the PR description so GitHub will automatically close an issue upon merge. You can read more here: https://docs.github.com/en/github/managing-your-work-on-github/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword. This makes sure any issue that should be closed are in fact closed (and I won't have to go proactively hunt for such issues!), keeping our very chaotic issue tracker ever so slightly less chaotic and scary :) Although, if this PR doesn't end up covering IndentationError, closing the issue might not be the right call.

Thanks for your work! We appreciate it!

Thanks @ichard26 I will check the changes which might be needed for IndentationError and I will also be adding tests.
Also to your point of updating CHANGES.md and AUTHORS.md,I thought I could do that once I'm done with all the other changes:)

@tanvimoharir tanvimoharir changed the title Adding TokenError to lib2to3 function Fix bad formatting of error messages about EOF in multi-line statements Jun 25, 2021
Comment on lines 88 to 92
except (TokenError, ParseError) as err:
if isinstance(err, ParseError):
lineno, column = err.context[1]
else:
lineno, column = err.args[1]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to understand why i'm getting linenumber value as 2 for token error:

(black) bash-3.2$ python -m src.black --check --code "print("
error: cannot format <string>: Cannot parse: 2:0: <line number missing in source>
(black) bash-3.2$ python -m src.black --check --code "print([)" 
error: cannot format <string>: Cannot parse: 1:7: print([)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeahhh that seems to be a bug with the blib2to3 library. We can just ignore that for this PR, it's better if we can just get this landed so while the error message won't be perfect, it should be better than error: cannot format <string>: ('EOF in multi-line statement', (2, 0)) :)

Although a problem is that the source isn't printed at all which is unfortunate since it would be quite useful. Perhaps it would be good to also include that it was an "unexpected EOF" error in the message?

Good catch tho!

@tanvimoharir
Copy link
Contributor Author

@ichard26
Changed the error message for TokenError,

black) bash-3.2$ python -m src.black --check --code "print("
error: cannot format <string>: Cannot parse: 2:0: Unexpected EOF
(black) bash-3.2$ python -m src.black --check --code "print([)"
error: cannot format <string>: Cannot parse: 1:7: print([)

Comment on lines 96 to 101

except TokenError as te:
lineno, column = te.args[1]
lines = src_txt.splitlines()
exc = InvalidInput(f"Cannot parse: {lineno}:{column}: Unexpected EOF")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I included it as a separate block since that seemed more readable to me instead of adding multiple if (isintance,TokenError) in the above block where we're handling ParseError

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, I agree the duplication isn't a problem here.

@tanvimoharir
Copy link
Contributor Author

@ichard26 Per your comment above about adding a test to ensure TokenError doesn't hit the top-level exception handler
Could you please explain this a little more?

@ichard26
Copy link
Collaborator

@ichard26 Per your comment above about adding a test to ensure TokenError doesn't hit the top-level exception handler

Sorry I took so long to get back to you on this. Basically I want a test which throws this input at Black and asserts that TokenError isn't the error thrown. Now depending on what level you do this at, running black at the command line (via BlackRunner) or calling black.format_str directly, how to check for that changes. At the command line level, checking the error output is your only option, but with black.format_str, using something fancier like pytest.raises(black.parsing.InvalidInput). There's some good examples of both options in tests/test_black.py. If you need more help or have questions, lemme know!

src/black/parsing.py Outdated Show resolved Hide resolved
@tanvimoharir
Copy link
Contributor Author

@ichard26 Per your comment above about adding a test to ensure TokenError doesn't hit the top-level exception handler

Sorry I took so long to get back to you on this. Basically I want a test which throws this input at Black and asserts that TokenError isn't the error thrown. Now depending on what level you do this at, running black at the command line (via BlackRunner) or calling black.format_str directly, how to check for that changes. At the command line level, checking the error output is your only option, but with black.format_str, using something fancier like pytest.raises(black.parsing.InvalidInput). There's some good examples of both options in tests/test_black.py. If you need more help or have questions, lemme know!

Checking tests/test_black.py.

@tanvimoharir
Copy link
Contributor Author

@ichard26 Added two tests at

black/tests/test_black.py

Lines 2201 to 2219 in 813c8dd

def test_code_with_unexpected_eof_error(self) -> None:
"""
Test that Unexpected EOF error is raised with invalid code
"""
code = "print("
args = ["--check", "--code", code]
error_msg = "error: cannot format <string>: Cannot parse: 2:0: Unexpected EOF\n"
result = CliRunner().invoke(black.main, args)
self.compare_results(result, error_msg, 123)
def test_invalid_input_parsing_error(self) -> None:
"""
Test with invalid code which throws parsing error
"""
code = "print([)"
args = ["--check", "--code", code]
error_msg = f"error: cannot format <string>: Cannot parse: 1:7: {code}\n"
result = CliRunner().invoke(black.main, args)
self.compare_results(result, error_msg, 123)

Please let me know if these are okay or if you were expecting something different.
Thanks!

@ichard26 ichard26 marked this pull request as ready for review December 4, 2021 20:01
Copy link
Collaborator

@ichard26 ichard26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tanvimoharir hey sorry for such the long wait. The tests you added were good but the second one wasn't necessary as we already have tests that cover that area. I made the first (still relevant) test a bit more of an unit test so changes to --code don't affect it down the line.

Finally I tweaked the error message opting to reuse the message already stored in the TokenError, and fixed the merge conflict. Thank you so much and my apologies once again!

@ichard26 ichard26 merged commit f52cb0f into psf:main Dec 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bad formatting of error messages about EOF in multi-line statements
2 participants