Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix string parsing with newline #123

Merged

Conversation

felixonmars
Copy link
Contributor

When looping over a joined str, if a node is ast.Str and the value is
just a newline "\n", the write() function adds an additional indentation
after it, which fails to represent the original string. By calling
self.result.append() here directly the issue is resolved.

The added test could show the issue. With code_gen unmodifed, it fails
with the following error:

AssertionError: "if 1:\n    x = f'{host}\\n\\t{port}\\n    '" != "if
1:\n    x = f'{host}\\n\\t{port}\\n'"
  if 1:
      -     x = f'{host}\n\t{port}\n    '?
      ----
      +     x = f'{host}\n\t{port}\n'

Which is exactly the problem.

This fixes parsing issues with many of Python 3.7's stdlib. Namely the
following ones:

/usr/lib/python3.7/warnings.py
/usr/lib/python3.7/netrc.py
/usr/lib/python3.7/test/test_embed.py
/usr/lib/python3.7/test/support/testresult.py
/usr/lib/python3.7/idlelib/grep.py

When looping over a joined str, if a node is ast.Str and the value is
just a newline "\n", the write() function adds an additional indentation
after it, which fails to represent the original string. By calling
self.result.append() here directly the issue is resolved.

The added test could show the issue. With code_gen unmodifed, it fails
with the following error:

```
AssertionError: "if 1:\n    x = f'{host}\\n\\t{port}\\n    '" != "if
1:\n    x = f'{host}\\n\\t{port}\\n'"
  if 1:
      -     x = f'{host}\n\t{port}\n    '?
      ----
      +     x = f'{host}\n\t{port}\n'
```

Which is exactly the problem.

This fixes parsing issues with many of Python 3.7's stdlib. Namely the
following ones:

/usr/lib/python3.7/warnings.py
/usr/lib/python3.7/netrc.py
/usr/lib/python3.7/test/test_embed.py
/usr/lib/python3.7/test/support/testresult.py
/usr/lib/python3.7/idlelib/grep.py
Copy link
Owner

@berkerpeksag berkerpeksag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please 1) add your name to AUTHORS 2) add a note to docs/changelog.rst?

@@ -566,7 +566,7 @@ def visit_Str(self, node, is_joined=False):
def recurse(node):
for value in node.values:
if isinstance(value, ast.Str):
self.write(value.s)
self.result.append(value.s)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'\n' is already special-cased in the write() method:

astor/astor/code_gen.py

Lines 171 to 172 in d9f6679

elif item == '\n':
newline()

What do you think about fixing this in the write() method? (or even in the newline() method) Perhaps we could handle the '\n' case in the newline() method:

if node is None and extra == 0:
    self.result.append('\n')

Perhaps we could even add a new without_indentation keyword argument to minimize breaking user code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit unsure why it was special-cased before - If I remove the special case at all, all the tests pass including the new one. I tried to look up git history but didn't find valuable info :(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it now - so I think writing self.newline in the cases where indentation is needed would be more explicit. Tests green again :)

Copy link
Owner

@berkerpeksag berkerpeksag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks pretty good. Could you please fix the merge conflict? (sorry about that, by the way! I just merged #125.)

@felixonmars
Copy link
Contributor Author

Fixed :)

@berkerpeksag berkerpeksag merged commit c72349f into berkerpeksag:master Mar 22, 2019
@berkerpeksag
Copy link
Owner

Thank you!

@felixonmars felixonmars deleted the fix-parse-string-with-newline branch March 22, 2019 17:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants