Improve f-string expression detection regex so ... (#2437)
we don't accidentally add backslashes to them when normalizing quotes
because that's invalid syntax!

The problem this commit fixes is that each match would eat too much,
preventing later matches from being found. For example, here's one
f-string body:

    {a}{b}{c}

I know there's no risk of introducing backslashes here, but the regex
already goes sideways with this. Throwing this example at regex101
I get:

    {a}{b}{c}   # The a's and b's mark the two matches, and the upper
    ---- ----   # case letters mark the groups within those matches.
    aAaa bbBb

... we've missed the middle expression (so if any backslashes were
introduced there in a more complex example, we wouldn't bail out
even though we should -- hence the bug). As it stands, the regex
needs some sort of extra character (or the start/end of the body)
around each expression, but that isn't always the case, as shown
above.
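
The behaviour described above can be reproduced with a short sketch. The pattern is copied from the lines removed in `src/black/strings.py`; the names `OLD_PATTERN`, `full_matches` and `groups` are only illustrative and do not appear in Black.

```python
import re

# Pre-fix pattern, as shown in the removed lines of the diff.
OLD_PATTERN = r"""
    (?:[^{]|^)\{  # start of the string or a non-{ followed by a single {
        ([^{].*?)  # contents of the brackets except if begins with {{
    \}(?:[^}]|$)  # A } followed by end of the string or a non-}
"""

body = "{a}{b}{c}"

# Full text consumed by each match, including the surrounding characters.
full_matches = [m.group(0) for m in re.finditer(OLD_PATTERN, body, re.VERBOSE)]
# Captured groups, i.e. what re.findall returns for a single-group pattern.
groups = re.findall(OLD_PATTERN, body, re.VERBOSE)

print(full_matches)  # ['{a}{', '}{c}']
print(groups)        # ['a', 'c']
```

The first match consumes the `{` that opens `{b}`, so no later match can start there and `b` is never reported.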

The fix implemented here is to turn the "eat a surrounding non-curly
bracket character" groups, i.e. `(?:[^{]|^)` and `(?:[^}]|$)`, into a
negative lookbehind and a negative lookahead respectively. This still
guarantees the already specified rules, but without problematically
eating extra characters ^^
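
A minimal sketch of the effect, using the pattern from the added lines of the diff. `NEW_PATTERN` and `find_expressions` are hypothetical names chosen for this illustration, not part of Black.

```python
import re
from typing import List

# Post-fix pattern, as shown in the added lines of the diff.
NEW_PATTERN = r"""
    (?:(?<!\{)|^)\{  # start of the string or a non-{ followed by a single {
        ([^{].*?)  # contents of the brackets except if begins with {{
    \}(?:(?!\})|$)  # A } followed by end of the string or a non-}
"""


def find_expressions(body: str) -> List[str]:
    """Return the contents of each replacement field found in an f-string body."""
    return re.findall(NEW_PATTERN, body, re.VERBOSE)


# Adjacent expressions are all found because lookarounds consume nothing.
print(find_expressions("{a}{b}{c}"))        # ['a', 'b', 'c']
# Doubled braces are still treated as literal text, not as an expression.
print(find_expressions("{{literal}} {x}"))  # ['x']
```

Because the lookbehind and lookahead are zero-width, each match covers only `{...}` itself, which leaves the neighbouring braces available for the next match.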
ichard26 committed Aug 23, 2021
1 parent 104aec5 commit 8c04847
Showing 3 changed files with 14 additions and 2 deletions.
2 changes: 2 additions & 0 deletions CHANGES.md
@@ -7,6 +7,8 @@
- Add support for formatting Jupyter Notebook files (#2357)
- Move from `appdirs` dependency to `platformdirs` (#2375)
- Present a more user-friendly error if .gitignore is invalid (#2414)
- The failsafe for accidentally added backslashes in f-string expressions has been
hardened to handle more edge cases during quote normalization (#2437)

### Integrations

4 changes: 2 additions & 2 deletions src/black/strings.py
@@ -190,9 +190,9 @@ def normalize_string_quotes(s: str) -> str:
     if "f" in prefix.casefold():
         matches = re.findall(
             r"""
-            (?:[^{]|^)\{  # start of the string or a non-{ followed by a single {
+            (?:(?<!\{)|^)\{  # start of the string or a non-{ followed by a single {
                 ([^{].*?)  # contents of the brackets except if begins with {{
-            \}(?:[^}]|$)  # A } followed by end of the string or a non-}
+            \}(?:(?!\})|$)  # A } followed by end of the string or a non-}
             """,
             new_body,
             re.VERBOSE,
10 changes: 10 additions & 0 deletions tests/data/string_quotes.py
@@ -51,6 +51,11 @@
'\'{z}\' {y * " "}'
'{y * x} \'{z}\''

# We must bail out if changing the quotes would introduce backslashes in f-string
# expressions. xref: https://github.com/psf/black/issues/2348
f"\"{b}\"{' ' * (long-len(b)+1)}: \"{sts}\",\n"
f"\"{a}\"{'hello' * b}\"{c}\""

# output

""""""
@@ -100,3 +105,8 @@
f"{y * x} '{z}'"
"'{z}' {y * \" \"}"
"{y * x} '{z}'"

# We must bail out if changing the quotes would introduce backslashes in f-string
# expressions. xref: https://github.com/psf/black/issues/2348
f"\"{b}\"{' ' * (long-len(b)+1)}: \"{sts}\",\n"
f"\"{a}\"{'hello' * b}\"{c}\""
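
To see why the second test case must be left unchanged, the sketch below runs both patterns over a candidate body. It rests on two assumptions: `candidate_body` is a hand-derived guess at what the body of `f"\"{a}\"{'hello' * b}\"{c}\""` would look like after switching to single quotes, and `has_backslash_in_expression` is a hypothetical helper standing in for whatever check Black performs on the matches.

```python
import re

OLD_PATTERN = r"""
    (?:[^{]|^)\{  # start of the string or a non-{ followed by a single {
        ([^{].*?)  # contents of the brackets except if begins with {{
    \}(?:[^}]|$)  # A } followed by end of the string or a non-}
"""

NEW_PATTERN = r"""
    (?:(?<!\{)|^)\{  # start of the string or a non-{ followed by a single {
        ([^{].*?)  # contents of the brackets except if begins with {{
    \}(?:(?!\})|$)  # A } followed by end of the string or a non-}
"""


def has_backslash_in_expression(pattern: str, body: str) -> bool:
    """Hypothetical helper: True if any detected expression contains a backslash."""
    return any("\\" in expr for expr in re.findall(pattern, body, re.VERBOSE))


# Assumed (hand-derived) body after swapping to single quotes: the \" escapes
# are dropped and the ' characters inside the expression gain backslashes.
candidate_body = r'''"{a}"{\'hello\' * b}"{c}"'''

old_detects = has_backslash_in_expression(OLD_PATTERN, candidate_body)
new_detects = has_backslash_in_expression(NEW_PATTERN, candidate_body)

print(old_detects)  # False: the middle expression is skipped, so nothing is flagged
print(new_detects)  # True: the middle expression is found and contains a backslash
```

Under these assumptions, only the lookaround-based pattern notices the backslashes inside `{'hello' * b}`, which is the condition under which quote normalization should bail out.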
