Support named escapes (`\N{...}`) in string processing #2319

Jackenmen · 2021-06-09T09:29:53Z

Fixes #1468, or at least the issue described in the comment there; the two turned out to be less closely related than I initially thought.

I haven't added an awful lot of tests here, but I'm not sure there's a need for more.

felix-hilden

Hi, thanks for submitting! I'm not familiar with this part of the code base, so take this with a grain of salt, but I left some thoughts below. I'd like the more experienced maintainers to take a look too.

src/black/trans.py

tests/data/long_strings.py

Co-authored-by: Felix Hildén <felix.hilden@gmail.com>

src/black/trans.py

felix-hilden

In my opinion this looks much better now, thanks a ton for being so speedy to modify!

JelleZijlstra · 2021-06-09T13:59:33Z

I wonder if this could be generalized into other string pieces that can't be split, so we don't need special-case logic for the different kinds. Instead, we could just generate a single list of unsplittable slices.

felix-hilden · 2021-06-09T14:00:10Z

Yep it should definitely be done!

Jackenmen · 2021-06-09T14:17:27Z

> I wonder if this could be generalized into other string pieces that can't be split, so we don't need special-case logic for the different kinds. Instead, we could just generate a single list of unsplittable slices.

I can look into this, but before I do, I'd like to know whether we care that some spans might overlap. I could either build a single list of slices from all the functions that generate them (currently `_get_nameescape_slices` and `fexpr_slices`) and use it as-is, or go further: sort it and build a new list of non-overlapping spans.
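For reference, the second option (sorting and collapsing overlapping spans) could look roughly like the sketch below. The name `merge_spans` and the half-open `(start, end)` convention are assumptions made for illustration; they are not taken from `src/black/trans.py`.

```python
from typing import Iterable, List, Tuple

Span = Tuple[int, int]  # hypothetical half-open (start, end) index pair


def merge_spans(spans: Iterable[Span]) -> List[Span]:
    """Return sorted, non-overlapping spans covering the same indices."""
    merged: List[Span] = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous span: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


print(merge_spans([(5, 9), (0, 3), (2, 6), (12, 15)]))
```

The sort dominates the cost, so this is O(n log n) in the number of spans.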

JelleZijlstra · 2021-06-09T14:20:13Z

Overlap seems fine with how we're using these slices. Actually it may be more efficient to generate a set of illegal indices, since the current code looks quadratic to me. Which reminds me I should run some profiling for #2314.
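A minimal sketch of the set-of-illegal-indices idea, restricted to `\N{...}` escapes. Everything here is an assumption for illustration: the names `NAMED_ESCAPE`, `iter_named_escape_spans`, and `illegal_split_indices` are hypothetical, and "splitting at index `i`" is taken to mean cutting the text into `text[:i]` and `text[i:]`. It is not the code that was merged.

```python
import re
from typing import Iterator, Set, Tuple

# "\N{...}" counts as a named escape only when the backslash before "N" is
# not itself escaped, i.e. it is preceded by an even number of backslashes.
NAMED_ESCAPE = re.compile(r"(?<!\\)(?:\\\\)*(\\N\{[^{}]*\})")


def iter_named_escape_spans(text: str) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) spans of each named escape in text."""
    for match in NAMED_ESCAPE.finditer(text):
        yield match.span(1)


def illegal_split_indices(text: str) -> Set[int]:
    """Collect every index where a split would land inside an escape."""
    illegal: Set[int] = set()
    for start, end in iter_named_escape_spans(text):
        # Cutting at start or at end leaves the escape whole, so only the
        # indices strictly between them are illegal.
        illegal.update(range(start + 1, end))
    return illegal


text = r"price: \N{DOLLAR SIGN}5"
illegal = illegal_split_indices(text)
print(7 in illegal, 8 in illegal)
```

Because overlapping spans simply add the same indices twice, overlap needs no special handling, and checking a candidate split point is a single set lookup instead of a scan over all slices.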

Jackenmen · 2021-06-09T14:38:52Z

I went with the set idea as that indeed seems like a more performant way and isn't really hard to do either.

Sadly, this makes the diff more complicated. If that's a problem, I can split this into a separate PR that can be reviewed after this one is merged.

src/black/trans.py

Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>

felix-hilden · 2021-06-09T17:13:22Z

If we wanted, I bet we could construct a single function that loops through the string once, cycling through `\N` escape and f-string modes and generating the index ranges. But if this works, then it could be enough for this PR!
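A rough sketch of what such a single-pass function might look like. It is deliberately simplified and rests on several assumptions: the name `unsplittable_indices` and the `is_fstring` flag are hypothetical, the input is the body of a non-raw string, and braces inside nested string literals or format specs within an f-string field are not handled.

```python
from typing import Set


def unsplittable_indices(text: str, is_fstring: bool = False) -> Set[int]:
    """Scan text once; return indices strictly inside \\N{...} escapes
    and, for f-strings, strictly inside {...} replacement fields."""
    illegal: Set[int] = set()
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch == "\\":
            if text.startswith("N{", i + 1):
                close = text.find("}", i + 3)
                if close != -1:
                    # Named escape spans text[i:close + 1].
                    illegal.update(range(i + 1, close + 1))
                    i = close + 1
                    continue
            i += 2  # any other escape: skip the escaped character too
            continue
        if is_fstring and ch == "{":
            if text.startswith("{", i + 1):
                i += 2  # "{{" is a literal brace, not a field
                continue
            depth, j = 1, i + 1
            while j < n and depth:
                if text[j] == "{":
                    depth += 1
                elif text[j] == "}":
                    depth -= 1
                j += 1
            if depth == 0:
                # Replacement field spans text[i:j].
                illegal.update(range(i + 1, j))
            i = j
            continue
        i += 1
    return illegal


print(sorted(unsplittable_indices(r"\N{DASH} {x + 1}", is_fstring=True)))
```

Handling backslashes in the main loop means an escaped backslash followed by `N{` is skipped naturally, without a separate parity check.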

ichard26 · 2021-06-09T19:35:49Z

Thank you so much for your contribution! This project is only possible thanks to contributions like these 🖤. You're awesome, @jack1142. Many thanks for pretty much beta testing the experimental string processing extremely early. I'm sure you're the reason why this feature will be a lot less buggy, which is great since we want to push a release with it enabled by default soon.

Jackenmen added 4 commits June 9, 2021 11:24

Add a bunch of tests

60e10d8

Add regression test

2d7bc8a

Add nameescape slice detection

b90fbd4

Raise here instead

f7a3761

felix-hilden reviewed Jun 9, 2021

Jackenmen and others added 5 commits June 9, 2021 12:21

Simplify breaks_nameescape_expression()

9aae115

Co-authored-by: Felix Hildén <felix.hilden@gmail.com>

Add tests for escapes

2d3fb5f

Fix escapes

f0ac851

Use more descriptive variable names

91b2122

Add changelog entry

0955f32

felix-hilden reviewed Jun 9, 2021

Simplify the logic of backslash tracing

a3396e3

felix-hilden approved these changes Jun 9, 2021

Generalize checks for unsplittable expressions

e420dbe

JelleZijlstra reviewed Jun 9, 2021

Update src/black/trans.py

c1276fd

Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>

JelleZijlstra approved these changes Jun 9, 2021

JelleZijlstra merged commit 62402a3 into psf:main Jun 9, 2021

Jackenmen deleted the fix_splitting_for_escape_sequences branch June 9, 2021 20:42
