Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable --experimental-string-processing to be default #2188

Open
cooperlees opened this issue May 3, 2021 · 33 comments
Open

Enable --experimental-string-processing to be default #2188

cooperlees opened this issue May 3, 2021 · 33 comments
Assignees
Labels
F: strings Related to our handling of strings S: accepted The changes in this design / enhancement issue have been accepted and can be implemented T: enhancement New feature or request T: style What do we want Blackened code to look like?

Comments

@cooperlees
Copy link
Collaborator

Lets move CI + then the default of black to use the --experimental-string-processing behavior.

Today, if enabled on all black-primer projects, sqlalchemy fails AST check for parenthesis: https://pastebin.com/iaDH1Yg8

Lets enable what we can CI wise and get this stable + default.

@cooperlees cooperlees added T: bug Something isn't working T: enhancement New feature or request T: style What do we want Blackened code to look like? labels May 3, 2021
@ichard26 ichard26 added F: strings Related to our handling of strings S: accepted The changes in this design / enhancement issue have been accepted and can be implemented and removed T: bug Something isn't working labels May 4, 2021
@cooperlees cooperlees self-assigned this May 5, 2021
@cooperlees
Copy link
Collaborator Author

Ok, so we're running this option on most primer projects now. Left to fix:

Please let us know if your repo has any issues with --experimental-string-processing to help us squash bugs before making it the default.

@TylerYep
Copy link

TylerYep commented May 8, 2021

What is this feature supposed to do exactly? I tried it and it transformed the following string, which seems unideal to me, since it breaks the max line length of 88 by a large margin.

I'm using black version 21.5b0.

# input code:
assert str(suffix_arr) == (
    "['$', 'angaroo$', 'angrykangaroo$', 'aroo$', 'garoo$', "
    "'grykangaroo$', 'kangaroo$', 'ngaroo$', 'ngrykangaroo$', "
    "'o$', 'oo$', 'roo$', 'rykangaroo$', 'ykangaroo$']"
)
# black's output code:
assert (
    str(suffix_arr)
    == "['$', 'angaroo$', 'angrykangaroo$', 'aroo$', 'garoo$', 'grykangaroo$', 'kangaroo$', 'ngaroo$', 'ngrykangaroo$', 'o$', 'oo$', 'roo$', 'rykangaroo$', 'ykangaroo$']"
)

@JelleZijlstra
Copy link
Collaborator

Wow, that's pretty unfortunate. It's meant to split strings that are too long, not to make new oversized strings. We should track this as a bug and fix it before turning the feature on by default.

@JelleZijlstra
Copy link
Collaborator

The SQLAlchemy bug alluded to above is #2271.

@bbugyi200
Copy link
Contributor

bbugyi200 commented May 30, 2021

@TylerYep I opened a new issue for this bug (#2284). I'm working on a fix now.

EDIT: This should be fixed with the following PR: #2286

@tsx
Copy link

tsx commented Jan 10, 2022

Hi 👋

I've come across this issue while considering adding a linting rule to my codebase to disallow implicit string concatenation for reasons listed in PEP3126, namely that code can look confusing when string concatenation is next to a comma-delimited list (of arguments or list items). It appears that the new string processing is making that confusion worse.

For example, here's before, 2 arguments clearly visible:

                    flash(
                        'None of the email addresses or domains you entered are valid',
                        'error',
                    )

and here's after, looks too close to a three-argument call if you're not paying very close attention to a missing comma.

                    flash(
                        'None of the email addresses or domains you entered'
                        ' are valid',
                        'error',
                    )

I'd like to propose that in cases like these, black would wrap the split string in parenthesis. Here's a desired outcome that would be a lot clearer:

                    flash(
                        (
                            'None of the email addresses or domains you entered'
                            ' are valid'
                        ),
                        'error',
                    )

On top of that, I encountered what appears to be a bug. Here's a simple reproduction case:

black, 21.12b0 (compiled: no)

x = (
    "xxx xxxxxxx xxx "
    f'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx{a["key"]}&xxxxxxxxxxxxxxxxx'
)

error: cannot format example.py: INTERNAL ERROR: Black produced invalid code on pass 1: f-string expression part cannot include a backslash (, line 4). Please report a bug on https://github.com/psf/black/issues. This invalid output might be helpful: /tmp/blk_iw9ps776.log

Contents of error log:

  File "/home/tsx/projects/close/venv/lib/python3.9/site-packages/black/__init__.py", line 1300, in assert_equivalent
    dst_ast = parse_ast(dst)
  File "/home/tsx/projects/close/venv/lib/python3.9/site-packages/black/parsing.py", line 187, in parse_ast
    raise SyntaxError(first_error)
x = (
    "xxx xxxxxxx xxx "
    f"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx{a[\"key\"]}&xxxxxxxxxxxxxxxxx"
)

Note that I'm running with skip-string-normalization = true in config file if it matters.

@haridsv
Copy link

haridsv commented Jan 14, 2022

How about indenting continued line, like this:

                    flash(
                        'None of the email addresses or domains you entered'
                            ' are valid',
                        'error',
                    )

@reagle
Copy link

reagle commented Jul 19, 2022

Is this working? It moves the closing triple quote to a new line, but doesn't break the string which is what I thought this would do.

 ❯ black --version
 black, 22.6.0 (compiled: yes)
 ❯ cat ~/.config/black
 [tool.black]
 line-length = 88
 ❯ black --preview --diff reddit-watch.py
 --- reddit-watch.py	2022-07-19 17:46:14.470574 +0000
 +++ reddit-watch.py	2022-07-19 17:46:46.179463 +0000
 @@ -219,11 +219,12 @@
      updated_df.to_csv(updated_fn, index=True, encoding="utf-8-sig", na_rep="NA")
      return updated_fn


  def rotate_archive_fns(updated_fn: str) -> None:
 -    """Given an updated filename, archive it to the zip file and rename it to be the latest. this is the time for all good men"""
 +    """Given an updated filename, archive it to the zip file and rename it to be the latest. this is the time for all good men
 +    """
      print(f"Rotating and archiving {updated_fn=}")
      if not os.path.exists(updated_fn):
          raise RuntimeError(f"{os.path.exists(updated_fn)}")
      # print(f"{updated_fn=}")
      head, tail = os.path.split(updated_fn)
 would reformat reddit-watch.py

 All done! ✨ 🍰 ✨
 1 file would be reformatted.

@yilei
Copy link
Contributor

yilei commented Jul 20, 2022

@reagle this is the current expected behavior. there is some limited handling of docstrings (see #144 and referenced issues), but it won't split. for other non-docstring multi-line strings, it won't split at all since it breaks the code.

@reagle
Copy link

reagle commented Jul 20, 2022

@yilei, thanks for the response. The docs say "Black will split long string literals and merge short ones" and I think docstrings are string literals. Perhaps it could be updated to say "it'll split assigned string literals (which excludes docstrings)" or whatever is true.

@BobDotCom
Copy link

When I enabled this flag, I noticed that in docstring summaries there is an inconsistent newline before the closing quotes. When the summary is longer than the max line length it wraps the closing quotes onto a new line. Because closing quotes aren't normally placed on a newline with this case being the only exception, this makes enforcing docstring formatters such as pydocstringformatter much more difficult. Should this functionality be changed, or even a config flag added to modify it? Currently this makes the new functionality unusable for me, but adding the ability to force enable/disable the newline would resolve this.

Before experimental string processing:

# Before
def foo():
    """This is an example test docstring that is too long to fit within the line-length limit."""
# After
def foo():
    """This is an example test docstring that is too long to fit within the line-length limit."""

With experimental string processing:

# Before
def foo():
    """This is an example test docstring that is too long to fit within the line-length limit."""
def bar():
    """This is an example docstring that fits within the limit."""
# After
def foo():
    """This is an example test docstring that is too long to fit within the line-length limit.
    """
def bar():
    """This is an example docstring that fits within the limit."""

@Jackenmen
Copy link
Contributor

Aside from bugs, to me, the biggest issue with string processing is probably the lack of any special treatment of \n when splitting lines (#1467).

Other than that, it seems that people sometimes also don't like that Black doesn't respect user-made splits if it can merge them into a single line though that's currently the case with the non-preview style as well. I do agree that at least some of the shown examples look better when they're split across multiple lines despite fitting in one but I don't think it's universally the case and I would say that I generally quite like that Black tries to merge strings into one line when they fit. Personally, I want Black to notice these kinds of opportunities for joining lines when I don't, even if it means that I'll have to fight with Black on occasion to keep lines separate.

@ikding
Copy link

ikding commented Jan 13, 2023

Hello,

I am trying the --preview flag on my code base and noticed that the preview flag seems to concatenate user-defined long-string splits in function arg defaults, even if it makes the line too long.

Example:

def open_file_from_long_file_path(
    file_path: str = (
        "some/really/long/file/path/because/it/has/filehash/as/part/of/path/"
        "460e96a3b663f41a8412527424278bc60eb208ec5be1f5943e5dff2fe428a2da/"
        "file.txt"
    ),
) -> list[str]:
    with open(file_path) as f:
        return [line.rstrip() for line in f]

Running black without --preview flag, it doesn't make any changes to the file. But running it with --preview and I get this:

def open_file_from_long_file_path(
    file_path: str = "some/really/long/file/path/because/it/has/filehash/as/part/of/path/460e96a3b663f41a8412527424278bc60eb208ec5be1f5943e5dff2fe428a2da/file.txt",
) -> list[str]:
    with open(file_path) as f:
        return [line.rstrip() for line in f]

FYI I am running black 22.10.0.

@felix-hilden
Copy link
Collaborator

@ikding thanks for letting us know! I've submitted a separate issue for that above 👌

@pauloneves
Copy link

It won't split a log line if it enclosed in triple quotes:

# this will be splited:
x = "aaaaaaaaaa bbbbbbbb cccccccccccc dddddddddddddd eeeeeeeeeeee fffffffffffffffff ggggggggggggggg hhhhhhhhhh"

# this will NOT be splited:
x = """aaaaaaaaaa bbbbbbbb cccccccccccc dddddddddddddd eeeeeeeeeeee fffffffffffffffff ggggggggggggggg hhhhhhhhhh"""

Is this the desired behavior?

I tested with version 22.12.0

@BobDotCom
Copy link

When I enabled this flag, I noticed that in docstring summaries there is an inconsistent newline before the closing quotes. When the summary is longer than the max line length it wraps the closing quotes onto a new line. Because closing quotes aren't normally placed on a newline with this case being the only exception, this makes enforcing docstring formatters such as pydocstringformatter much more difficult. Should this functionality be changed, or even a config flag added to modify it? Currently this makes the new functionality unusable for me, but adding the ability to force enable/disable the newline would resolve this.

Seems to be fixed in #3430

@XuehaiPan
Copy link

Follow the comment at #2188 (comment), the new behavior also removes the f leading character for implicitly concatenated strings.

Expected (expect black --preview leave this unchanged):

actual = "aaa"
expected = "bbb"
message = (
    f"Very very very very very very very very very very very long message. Expected\n"
    f"    {expected!r}\n"
    f"but got\n"
    f"    {actual!r}\n"
    f"    ^^^"
)

I intentionally keep the f leading character to align the multiline string. Although some of the lines do not have an expression to substitute.

black --preview format this to:

actual = "aaa"
expected = "bbb"
message = (
    "Very very very very very very very very very very very long message. Expected\n"
    f"    {expected!r}\n"
    "but got\n"
    f"    {actual!r}\n"
    "    ^^^"
)

black removes the f leading character if there is no expression to substitute. This makes the code harder to maintain.

@felixxm
Copy link

felixxm commented Mar 29, 2023

Hi, I tried this changes on the Django repository and it's pretty destructive 🙃 as it adds parentheses to all multi-line strings, e.g.

diff --git a/tests/template_tests/filter_tests/test_urlize.py b/tests/template_tests/filter_tests/test_urlize.py
index abc227ba6a..d5dbe5925f 100644
--- a/tests/template_tests/filter_tests/test_urlize.py
+++ b/tests/template_tests/filter_tests/test_urlize.py
@@ -24,10 +24,12 @@ class UrlizeTests(SimpleTestCase):
         )
         self.assertEqual(
             output,
-            '<a href="http://example.com/?x=&amp;y=" rel="nofollow">'
-            "http://example.com/?x=&y=</a> "
-            '<a href="http://example.com?x=&amp;y=%3C2%3E" rel="nofollow">'
-            "http://example.com?x=&amp;y=&lt;2&gt;</a>",
+            (
+                '<a href="http://example.com/?x=&amp;y=" rel="nofollow">'
+                "http://example.com/?x=&y=</a> "
+                '<a href="http://example.com?x=&amp;y=%3C2%3E" rel="nofollow">'
+                "http://example.com?x=&amp;y=&lt;2&gt;</a>"
+            ),
         )
 
     @setup({"urlize02": "{{ a|urlize }} {{ b|urlize }}"})
@@ -41,10 +43,12 @@ class UrlizeTests(SimpleTestCase):
         )
         self.assertEqual(
             output,
-            '<a href="http://example.com/?x=&amp;y=" rel="nofollow">'
-            "http://example.com/?x=&amp;y=</a> "
-            '<a href="http://example.com?x=&amp;y=" rel="nofollow">'
-            "http://example.com?x=&amp;y=</a>",
+            (
+                '<a href="http://example.com/?x=&amp;y=" rel="nofollow">'
+                "http://example.com/?x=&amp;y=</a> "
+                '<a href="http://example.com?x=&amp;y=" rel="nofollow">'
+                "http://example.com?x=&amp;y=</a>"
+            ),
         )

@yilei
Copy link
Contributor

yilei commented Mar 29, 2023

@felixxm For background, wrapping multi-line strings in parenthesis was first introduced in #3162 for list/set/tuple literals, then extended to all including function calls in #3292

The list/set/tuple literal cases are important for increased readability / bug preventions.

The reason this was also extended to function calls is that this makes the behavior more consistent and easy to explain. For function calls, it's a much weaker point for bug preventions since accidentally forgetting/adding a comma in the middle of implicitly concatenated strings would often lead to a runtime error (or type checking error) since it usually makes it mismatch the function signature.

One current workaround for not using parenthesis is to use explicit string concatenations, as demonstrated in this playground. However, this does require a lot of adjustments to the existing code to adopt the upcoming new stable style.

I do recognize this is quite a disruptive change especially because they are quite common for logging.XXX / warnings.warn / test code. So I'm personally open to reverting #3292 but still keeping #3162 for literals.

sergiitk added a commit to grpc/grpc that referenced this issue Jun 9, 2023
- Switched  from yapf to black
- Reconfigure isort for black
- Resolve black/pylint idiosyncrasies 

Note: I used `--experimental-string-processing` because black was
producing "implicit string concatenation", similar to what described
here: psf/black#1837. While currently this
feature is experimental, it will be enabled by default:
psf/black#2188. After running black with the
new string processing so that the generated code merges these `"hello" "
world"` strings concatenations, then I removed
`--experimental-string-processing` for stability, and regenerated the
code again.

To the reviewer: don't even try to open "Files Changed" tab 😄 It's
better to review commit-by-commit, and ignore `run black and isort`.
mario-vimal pushed a commit to mario-vimal/grpc that referenced this issue Jun 15, 2023
- Switched  from yapf to black
- Reconfigure isort for black
- Resolve black/pylint idiosyncrasies 

Note: I used `--experimental-string-processing` because black was
producing "implicit string concatenation", similar to what described
here: psf/black#1837. While currently this
feature is experimental, it will be enabled by default:
psf/black#2188. After running black with the
new string processing so that the generated code merges these `"hello" "
world"` strings concatenations, then I removed
`--experimental-string-processing` for stability, and regenerated the
code again.

To the reviewer: don't even try to open "Files Changed" tab 😄 It's
better to review commit-by-commit, and ignore `run black and isort`.
jacobtylerwalls pushed a commit to pylint-dev/pylint that referenced this issue Jun 18, 2023
sergiitk added a commit to grpc/psm-interop that referenced this issue Nov 8, 2023
- Switched  from yapf to black
- Reconfigure isort for black
- Resolve black/pylint idiosyncrasies 

Note: I used `--experimental-string-processing` because black was
producing "implicit string concatenation", similar to what described
here: psf/black#1837. While currently this
feature is experimental, it will be enabled by default:
psf/black#2188. After running black with the
new string processing so that the generated code merges these `"hello" "
world"` strings concatenations, then I removed
`--experimental-string-processing` for stability, and regenerated the
code again.

To the reviewer: don't even try to open "Files Changed" tab 😄 It's
better to review commit-by-commit, and ignore `run black and isort`.
@crhf
Copy link

crhf commented Mar 21, 2024

How do I use this feature? I have this file long.py with a single line:

s = "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789"

When doing python3 -m black -l 80 --unstable long.py, the file is left unchanged. I would expect the string to be split.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
F: strings Related to our handling of strings S: accepted The changes in this design / enhancement issue have been accepted and can be implemented T: enhancement New feature or request T: style What do we want Blackened code to look like?
Projects
None yet
Development

No branches or pull requests