Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature_request(formatting): wrap long strings #182

Closed
Kristinita opened this issue Apr 30, 2018 · 28 comments · Fixed by #1132
Closed

feature_request(formatting): wrap long strings #182

Kristinita opened this issue Apr 30, 2018 · 28 comments · Fixed by #1132
Labels
T: style What do we want Blackened code to look like?

Comments

@Kristinita
Copy link

Kristinita commented Apr 30, 2018

1. Summary

It would be nice, if Black will wrap long strings.

2. Example

I have SashaLongStrings.py file:

print("She's the one for me, She's my ecstasy, She's the one I need. She's one in a million, She's once in a lifetime, She made me discover one of the stars above us.")

Line contains 168 characters.

I run command:

black --line-length 120 SashaLongStrings.py

Result:

print("She's the one for me, She's my ecstasy, She's the one I need. She's one in a million, She's once in a lifetime, "
      "She made me discover one of the stars above us.")

Lines contains ≤ 120 symbols.

3. Details

See more details in my question in Software Recommendations.

Thanks.

@ambv
Copy link
Collaborator

ambv commented Apr 30, 2018

This is something we will probably do in the future but it's a trickier request than it looks on the surface.

We would need to handle breaking strings safely in face of escapes, formatting variables (old style %s and {0}, new style f-strings), method calls on the resulting string, and so on.

@davidism
Copy link

I know Black doesn't split strings yet, but can we preemptively make a style decision about leading / trailing whitespace when splitting strings?

When wrapping on words, I like to keep the separating whitespace at the beginning of the next line. Visually I think it's easier to recognize that the string is wrapped, and I think it feels like the "binary op at beginning of line" rule.

raise TypeError(
    "Pretend that this"
    " is a long line."
)

Ultimately I don't care what way we decide, I just want to be consistent.

@mlucool
Copy link

mlucool commented Apr 17, 2019

@ambv is this feature still planned?

@JelleZijlstra JelleZijlstra added the T: style What do we want Blackened code to look like? label May 5, 2019
@ambv ambv added this to To do in AST-unsafe changes May 9, 2019
@NicolasPA
Copy link

A detail to take into account, Black doesn't behave well with wrapped long strings in dictionaries.

  1. Unwrapped long string, running black on it won't change anything as said earlier:
long_string = {
    "a": "All kinds of interesting questions which the science knowledge only adds to the excitement, the mystery and the awe of a flower.",
    "b": "azazd",
}
  1. Wrapped (using Pycharm suggested intention action):
long_string = {
    "a": "All kinds of interesting questions which the science knowledge only adds to the excitement, the mystery and "
         "the awe of a flower.", 
    "b": "azazd",
}
  1. Running black after that wrapping:
long_string = {
    "a": "All kinds of interesting questions which the science knowledge only adds to the excitement, the mystery and "
    "the awe of a flower.",
    "b": "azazd",
}

The problem here is that the wrapped part of the line looks like a new dictionary item at first sight. It would be nice to keep the formatting from point 2.

@KeirSimmons
Copy link

Is this still planned?

stevejalim pushed a commit to mdn/developer-portal that referenced this issue Sep 17, 2019
This commit contains some (manual) changes to over-long lines caught by flake8 that black _didn't_ auto-fix, because they were long strings (see psf/black#182)
stevejalim pushed a commit to mdn/developer-portal that referenced this issue Sep 18, 2019
This changeset contains a number of squashed commits towards standardising the formatting and linting of the devportal project. (It wasn't bad before, just good to get this in during a quietish patch of development).

**Please view the README and start using `therapist` to avoid other developers' commits containing formatting changes to parts of files they didn't explicitly work on**

Closes #135 

## Summarised changes: 
* full-codebase passes made with `black`, `isort`, `flake8`, `eslint` and `prettier` to bring everything into line
* add [`therapist`](https://github.com/rehandalal/therapist) as a framework for running the above tooling (either via a pre-commit hook or directly - both paths are available)

+7,255 -3,637 looks huge, but the vast majority of the diff is standardising formatting


## Squashed commits:

* 135: Full pass with `black`as first step towards autoformatting

252 files reformatted, 42 files left unchanged.

Done with `black --exclude node_modules .` to avoid unnecessarily reformatting third-party package helper Python in JS deps

Includes migration files for consistency, even though it adds noise (209/252 files are migrations...)

Tests still passing.

* 135: add isort with black-compatible config

Created a separate .isort.cfg to make it explicit

Tests passing fine after changes

* 135: tighten up flake8 to also enforce 88-char lines, matching black

This commit contains some (manual) changes to over-long lines caught by flake8 that black _didn't_ auto-fix, because they were long strings (see psf/black#182)

* 135: Run prettier across all SCSS, ahead of hooking it into therapist's pre-commit tasks

* 135: Tune prettier and eslint's ignore options

Specifically:
- prettier: ignore HTML, SVG, JSON and Python
- eslint: ignore HTML, SVG and Python

While this wouldn't be needed for editor-based formatting or linting, nor linting explicitly staged/passed files, this change is so that when we _do_ run them across the entire `developerportal` module (eg, in CI), they won't gripe about things they are not intented to be looking at anyway.

* 135: Add Therapist to the project, running the linters and formatters recently added/tweaked

At this point, the assumption is you have the following installed on your host:
* black
* flake8
* eslint and eslint-config-prettier
* isort
* prettier
(I'll deal with making that easier in a different commit.)

Install `therapist` (https://github.com/rehandalal/therapist):

	$ pip install therapist

Now install the pre-commit hook that will trigger Therapist automatically:

	$ therapist install
	Installing pre-commit hook...	DONE

Now, when you commit a change, the staged changes will be checked by one or more of black, isort, eslint and prettier, as appropriate. See `.therapist.yml` for the configuration.

Alternatively, if you wanted to run it across the whole codebase (as we might well do in CI), run:

	$ therapist run developerportal/
	black ............................................................... [SUCCESS]
	ESLint .............................................................. [SKIPPED]
	flake8 .............................................................. [SUCCESS]
	isort ............................................................... [SUCCESS]
	Prettier ............................................................ [SUCCESS]

-------------------------------------------------------------------------------
Completed in: 3.67s

And if you want therapist to auto-fix things using black, prettier and/or eslint, run:

	$ therapist run developerportal --fix

* 135: Update Readme to mention how to set up and use `therapist` as a linter/formatter runner
@bbugyi200
Copy link
Contributor

bbugyi200 commented Nov 9, 2019

Is this still planned?

I am working on this feature now (see #1132); however, it is still far from ready. It would help if any who are interested could take a look at the following test datafile's contents and let me know if they strongly disagree with any of the choices made (all lines after the # output comment specify how the code should look once blackened):

x = "This is a really long string that can't possibly be expected to fit all together on one line. In fact it may even take up three or more lines... like four or five... but probably just three."

print("This is a really long string inside of a print statement with extra arguments attached at the end of it.", x, y, z)

print("This is a really long string inside of a print statement with no extra arguments attached at the end of it.")

D1 = {"The First": "This is a really long string that can't possibly be expected to fit all together on one line. Also it is inside a dictionary, so formatting is more difficult.", "The Second": "This is another really really (not really) long string that also can't be expected to fit on one line and is, like the other string, inside a dictionary."}

D2 = {1.0: "This is a really long string that can't possibly be expected to fit all together on one line. Also it is inside a dictionary, so formatting is more difficult.", 2.0: "This is another really really (not really) long string that also can't be expected to fit on one line and is, like the other string, inside a dictionary."}

D3 = {x: "This is a really long string that can't possibly be expected to fit all together on one line. Also it is inside a dictionary, so formatting is more difficult.", y: "This is another really really (not really) long string that also can't be expected to fit on one line and is, like the other string, inside a dictionary."}

func_with_keywords(my_arg, my_kwarg="Long keyword strings also need to be wrapped, but they will probably need to be handled a little bit differently.")

bad_split1 = (
    "But what should happen when code has already been formatted but in the wrong way?"
    " Like with a space at the beginning instead of the end."
    " Or what about when it is split too soon?"
)

bad_split2 = (
    "But what should happen when code has already "
    "been formatted but in the wrong way? Like "
    "with a space at the beginning instead of the "
    "end. Or what about when it is split too "
    "soon?"
)

bad_split3 = (
    "What if we have inline comments on "  # First Comment
    "each line of a bad split? In that "  # Second Comment
    "case, we should just leave it alone."  # Third Comment
)

bad_split_func1(
    "But what should happen when code has already "
    "been formatted but in the wrong way? Like "
    "with a space at the beginning instead of the "
    "end. Or what about when it is split too "
    "soon?",
    xxx, yyy, zzz
)

bad_split_func2(
    xxx, yyy, zzz,
    long_string_kwarg="But what should happen when code has already "
                      "been formatted but in the wrong way? Like "
                      "with a space at the beginning instead of the "
                      "end. Or what about when it is split too "
                      "soon?",
)

raw_string = r"This is a long raw string. When re-formatting this string, black needs to make sure it prepends the 'r' onto the new string."

fmt_string1 = "We also need to be sure to preserve any and all {} which may or may not be attached to the string in question.".format("method calls")

fmt_string2 = "But what about when the string is {} but {}".format("short", "the method call is really really really really really really really really long?")

old_fmt_string1 = "While we are on the topic of %s, we should also note that old-style formatting must also be preserved, since some %s still uses it." % ("formatting", "code")

old_fmt_string2 = "This is a %s %s %s %s" % ("really really really really really", "old", "way to format strings!", "Use f-strings instead!")

old_fmt_string3 = "Whereas only the strings after the percent sign were long in the last example, this example uses a long initial string as well. This is another %s %s %s %s" % ("really really really really really", "old", "way to format strings!", "Use f-strings instead!")

fstring = f"f-strings definitely make things more {difficult} than they need to be for black. But boy they sure are handy. The problem is that some lines will need to have the 'f' whereas others do not. This {line}, for example, needs one."

comment_string = "Long lines with inline comments should have their comments appended to the reformatted string's enclosing right parentheses."  # This comment gets thrown to the bottom.

arg_comment_string = print("Long lines with inline comments which are apart of (and not the only member of) an argument list should have their comments appended to the reformatted string's enclosing left parentheses.",  # This comment gets thrown to the top.
    "Arg #2", "Arg #3", "Arg #4", "Arg #5")

pragma_comment_string1 = "Lines which end with an inline pragma comment of the form `# <pragma>: <...>` should be left alone."  # noqa: E501

pragma_comment_string2 = "Lines which end with an inline pragma comment of the form `# <pragma>: <...>` should be left alone."  # noqa

"""This is a really really really long triple quote string and it should not be touched."""

triple_quote_string = """This is a really really really long triple quote string assignment and it should not be touched."""

# output

x = (
    "This is a really long string that can't possibly be expected to fit all together "
    "on one line. In fact it may even take up three or more lines... like four or "
    "five... but probably just three."
)

print(
    (
        "This is a really long string inside of a print statement with extra arguments "
        "attached at the end of it."
    ),
    x,
    y,
    z,
)

print(
    "This is a really long string inside of a print statement with no extra arguments "
    "attached at the end of it."
)

D1 = {
    "The First": (
        "This is a really long string that can't possibly be expected to fit all "
        "together on one line. Also it is inside a dictionary, so formatting is more "
        "difficult."
    ),
    "The Second": (
        "This is another really really (not really) long string that also can't be "
        "expected to fit on one line and is, like the other string, inside a "
        "dictionary."
    ),
}

D2 = {
    1.0: (
        "This is a really long string that can't possibly be expected to fit all "
        "together on one line. Also it is inside a dictionary, so formatting is more "
        "difficult."
    ),
    2.0: (
        "This is another really really (not really) long string that also can't be "
        "expected to fit on one line and is, like the other string, inside a "
        "dictionary."
    ),
}

D3 = {
    x: (
        "This is a really long string that can't possibly be expected to fit all "
        "together on one line. Also it is inside a dictionary, so formatting is more "
        "difficult."
    ),
    y: (
        "This is another really really (not really) long string that also can't be "
        "expected to fit on one line and is, like the other string, inside a "
        "dictionary."
    ),
}

func_with_keywords(
    my_arg,
    my_kwarg=(
        "Long keyword strings also need to be wrapped, but they will probably need to "
        "be handled a little bit differently."
    ),
)

bad_split1 = (
    "But what should happen when code has already been formatted but in the wrong way? "
    "Like with a space at the beginning instead of the end. Or what about when it is "
    "split too soon?"
)

bad_split2 = (
    "But what should happen when code has already been formatted but in the wrong way? "
    "Like with a space at the beginning instead of the end. Or what about when it is "
    "split too soon?"
)

bad_split3 = (
    "What if we have inline comments on "  # First Comment
    "each line of a bad split? In that "  # Second Comment
    "case, we should just leave it alone."  # Third Comment
)

bad_split_func1(
    (
        "But what should happen when code has already been formatted but in the wrong "
        "way? Like with a space at the beginning instead of the end. Or what about "
        "when it is split too soon?"
    ),
    xxx,
    yyy,
    zzz,
)

bad_split_func2(
    xxx,
    yyy,
    zzz,
    long_string_kwarg=(
        "But what should happen when code has already been formatted but in the wrong "
        "way? Like with a space at the beginning instead of the end. Or what about "
        "when it is split too soon?"
    ),
)

raw_string = (
    r"This is a long raw string. When re-formatting this string, black needs to make "
    r"sure it prepends the 'r' onto the new string."
)

fmt_string1 = (
    "We also need to be sure to preserve any and all {} which may or may not be "
    "attached to the string in question.".format("method calls")
)

fmt_string2 = "But what about when the string is {} but {}".format(
    "short",
    "the method call is really really really really really really really really long?",
)

old_fmt_string1 = (
    "While we are on the topic of %s, we should also note that old-style formatting "
    "must also be preserved, since some %s still uses it." % ("formatting", "code")
)

old_fmt_string2 = "This is a %s %s %s %s" % (
    "really really really really really",
    "old",
    "way to format strings!",
    "Use f-strings instead!",
)

old_fmt_string3 = (
    "Whereas only the strings after the percent sign were long in the last example, "
    "this example uses a long initial string as well. This is another %s %s %s %s" % (
        "really really really really really",
        "old",
        "way to format strings!",
        "Use f-strings instead!",
    )
)

fstring = (
    f"f-strings definitely make things more {difficult} than they need to be for "
    "black. But boy they sure are handy. The problem is that some lines will need to "
    f"have the 'f' whereas others do not. This {line}, for example, needs one."
)

comment_string = (
    "Long lines with inline comments should have their comments appended to the "
    "reformatted string's enclosing right parentheses."
)  # This comment gets thrown to the bottom.

arg_comment_string = print(
    (  # This comment gets thrown to the top.
        "Long lines with inline comments which are apart of (and not the only member "
        "of) an argument list should have their comments appended to the reformatted "
        "string's enclosing left parentheses."
    ),
    "Arg #2",
    "Arg #3",
    "Arg #4",
    "Arg #5",
)

pragma_comment_string1 = "Lines which end with an inline pragma comment of the form `# <pragma>: <...>` should be left alone."  # noqa: E501

pragma_comment_string2 = "Lines which end with an inline pragma comment of the form `# <pragma>: <...>` should be left alone."  # noqa

"""This is a really really really long triple quote string and it should not be touched."""

triple_quote_string = """This is a really really really long triple quote string assignment and it should not be touched."""

@takkaria
Copy link

takkaria commented Nov 9, 2019

Here's a disagreement for you 😁 When wrapping strings across lines, I put spaces at the beginning of the line, not the end. It's easier to see if are you accidentally (or deliberately) smushing words together because the spaces are grouped together vertically.

@bbugyi200
Copy link
Contributor

bbugyi200 commented Nov 9, 2019

@takkaria I was on the fence about this. Don't you think that making sure a long string is always on its own level of indentation mitigates this issue enough? For example, using my current strategy, the line

print("This is a really long string inside of a print statement with extra arguments attached at the end of it.", "And then print this out too.")

will NOT be re-formatted to

print(
    "This is a really long string inside of a print statement with extra arguments "
    "attached at the end of it.",
    "And then print this out too.",
)

since I agree this would make it too difficult to distinguish between strings which represent distinct arguments as opposed to those which are segments of a wrapped string. The line would instead be re-formatted to

print(
    (
        "This is a really long string inside of a print statement with extra arguments "
        "attached at the end of it."
    ),
    "And then print this out too.",
)

@davidism
Copy link

davidism commented Nov 9, 2019

I already left a comment about spacing (preferring spaces at the beginning) here: #182 (comment) It got a net +17 votes, including from @ambv, and no responses for the other style. I'd prefer spaces at the beginning because it matches other "at the beginning rules" such as binary ops and logic ops.

@bbugyi200
Copy link
Contributor

bbugyi200 commented Nov 9, 2019

@davidism See my response to @takkaria above. Does this make a difference for you?

Also, I took into account your earlier comment when thinking of a strategy for this feature, but I don't think it is clear whether the upvotes were in response to using a space at the beginning (as opposed to the end) or this quote from the same comment (which is far more agreeable):

Ultimately I don't care what way we decide, I just want to be consistent.

@davidism
Copy link

davidism commented Nov 9, 2019

It seems to be about how wrapped lines are in parens to distinguish them, not where the spacing between wrapped words goes. That does make them more distinct, but it doesn't address the visibility of spacing, or the consistency with other wrapping rules.

When I wrote the comment, I didn't particularly care, but given the support it got, I pretty much assumed that was the direction things would go.

@bbugyi200
Copy link
Contributor

bbugyi200 commented Nov 10, 2019

@davidism IMO, lack of "distinctness" is the main motivation for placing the space at the beginning instead of the end. If using parentheses and an increased indentation depth solves this lack of "distinctness" problem (which I believe it does), then we have lost our primary motivator for using a string format which is clearly less readable.

I think it feels like the "binary op at beginning of line" rule.

Strings are not operations; spaces are not operators. The analogy does not hold.

When I wrote the comment, I didn't particularly care, but given the support it got, I pretty much assumed that was the direction things would go.

Again, I'm not completely convinced that all (or even most) of those upvotes were targeted towards the "spaces at the beginning" argument. Many were likely targeted towards the argument you made for consistency. Moreover, I'd be curious to know how much support the "spaces at the beginning" argument would have gotten had the alternative of using parentheses and increased indentation been pitched alongside it. I'm sure some would still support your argument, but certainly not as many. I myself supported your proposal at first.

Finally, if we are truly aiming to be consistent, we should turn our attention to how popular Python IDEs and tools for other languages which support auto-formatting handle this task. If you look at @NicolasPA's example of how Pycharm handles this refactoring, you will notice that the spaces are left at the end. What's more, all auto-formatters I have used for other languages that have the ability to wrap long strings---admittedly not many---have done so by leaving the space at the end of the line (for example, this is the strategy that rustfmt uses to wrap long strings in Rust code).

It is also worth noting that the source code of this project (i.e. black.py) currently leaves spaces at the end of the string when splitting long strings across multiple lines.

@takkaria
Copy link

takkaria commented Nov 10, 2019

@bbugyi200 I wasn't talking about ensuring that different arguments were clearly different strings. I was talking about the visibility of whitespace within a string.

I think it's better to have spaces at the beginning of strings, where they are more obvious - and therefore more obviously missing if you don't put them in - rather than at the end of the previous string. For me this is orthogonal to whether differentiating different arguments to the same function, which I agree your format works for.

The value I see in this is that it's easy to forget a non-visible character at the end of a line, but hard to forget it at the beginning. If you're writing strings manually this is an issue... I suppose it is less of an issue if you have 'black on save' set up in your editor and you never have to manually format strings again. But it probably depends on the behaviour of the following case.

Can I ask, currently, how would the following be reformatted? Would 'thin' and 'ks' be rejoined and then split onto the second line? Or would it be left alone?

bad_split = (
    "There's a star man waiting in the sky / He'd like to come and meet you but he thin"
    "ks he'd blow your mind"
)

If it gets reformatted then I guess it's easy-ish to tell if you accidentally missed off a space at the end of a line, because words will get mushed together at the beginning of the next line. But if no reformatting would take place in that case, I think it would be better to put the space at the beginning of the line instead.

I'm also interested in the related following case. What happens here?

no_spaces_src = "ChumbawambawereaBritishrockbandthatformedin1982andendedin2012Thebandconstantlyshiftedinmusicalstyledrawingongenressuchaspunkrockpopandfolk"

# does it come out like this?
no_spaces_reformatted = (
    "ChumbawambawereaBritishrockbandthatformedin1982andendedin2012Thebandconstantlyshif"
    "tedinmusicalstyledrawingongenressuchaspunkrockpopandfolk"
)

@bbugyi200
Copy link
Contributor

bbugyi200 commented Nov 10, 2019

@takkaria To answer your first question, my plan is to use a strategy such that the snippet

bad_split = (
    "There's a star man waiting in the sky / He'd like to come and meet you but he thin"
    "ks he'd blow your mind"
)

would be reformatted to

bad_split = (
    "There's a star man waiting in the sky / He'd like to come and meet you but he "
    "thinks he'd blow your mind"
)

I am not yet sure how I am going to actually implement this check for bad splits. This is something I want to try to look into today. I do, however, agree that this functionality is important (even mandatory) if we hope to leave spaces at the end of the line. This is working now.

With regards to your second case, my initial strategy was such that the snippet

data = (
    "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
).encode()

would be reformatted to

data = (
    "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    "xxxxxxxxxxxxxxxxxxxx"
).encode()

But I've recently decided against this strategy. In the final solution, I plan to fix this so that long strings which contain no spaces are not split. I am curious to know how everyone else feels this particular case (long strings with no spaces) should be handled. Split it? Or leave it alone?

@bbugyi200
Copy link
Contributor

bbugyi200 commented Nov 10, 2019

Okay. This is pretty much done, functionally at least. I still need to write some corner-case tests and cleanup/refactor the new code additions, so it will probably take me another two or three weeks before I feel ready to un-WIP the PR. But, for any who might be interested, it should be ready for alpha-testing at this point.

Also, I updated the example I posted a few days ago so it better reflects the functionality of this feature as it stands now. I will try to keep this up to date as I add more tests over the next few weeks.

@sapphire-janrain
Copy link

Just to make it super ultra clear so there's no ambiguity, +1 for spaces at the beginning of the line.

@sapphire-janrain
Copy link

Clearer still, putting argumentation in a separate comment so +1s on the above won't be mistaken.

It's way easier to read this way. I think the extra indentation is a good idea too especially for spaceless languages, but the space at the beginning still helps. It's easier to find the space, too. I'm used to it from markdown and mediawiki and such where there's no ending quote to visually indicate the space at the end of the line.

Also, thanks for working on this! The lack of string wrapping has always been a pain, really glad it's being worked on.

@davidism
Copy link

davidism commented Nov 15, 2019

The more I think about this, the less I'm sure it's something Black should be doing. The visual cadence of strings, whether they contain sentences or some other type of semantic markings like HTML or Jinja expressions, is important for easy reading. Wrapping the text arbitrarily on spaces can make it less readable.

For example, I was going through some tests in Jinja, and came across strings like:

class TestStreaming(object):
    def test_basic_streaming(self, env):
        t = env.from_string(
            "<ul>{% for item in seq %}<li>{{ loop.index }} - {{ item }}</li>"
            "{%- endfor %}</ul>"
        )

That's a single line, not a multiline string. If it was automatically rewrapped at 88 characters, it would look like:

        t = env.from_string(
            "<ul>{% for item in seq %}<li>{{ loop.index }} - {{ item }}</li>{%- endfor "
            "%}</ul>"
        )

The Jinja tag syntax is now split over two lines, making it weird to read, even though the string is still correct overall. I could imagine other cases, such as an HTML tag with a quoted attribute value being split internally, rather than between attributes or tags. Or you might get the last word in a sentence on the next line, when you might manually wrap in on a phrase instead.

In the time since this was opened, I've applied Black to two large code bases, Flask and Werkzeug, and some smaller ones. In all cases, the number of strings that exceeded the length limit was small. Flake8 pointed them out, and it was easy to rewrap them. Additionally, based on the suggestion to wrap doc strings at 72 characters, I've found strings and comments are more readable when wrapped earlier than code.

I suppose if the strings changed significantly and needed to be rewrapped, it would be easier if it happened automatically, but maybe that's the job of a separate tool or pre-commit hook.

@davidism
Copy link

It would not be ideal to have format comments scattered around the code, since I don't use them elsewhere.

I don't think you should discount using SQL strings directly, it's a valid pattern even if there might be better layouts.

I think the suggestion was to only wrap long strings, but to ignore multiline strings that are already shorter than the limit. I'm not sure if that won't run into problems too, but it's an interesting idea.

@bbugyi200
Copy link
Contributor

I think the suggestion was to only wrap long strings, but to ignore multiline strings that are already shorter than the limit. I'm not sure if that won't run into problems too, but it's an interesting idea.

@davidism I don't know that it is feasible to have one without the other here. I have been hesitant to recommend using a configuration option because I know that this project is not big on options, but we could offer a config option for a regex pattern which, when matched, indicates that a string should not be wrapped automatically. We could use a sensible default like SELECT|<.*></.*>|\n which would catch most SQL queries, HTML, and embedded python code.

@davidism
Copy link

davidism commented Nov 18, 2019

I can't imagine a regex pattern option ever being accepted.

We need a maintainer to make a decision. Should Black wrap strings and, if so, how? There's enough discussion to choose from at this point.

@NicolasPA
Copy link

If Black doesn't wrap strings, it should also not change strings that have been wrapped otherwise it's a pain. Currently it can change already wrapped strings, see my previous comment: #182 (comment)

@sapphire-janrain
Copy link

@bbugyi200 Manually split being indicated by an existing separation. It doesn't matter if that was done by Black before or not, it'd still be idempotent.

So the steps would be like:

  1. String (i.e. one quotation, regardless of splits etc) is too long: wrap that one somehow
  2. Two adjacent splits which, in total, are under line length limit: recombine

So in that way, if splits are adjusted manually, Black won't touch them unless those conditions are hit (which would generally be a major adjustment). I think that's fine because if line-breaks are visually important (as in for large SQL) you should probably use docstrings.

@peterjc
Copy link

peterjc commented Nov 29, 2019

I've just filed issue #1183 which might be regarded as a special case of this - long string literals using slash line continuation (which personally I dislike from a stylistic point of view). Having black fix these would also require black edit string literals.

@bbugyi200
Copy link
Contributor

bbugyi200 commented Dec 1, 2019

@davidism @sapphire-janrain I think I have an agreeable solution here: Wrap only strings that are longer than the specified line length or strings which will be longer than the specified line length after black reformats the file.

Also, IMO this change warrents using spaces at the beginning instead of the end, since some strings will have to be wrapped manually and I agree that spaces at the beginning are better in this case. I will make this change before un-WIPing #1132.

@peterjc Once merged, the PR #1132 should resolve #1183.

@ambv ambv added this to To Do in Stable release via automation Mar 3, 2020
@ambv ambv added the stable label Mar 3, 2020
@ambv
Copy link
Collaborator

ambv commented Mar 3, 2020

While we address this, we should also be cleaning up docstrings as discussed in #144.

@ambv ambv closed this as completed in #1132 May 8, 2020
Stable release automation moved this from To Do to Done May 8, 2020
ambv pushed a commit that referenced this issue May 8, 2020
This pull request's main intention is to wraps long strings (as requested by #182); however, it also provides better string handling in general and, in doing so, closes the following issues:

Closes #26
Closes #182
Closes #933
Closes #1183
Closes #1243
simonwiles added a commit to sul-cidr/sati that referenced this issue Jun 19, 2020
Black reformats these files before flake8 lints them, but:
* it doesn't (yet) wrap the long strings (see psf/black#182);
* it has issues with inserting trailing commas on nested collections, regularly resulting in causing pycodestyle E231 errors.

A case could be made for excluding db migrations from linting altogether, tbh.
simonwiles added a commit to sul-cidr/sati that referenced this issue Jun 19, 2020
Black reformats these files before flake8 lints them, but:
* it doesn't (yet) wrap the long strings (see psf/black#182);
* it has issues with inserting trailing commas on nested collections, regularly resulting in causing pycodestyle E231 errors.

A case could be made for excluding db migrations from linting altogether, tbh.
simonwiles added a commit to sul-cidr/sati that referenced this issue Sep 25, 2020
Black reformats these files before flake8 lints them, but:
* it doesn't (yet) wrap the long strings (see psf/black#182);
* it has issues with inserting trailing commas on nested collections, regularly resulting in causing pycodestyle E231 errors.

A case could be made for excluding db migrations from linting altogether, tbh.
simonwiles added a commit to sul-cidr/sati that referenced this issue Sep 25, 2020
Black reformats these files before flake8 lints them, but:
* it doesn't (yet) wrap the long strings (see psf/black#182);
* it has issues with inserting trailing commas on nested collections, regularly resulting in causing pycodestyle E231 errors.

A case could be made for excluding db migrations from linting altogether, tbh.
simonwiles added a commit to sul-cidr/sati that referenced this issue Sep 25, 2020
Black reformats these files before flake8 lints them, but:
* it doesn't (yet) wrap the long strings (see psf/black#182);
* it has issues with inserting trailing commas on nested collections, regularly resulting in causing pycodestyle E231 errors.

A case could be made for excluding db migrations from linting altogether, tbh.
simonwiles added a commit to sul-cidr/sati that referenced this issue Sep 25, 2020
Black reformats these files before flake8 lints them, but:
* it doesn't (yet) wrap the long strings (see psf/black#182);
* it has issues with inserting trailing commas on nested collections, regularly resulting in causing pycodestyle E231 errors.

A case could be made for excluding db migrations from linting altogether, tbh.
simonwiles added a commit to sul-cidr/sati that referenced this issue Sep 25, 2020
Black reformats these files before flake8 lints them, but:
* it doesn't (yet) wrap the long strings (see psf/black#182);
* it has issues with inserting trailing commas on nested collections, regularly resulting in causing pycodestyle E231 errors.

A case could be made for excluding db migrations from linting altogether, tbh.
akhilnarang added a commit to Rev-AMP/backend that referenced this issue Mar 20, 2021
These strings weren't detected by black - psf/black#182

Signed-off-by: Akhil Narang <me@akhilnarang.dev>
felker added a commit to felker/deephyper_pytorch_layers that referenced this issue Jul 19, 2021
Dont use rev: stable
suggestion from https://github.com/psf/black/blob/main/docs/integrations/source_version_control.md

psf/black#420

Also black wont fix comments that are too long
psf/black#1713
psf/black#182

Must manually fix these to stop flake8 from complaining
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T: style What do we want Blackened code to look like?
Projects
Development

Successfully merging a pull request may close this issue.