Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use optional parentheses more often #2156

Open
cdce8p opened this issue Apr 27, 2021 · 33 comments
Open

Use optional parentheses more often #2156

cdce8p opened this issue Apr 27, 2021 · 33 comments
Labels
F: parentheses Too many parentheses, not enough parentheses, and so on. F: symmetry Fixing this would require Black to understand symmetry. T: style What do we want Blackened code to look like?

Comments

@cdce8p
Copy link
Contributor

cdce8p commented Apr 27, 2021

Describe the style change

For short if statements (just one and/or) black tries to get rid of optional parentheses whenever possible. One common result of that behavior is that the line is instead spilt on a function call which often reduces the readability compared to optional parentheses. Even if the optional parentheses are inserted manually, black will reformat the code.

Examples in the current Black style with desired style

Some examples taken from the pylint code base.

if not is_node_in_type_annotation_context(node) and isinstance(
    node.parent, astroid.Subscript
):
    pass

# better
if (
    not is_node_in_type_annotation_context(node)
    and isinstance(node.parent, astroid.Subscript)
):
    pass
def _supports_mapping_protocol(value: astroid.node_classes.NodeNG) -> bool:
    return _supports_protocol_method(
        value, GETITEM_METHOD
    ) and _supports_protocol_method(value, KEYS_METHOD)

# better
def _supports_mapping_protocol(value: astroid.node_classes.NodeNG) -> bool:
    return (
        _supports_protocol_method(value, GETITEM_METHOD)
        and _supports_protocol_method(value, KEYS_METHOD)
    )
                    if getattr(
                        func, "name", None
                    ) == "iter" and utils.is_builtin_object(func):

# better
                    if (
                        getattr(func, "name", None) == "iter"
                        and utils.is_builtin_object(func)
                    ):

Additional context
For any statement with more at least two priority delimiters the above suggested syntax is already the default.

return (
    is_postponed_evaluation_enabled(node)
    and value.qname() in SUBSCRIPTABLE_CLASSES_PEP585
    and is_node_in_type_annotation_context(node)
)

One possible implementation could be the modify these lines here and use > 0 instead of > 1.

black/src/black/__init__.py

Lines 6751 to 6754 in 76e1602

max_priority = bt.max_delimiter_priority()
if bt.delimiter_count_with_priority(max_priority) > 1:
# With more than one delimiter of a kind the optional parentheses read better.
return False

@cdce8p cdce8p added the T: style What do we want Blackened code to look like? label Apr 27, 2021
@friendyou1

This comment has been minimized.

@ichard26

This comment has been minimized.

@ichard26 ichard26 added the F: parentheses Too many parentheses, not enough parentheses, and so on. label Apr 28, 2021
@cdce8p
Copy link
Contributor Author

cdce8p commented Apr 28, 2021

If someone is interested, I've opened #2163 which would address it.

@felix-hilden
Copy link
Collaborator

The linked PR had positive feedback from maintainers but was rejected because of the disruption it would cause. So should this issue be closed by extension, or is there still a discussion to be had about this suggestion?

Marc did ask if keeping user-inserted parentheses would be an option. In my opinion the style would indeed be an improvement. But is it inappropriate with respect to Black's philosophy or reasonable pragmatism?

@cdce8p
Copy link
Contributor Author

cdce8p commented May 16, 2021

I've created a followup PR #2237 that would keep user inserted optional parentheses

@felix-hilden
Copy link
Collaborator

Based on my recent ride around the issue tracker, it is evident that this is a highly-requested feature. I'll summarise Łukasz' remarks when closing the initial PR:

  • the change multiplies the number of lines sources would have after formatting
  • it doesn't match the way people write code (naturally splitting lines where you are rather than going back)

Łukasz ended with saying he's also disappointed at the fact that Black is not internally consistent in this regard.

If I may, I'll try to push this closer to a resolution once more.

Keeping user-inserted parentheses

I agree that this is a suboptimal solution, since it is akin to the magic trailing comma and comes with all the similar problems.

Matching the way people write code

I think this is not that important given Black's function. We ruthlessly splits lines, change quotations and add or remove other parentheses. Users "cede control over formatting" to Black, they shouldn't expect Black to improve their current desired (lack of) style.

Disrupting existing code

This is the biggie. If we can agree on a good set of rules to actually do this, it could be tested on the primer projects to see exactly how much disruption it would cause. But getting too mathematical is not the point either.

Is the proposed style better?

There seems to be no question that this is the better style in the examples that were given:

if not is_node_in_type_annotation_context(node) and isinstance(
    node.parent, astroid.Subscript
):
    pass

# VS

if (
    not is_node_in_type_annotation_context(node)
    and isinstance(node.parent, astroid.Subscript)
):
    pass

But it is not that obvious in more complex cases like in our own tests (example taken from the original PR test code modifications):

if prevp.parent and prevp.parent.type in {
    syms.typedargslist,
    syms.varargslist,
    syms.parameters,
    syms.arglist,
    syms.argument,
}:

# VS (what cdce8p wrote)

if (
    prevp.parent
    and prevp.parent.type
    in {
        syms.typedargslist,
        syms.varargslist,
        syms.parameters,
        syms.arglist,
        syms.argument,
    }
):

# VS (another possibility, which I kinda like)

if (
    prevp.parent
    and prevp.parent.type in {
        syms.typedargslist,
        syms.varargslist,
        syms.parameters,
        syms.arglist,
        syms.argument,
    }
):

But the problem is more general than if statements. I closed lots of similar issues above, but I feel like even #236 and #348 are closely related to this. So I think we should really come up with a careful set of rules, or better yet: examples of what parentheses should look like if we're going to continue pursuing this any further.

To that end, I'd like to ask from the maintainers and especially @ambv: has the ship sailed? Are you absolutely opposed to any changes of this sort? If yes, I don't think we should waste any more resources discussing it. But if not, I'd like to thoroughly investigate the possibilities and come up with something that could be worth doing.

@rmorshea
Copy link

rmorshea commented Nov 4, 2021

Matching the way people write code

This is less about how people write code and more about how people read it. The need for a tool like Black tells use that the way people write code is actually inconsistent and messy. Users choose to cede control over formatting to Black because the readability of the resulting code is, in 99% of cases, as good or better than it was before. It seems evident that this problem case makes up a disproportionate part of that 1% of outputs which have subpar readability.

Is the proposed style better?

In my opinion, the resulting code in the given counter examples is not worse than it was before. If it turned out that this was representative of the worst case scenario, I don't think asking whether the proposed rules make improvements in all cases is really what we should focus on. Rather, we should evaluate whether the improvements which we already know about outweigh the costs of the disruption these changes would cause.

@rmorshea
Copy link

rmorshea commented Nov 4, 2021

I think it's also worth noting that much of the discussion around this change happened before the new stability policy. The --future flag could be a great opportunity to introduce this change and collect feedback without any real disruption. If it turns out from the feedback that this does more harm than good, we can just revert it before the next annual release.

@LLyaudet
Copy link
Contributor

I just add my voice in favor of this issue with the following example :

                mnopqrstuvwxyza = (
                    set(stuvwxyz_abcdefghi[pqrstuvwxy].keys())
                    - set(stuvwxyz_abcdefghij[pqrstuvwxy].keys())
                )

black currently suggests this modification that is sub-optimal for readability :

-                mnopqrstuvwxyza = (
-                    set(stuvwxyz_abcdefghi[pqrstuvwxy].keys())
-                    - set(stuvwxyz_abcdefghij[pqrstuvwxy].keys())
+                mnopqrstuvwxyza = set(stuvwxyz_abcdefghi[pqrstuvwxy].keys()) - set(
+                    stuvwxyz_abcdefghij[pqrstuvwxy].keys()
                 )

I did not check if the PR handles this case also, but I hope it is the case :)

@cdce8p
Copy link
Contributor Author

cdce8p commented Jan 15, 2022

Though I post a small update. Unfortunately, I don't have time to work on it anytime soon. If someone else is interested in picking this up, please feel free to do so!

@xM8WVqaG
Copy link

xM8WVqaG commented Feb 1, 2022

Personally, I think your newer example format is cleaner.

a = (
  bbbbbb(1234567890)
  + cccccc(123456790)
)

means you only have complete pieces of logic on each line rather than breaking inconsistently part way through functions.

It's easier to read complete sentences when they're together, and it creates more legible diffs:

--- a/example.py
+++ b/example.py
@@ -1,4 +1,4 @@
 a = (
   bbbbbb(1234567890)
-  + cccccc(123456790)
+  - dddddd(123456790)
 )

rather than

--- a/example.py
+++ b/example.py
@@ -1,3 +1,3 @@
-a = bbbbbb(123456790) + cccccc(
+a = bbbbbb(123456790) - dddddd(
     123456790
 )

@ambv
Copy link
Collaborator

ambv commented Feb 1, 2022

@felix-hilden it's confusing to cram both issues here. Let's focus here on the single-operator parentheses.

@xM8WVqaG Black will prefer fewer vertical lines over more, if it can be helped. Unfortunately, changing this will affect a lot of code which now formats okay and will format worse. It's a relatively easy change for you to make to your Black installation:

  1. Install it with pip install --no-binary=:all: black.
  2. Run python3 -c 'import black.lines; print(black.lines.__file__)'.
  3. Open the file under the path returned by the last command.
  4. Edit the line that says:
    if bt.delimiter_count_with_priority(max_priority) > 1:

to say > 0 instead. Test with as much code as you can lay your hands on. You'll see some regressions in behavior.

One more thing to note here that you won't be aware of unless you actually contributed to Black. Black formats everything (with one exception that is irrelevant here) in incoming token order, which usually is "left to right". This produces regular and predictable formatting because it doesn't "look forward" to try to reformat things and then decide whether they are nice or not.

In turn, it won't know the difference between, say:

a = (
    bbbbbb(123456790)
    + cccccc(
        1234567890
        + 1234567890
   )
)

instead of the much more reasonable

a = bbbbbb(123456790) + cccccc(
    1234567890 + 1234567890
)

@cdce8p
Copy link
Contributor Author

cdce8p commented Feb 1, 2022

It's difficult since it would effect almost every project that uses black today. However, just from seeing how many related issues were created, it looks like many users would like an improvement (me included).

Just from the initial testing I've done with #2237, it might be worth limiting any changes to LOGIC_PRIORITY or higher for now. There are still some corner cases with that, but much fewer than when including all operators.

@rmorshea
Copy link

rmorshea commented Feb 1, 2022

While many people do report this, it's hard to know where the majority opinion lies just by looking at the number of issues. The people who are satisfied with the current behavior aren't posting issues, and thus, are not participating in this conversation. I'd personally appreciate some change here, but it's important to remember that there are a lot of black users whose viewpoints are not being represented.

@LLyaudet
Copy link
Contributor

LLyaudet commented Feb 2, 2022

The "hierarchical first" line-split seems always better to me. If not chosen as the sensible default, it would be nice to have it as a configuration option.

@felix-hilden
Copy link
Collaborator

I think whatever is decided here, we won't have a configuration option. We don't want to give any unnecessary control over formatting.

@MarkusPiotrowski
Copy link

@ambv I'm sorry, but for me the 'counter-examples' that you show look better/are easier to read than what black is suggesting.
Maybe there are different ways how different people read code. But especially in your two examples:

a = bbbbbb(123456790) + cccccc(
    123456790
)

a = bbbbbb(123456790) + cccccc(
    1234567890 + 1234567890
)

the closing parentheses are visually completely dis-aligned to the code which they enclose, because they are vertically and horizontally separated from their opening parenthesis.

It is not only that, if you need line breaks, one would prefer to have both objects of an operation horizontally aligned and having complete pieces of logic on one line, but also about the quick identification of matching parentheses and their enclosed content.

@lig
Copy link

lig commented Feb 4, 2022

@felix-hilden here is a possible take on the example from tests

if prevp.parent and prevp.parent.type in {
    syms.typedargslist,
    syms.varargslist,
    syms.parameters,
    syms.arglist,
    syms.argument,
}:

# start from scratch

if prevp.parent and prevp.parent.type in {syms.typedargslist, syms.varargslist, syms.parameters, syms.arglist, syms.argument,}:

# add parentheses

if (
    prevp.parent
    and prevp.parent.type in {syms.typedargslist, syms.varargslist, syms.parameters, syms.arglist, syms.argument,}
):

# still long, add more

if (
    prevp.parent
    and (
        prevp.parent.type
        in {syms.typedargslist, syms.varargslist, syms.parameters, syms.arglist, syms.argument,}
    )
):

# still long

if (
    prevp.parent
    and (
        prevp.parent.type
        in {
            syms.typedargslist,
            syms.varargslist,
            syms.parameters,
            syms.arglist,
            syms.argument,
        }
    )
):

# try to inline in reverse

if (
    prevp.parent
    and (
        prevp.parent.type in {
            syms.typedargslist,
            syms.varargslist,
            syms.parameters,
            syms.arglist,
            syms.argument,
        }
    )
):

@felix-hilden this is kinda similar to your last option. Frankly, it looks like it could be inlined once more and it will be exactly it. Anyway, putting and and in on the same level here isn't right IMO.

@ktbarrett
Copy link
Contributor

ktbarrett commented Feb 22, 2022

@lig I think you could continue to inline twice more back to the original desired style from the test. I do not think it's unreadable.

A recursive algo which inlines leaves as it returns from the call stack as long as the max line length is not violated is a good way to go about it. The point is to break the outer-most element if the line is too long. It's what my intuition is, and what I would write naturally if I cared. In reality I just let everything go to the right and let black clean it up 😁. I have even written a pretty printer that used that exact algo because pprint looks awful.

Examples of different max-line-lengths of @ambv's example with the stated algo.
# max-line-length=1
aaaaaaaaaa = (
  bbbbbb(
    123456790
  )
  + cccccc(
    1234567890
    + 1234567890
  )
)

# max-line-length=20
aaaaaaaaaa = (
  bbbbbb(123456790)
  + cccccc(
    1234567890
    + 1234567890
  )
)

# max-line-length=40
aaaaaaaaaa = (
  bbbbbb(123456790)
  + cccccc(1234567890 + 1234567890)
)

# max-line-length=60
aaaaaaaaaa = (
  bbbbbb(123456790) + cccccc(1234567890 + 1234567890)
)

# max-line-length=80
aaaaaaaaaa = bbbbbb(123456790) + cccccc(1234567890 + 1234567890)

@gar1t
Copy link

gar1t commented Apr 12, 2023

I would suggest (re)considering the use of a configuration option to balance the interests here. The magic trailing comma is a similar case - it's a syntactically benign way to signal formatting intent.

Some of the formatting examples above are so pathological (as are the cases in our code base - I omit them as the topic is well covered above) that I think it should be a priority to address them in one way or another.

In the case of boolean expressions, adding a trailing and True is similar to the trailing comma - and cleans up the formatting but at the obvious expense.

I don't follow the argument that fewer lines of code trump readability.

Disruption to existing code is obviously a big deal. But if the implementation is to use grouping parenthesis as a cue for the new formatting rules, these would have to be explicitly added and existing code should not be impacted. Again, similar to the trailing comma. And if there's a config option - also similar to trailing comma - there's no impact at all or users could opt out (depending on the default behavior).

@ambv
Copy link
Collaborator

ambv commented Apr 12, 2023

I would suggest (re)considering the use of a configuration option to balance the interests here.

We avoid that here.

I don't follow the argument that fewer lines of code trump readability.

They usually do because they allow more content to fit one screen or page. Black additionally prefers to format things in a way where lines read from left to right instead of pre-emptively exploding into one-element-per-line.

The unclear examples in this open issue are obviously controversial, but a ton of others aren't, and nobody lists them here because they just look good. The current rule requiring "more than one delimiter" to use optional parentheses was added because without it, the output was clearly overusing parentheses leading to explosion of newlines in unexpected places.

The problem with a tool that applies the same rules every time is that in some cases a given rule leads to output that is less obvious. It seems here most pushback stems from using square brackets or parentheses from a call to split a line.

In the case of organizing optional parentheses, there is no easy-to-express rule to make it work for everybody. Most of all, it's to a large extent subjective. What to some is "so pathological that it should be a priority", is not a big deal (or is indeed preferred) by others.

Using existing parentheses as a marker that they should be kept makes the tool less automatic. It will then happily add parentheses in some cases but when you modify the code, it won't take them out. That's already a source of confusion for users when it comes to trailing commas, and adding parentheses to the mix would make this problem worse.

Originally Black didn't automatically put or remove parentheses, it would just respect what it found already in the file. This was widely criticized, I received many requests to change it. This issue argues, in essence, to roll this change back, which would be an unpopular decision.

gar1t added a commit to guildai/guildai that referenced this issue Apr 16, 2023
Black is restrictive as a matter of design/feature and the [driving
philosophy](psf/black#2156 (comment))
seems to favor very limited changes to Black over addressing
extraordinarily bad formatting.

yapf is not perfect as it requires a commenting pattern to readable
formats for long logical and mathematical expressions. It's far more
flexible and succeeds at generating maintainable code in cases that
Black cannot.
@lig
Copy link

lig commented Apr 21, 2023

I would suggest (re)considering the use of a configuration option to balance the interests here.

We avoid that here.

Maybe these couple of cases are those good candidates for having options for them? AFAIR, skip-string-normalization was kinda similarly controversial and it took Guido's opinion to make this into an option. These two are obviously controversial, have different logic already, and confuse both sides.

@Wyko
Copy link

Wyko commented May 16, 2023

Just an extra little bump here. I stumbled on this issue because the logic trees in my application are getting worse and worse due to this formatting choice. If I had the option to cause black to format these more nicely, I'd do it in a heartbeat.

if self._rule_has_same_enviroment_tag(
    rule=rule, live_rule=live_rule
) and self._rule_has_matching_source(rule=rule, live_rule=live_rule):
    return live_rule

# vs

if (
    self._rule_has_same_enviroment_tag(rule=rule, live_rule=live_rule)
    and self._rule_has_matching_source(rule=rule, live_rule=live_rule)
):
    return live_rule

The first one is almost unreadable to me.

@rijenkii
Copy link

rijenkii commented May 26, 2023

Very much needed.
Current behavior is sometimes so unreadable that I have to wrap code in #fmt: on/off.

@itcarroll
Copy link

@JelleZijlstra invited a PR to address #3800: comments that get egregiously moved. The fix could potentially allow commenting to preserve optional parentheses. Not saying that's good or bad, just relevant to this discussion.

Putative "magic comment" in action:

a = (
    bbbbbb(123456790)  #
    + cccccc(123456790)
)

@LucidOne
Copy link

LucidOne commented Aug 5, 2023

I'd like a real solution but at minimum I think this is a problem with the docs.
Similar to #2156 (comment) adding an extra or None at least makes it readable.

_cfg_dir = _os.environ.get("RYO_USER_CONFIG_DIR") or _appdirs.user_config_dir(
    "ryo-iso", "HxR"
)
_cfg_dir = (
    _os.environ.get("RYO_USER_CONFIG_DIR")
    or _appdirs.user_config_dir("ryo-iso", "HxR")
    or None
)

Aside from --line-length 99, I'm still not sure how to improve this sort of thing.

(self.base_image["distro"], self.base_image["version"]) = self.iso_cfg[
    "image"
].split("/")

@0dminnimda
Copy link

That's interesting that adding the third operand actually make thye formatting nice, but with two it's unreadable

return (
    self.context.control_flow.is_conditional_jump(temp)
    or self.context.is_used_temp(temp)
    or None
)
return self.context.control_flow.is_conditional_jump(
    temp
) or self.context.is_used_temp(temp)

Can the behavior be changed to respect two operands as well? It doesn't seem like it will actually negatively affect anything.

@AdamWill
Copy link
Contributor

AdamWill commented Feb 28, 2024

edit: eh, I guess it is fairly similar to some of the other examples in the end. i'll just use fmt off. :/

@MichaReiser
Copy link

This is related to #4123. However, I'm not sure if such a fundamental change is an option at this point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
F: parentheses Too many parentheses, not enough parentheses, and so on. F: symmetry Fixing this would require Black to understand symmetry. T: style What do we want Blackened code to look like?
Projects
None yet