Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Not exploding long arguments list makes diffs unnecessarily long #404

Closed
nickdavies opened this issue Jul 13, 2018 · 9 comments
Closed

Not exploding long arguments list makes diffs unnecessarily long #404

nickdavies opened this issue Jul 13, 2018 · 9 comments
Labels
T: style What do we want Blackened code to look like?

Comments

@nickdavies
Copy link

We recently switched to using black and overall it has been great but one style decision is making code reviews much harder for our team and seems to complicate reasoning about the style.

If i have:

my_func(long_kw_argument="some_reasonably_long_string", short_kw="short_kw_1")

I add short_kw2=5 I get:

my_func(
    long_kw_argument="some_reasonably_long_string", short_kw="short_kw_1", short_kw2=5
)

with diff:

-my_func(long_kw_argument="some_reasonably_long_string", short_kw="short_kw_1")
+my_func(
+    long_kw_argument="some_reasonably_long_string", short_kw="short_kw_1", short_kw2=5
+)

then I add new_arg=1 and I get:

my_func(
    long_kw_argument="some_reasonably_long_string",
    short_kw="short_kw_1",
    short_kw2=5,
    new_arg=1,
)

and this simple change of adding a single argument makes the diff of:

@@ -3,5 +3,8 @@ def my_func(**kwargs):
 
 
 my_func(
-    long_kw_argument="some_reasonably_long_string", short_kw="short_kw_1", short_kw2=5
+    long_kw_argument="some_reasonably_long_string",
+    short_kw="short_kw_1",
+    short_kw2=5,
+    new_arg=1,
 )

This special case means that there are now two different times that you have to deal with the bigger diff which means it comes up a lot more often than it needs to.

One of the stated goals of Black is:

Black makes code review faster by producing the smallest diffs possible.

but this seems like a case where this is not true and is instead optimizing for a few newlines which I believe is much less important that diff readability.

It also feels like this is an unnecessary special case that only complicates understanding of the style and it would be cleaner to just say "black will put it on one line if it fits, if it doesn't its 1 per line with trailing commas" instead of "black will put it on 1 line if it fits, if the arguments alone fit on 1 line when they are indented then it will do that otherwise it will put 1 per line and will only put trailing commas if it spreads it 1 per line not if it puts it on a newline and indents it (plus your explanation for why here)". This would remove the need for you to explain things like #346 (comment) for example.

@sethmlarson
Copy link
Member

Optimizing for diffs and line-length is impossible to do at the same time. At some point you have to introduce line-breaks if you're writing functions that have a ton of parameters.

@nickdavies
Copy link
Author

@SethMichaelLarson: Obviously you have to split the line at some point but the issue here is that by having a special case middle ground between 1 line and 1 argument per line you are making more complicated diffs twice when really it only needs to be done once and you really gain very little out of doing it in two different ways.

@zsol zsol added the T: style What do we want Blackened code to look like? label Jul 19, 2018
@ambv
Copy link
Collaborator

ambv commented Aug 17, 2018

I agree this is inconsistent with the stated goal of minimizing diffs. While it does potentially shadow git blame twice, I don't think adding new arguments to existing calls in succession happens very often in practice (outside of configuration). Is your experience different, @nickdavies?

The reason Black behaves like that is to decouple formatting of the arguments in a function call from the function name's length (and the preceding expression's length in general). This is why we're doing this split only if all arguments on their own fit in a single line. If that still doesn't help, yeah, then we explode to one argument per line.

I'm not religiously attached to this approach, I mostly did it because I thought it's a pragmatic middle ground between exploding everything too soon and the dreadful formattings I've seen before like:

some_variable = some_library.some_function(lets_put_the_first_arg_here, maybe_the_second_too,
                                           but_later_we_need_to_resort_to_splitting,
                                           i_wonder_which_argument_am_i, dunno_but_lets_put_the,
                                           closing_paren_here_so_its_hard_to_find)

Even if we agree that it would be nice to ditch this form, changing it at this point is going to be painful. Black rose in popularity very fast in the past six months and there is a silent majority which is fine with the current formatting. It does save those "few newlines", e.g. makes more code fit on one screen. Ironically, changing the style now would cause further git blame poisoning for existing adopters.

@eccles
Copy link

eccles commented Aug 24, 2018

HI - just looking at black and like it - however this issue seems to be inconsistent.

My preferred style is that as soon as more than one arg is specified then use the column format with trailing commas (i.e. the line length is no longer a consideration in this decision).

With the above policy readability is optimal and diffs consistently readable at the expense of having lots of lines (which is something I prefer). Any chance of considering this as an option?

@nickdavies
Copy link
Author

I agree this is inconsistent ..... Is your experience different, @nickdavies?

I have found this to actually be reasonably common especially when trying to make small iterative diffs with the expressed purpose of making review easier. Also this problem occurs even if it's been a long time since it was changed. It just adds extra burden to review carefully when a small change occurs.

I feel like any time you change a line that is in the middle format is pretty likely that it will either go back to a single line or fully explode out to many.

The .... then we explode to one argument per line.

This logically makes sense but in practice I don't really feel it's worth the distinction and just seems like a strange exception to the otherwise pretty easy to understand and consistent formatting.

I find this particular strange when calling the same function multiple times. Take this argparse example:

parser.add_argument(
    "--foo",
    "--long-foo",
    "-f",
    action="store_true",
    help="help text for foo which is longer",
)
parser.add_argument(
    "--bar", "--long-bar", "-b", action="store_true", help="help text for bar"
)
parser.add_argument(
    "--baz",
    "--long-baz",
    "-z",
    action="store_true",
    help="help text for baz which is longer",
)

I find add_argument is a great example of a function whos line length frequently hovers around the length that you can fit in the half exploded form but if I write a longer help text it will usually go to fully exploded. This gives me two different formats for the same functions in the same block. This wouldn't normally be a problem because for functions that I tend to pass in enough arguments into to fully explode will rarely fit on 1 line and even if it does then 2 different formats is better than 3 in my opinion.

I'm not religiously attached to ... the dreadful formattings I've seen before like:

I am glad we can agree that that formatting is horrid :P

there is a silent majority which is fine with the current formatting.

We obviously don't have any data on this but I would be interested to know how many people are simply happy with the fact that black means they don't think about formatting and it's consistent rather than being particularly attached to the current format.

Also I feel like black already appeals more to those who like the more expanded form. Eg instead of using a format like:

my_list = [
    1, 2, 3,
    4, 5, 6,
]

We use:

my_list = [ 
    1,  
    2,  
    3,  
    4,  
    5,  
    6,  
]

So I would guess (again no data obviously) that this might not actually be a problem for a significant proportion of users.

Ironically, changing the style now would cause further git blame poisoning for existing adopters.

My intention here is less about making blame nice and more about making future diffs easier to read. For me being able to just run black *.py and make that its own commit is easy to review/accept.

I do agree that if you use blame frequently this would suck to have it happen again but if you care a lot about blame quality then this exception in the long run makes it happen all the time just in small bits spread over many commits as lines change length ever so slightly which makes it harder to jump past than one cutover diff.

until the formatter becomes stable, you should expect some formatting to change in the future.

From the front page you even call out that the format may evolve over time but I do agree that it's not without cost.

@jasonkuhrt
Copy link

jasonkuhrt commented Sep 5, 2018

I've been enjoying black so far, thanks! This issue is my main gripe with it.

I have found the way argument lists are treated to be volatile to write and read. Rather than being able to rely on a consistent style in my codebase I have three (more?) possible styles and their reason for changing is coupled to the happenstance of parameter names/counts of said various functions. As a writer and reader of code I personally do not like this.

I proposed in #499 to allow a user-supplied hint (trailing comma) to adjust the formatting to explosion style explicitly without resorting to config.

FWIW @eccles I completely agree with you.

@ambv
Copy link
Collaborator

ambv commented May 7, 2019

#826 addresses this.

@feluxe
Copy link

feluxe commented Nov 22, 2019

I just realized that #826 does only work for argument lists. It would be awesome if it was working for other "trailing comma situations" as well, IMHO. I think #1054 is a good description of what is still missing regarding diffs, code-reviews, etc.

With yapf it is possible to manually explode comma separated items which are surrounded by any kind of bracket into multiple lines by adding a trailing comma behind the last item. This is a really useful feature, which I miss a lot in black.

@ambv
Copy link
Collaborator

ambv commented Mar 3, 2020

This will be handled as part of #1288.

@ambv ambv closed this as completed Mar 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T: style What do we want Blackened code to look like?
Projects
None yet
Development

No branches or pull requests

7 participants