Switch to code formatter Black #267

peternowee · 2021-06-12T21:58:39Z

Proposal for the introduction of the Black code formatter, which promises to save the developers of pydot time and mental energy by taking control over code formatting.

See the earlier discussion in #265 and the individual commit messages in this PR pydot/pydot#267.

@ankostis @prmtl @hrishikeshrt Ok, I gave in on the double quotes... but not on the line lengths yet. 😄 Hope you can have a look and let me know what you think.

- Add/remove trailing commas in some collections to ensure/prevent Black explodes the contents to a single item per line. - Wrap the `*_ATTRIBUTES` constants at the top of `pydot.py` in `# fmt: off/on` pragmas to prevent they get exploded. - Merge some string literals that Black will put on a single line anyway. - Two unrelated spelling and comment formatting corrections.

[Black, The Uncompromising Code Formatter][1] promises to save the developers of `pydot` time and mental energy by taking control over code formatting. With this commit, we will start to find out. With `black` installed, you can now run `black .` in the top `pydot` directory and any formatting issues in pydot source files should get corrected automatically: ~/Development/pydot $ black . reformatted dot_parser.py reformatted setup.py reformatted test/pydot_unittest.py reformatted pydot.py All done! 4 files reformatted. Check out the documentation for more on the [Black code style][2] and options for [integrating it with your editor or git pre-commit][3]. This commit contains: - New `pyproject.toml`: Configuration of `black`. - Changed `.py` files: Results of the first run of `black` 21.5b2. Line length is kept at 79 characters for now. This is my personal preference because it allows for reasonable font size when using a phone, side-by-side diffs on a small laptop screen or 3-way diffs on larger screens. For easier review of the formatting changes made by `black`, this commit uses the `--skip-string-normalization` flag to prevent that single quotes are converted to double quotes already. The next commit will drop that flag, so the results of that conversion are shown there. A later commit will take care of the CI integration. Thanks to Kostis Anagnostopoulos for first suggesting `black` and to Sebastian Kalinowski and Hrishikesh Terdalkar for giving their reviews as well. Alternatives that were rejected: - `yapf`: Not clearly better formatting than `black`. More configuration options, but the cost of reaching agreement on all those options might offset the benefits of using a code formatter. - `autopep8`: Focusses mainly on PEP 8 compliance, meaning that a lot of other formatting issues would still need to be decided on manually, reducing the benefits of using a code formatter. As discussed in [pydot#265][265] and [pydot#267][267]. [1]: https://pypi.org/project/black/ [2]: https://black.readthedocs.io/en/latest/the_black_code_style/current_style.html [3]: https://black.readthedocs.io/en/latest/integrations/index.html [265]: pydot#265 [267]: pydot#267

Let `black` take care of the quotes as well, meaning a switch from single quotes to double quotes (except for where it would result in more backslash escapes). Currently, `black` can only be configured to either enforce double quotes or skip this altogether, not to enforce single quotes. I had some doubts about this switch at first, but noticed that several other projects and their documentation examples have switched to using double quotes now. Holding on to a consistent use of single quotes could become more of a burden, while offering only little benefit. Changes from single quotes to double quotes that were made manually: - Examples in `README.md`. - `pydot.py`, the `*_ATTRIBUTES` constants at the top, because they are surrounded by `# fmt: off` and `# fmt: on` pragmas. Note this is all about quotes around Python strings in the pydot source code and not at all related to quotes that are part of a DOT string.

Usage of `extras_require` suggested by Kostis Anagnostopoulos: - pydot#265 (comment)

ankostis · 2021-06-13T07:43:50Z

(haven't look at the PR code yet)

As i've written before, double-quotes is like "sealing" the code, guaranteed that no human has touch it since black last run.

I agree the line-length is a choice the developers should be allowed to make, because certain programms may need too frequently longer lines.
I personally like the idea of e -ish flexibilty, though i have not delved into the specifics, when this flexibility kicks-in - not that sometimes when a long line has been split, it will not return back to its short form.

For instance, assuming the next line is too long:

foo(a, b, c, d)

... then it becomes:

foo(
    a,
    b,
    c,
    d,
)

... and the last comma(,) prevents it from folding back, even if b & c are removed:

foo(
    a,
    d,
)

I guess the same behavior may undo some of the -sh flexibility (if it ever existed in black).

ankostis · 2021-06-13T07:58:01Z

65988af.... what an excellent commit, its message and content! Particularly useful the rejection reasoning for the other tools
Nice decision to split string-quoting in a separate commit.
I would run linting early on: it's the most often to fail, and you shouldn't wait for all tests to finish to discover that you forgot it. But don't know if it should be in its own stage - a failed black should not block further tests, BUT should mark the PR as failing. But that's my own experience.
Maybe black --diff spits out too much output to help the developer locate what has been forgotten?

peternowee · 2021-06-14T08:41:52Z

@ankostis

... and the last comma(,) prevents it from folding back, even if b & c are removed:
foo(
    a,
    d,
)

Yeah, this bothered me as well. There is an option -C/--skip-magic-trailing-comma now (psf/black#1824) that will make it fold back and remove the comma if it all fits on one line again. But then the surprise is that that also affects manually added magic trailing commas, so there will be no way to manually force explosion to one item per line anymore either. Obviously, black cannot tell from a comma if it was placed manually or by black itself, so I guess it is one way or the other. Using -C actually seems more fitting for Black because it removes the room for discussion about manually adding a magic trailing comma or not. Who knows they might make -C the default one day, because things have not been exactly standing still in that area during the last two years already anyway.

I personally like the idea of e -ish flexibilty, though i have not delved into the specifics, when this flexibility kicks-in - not that sometimes when a long line has been split, it will not return back to its short form.
[..]
I guess the same behavior may undo some of the -sh flexibility (if it ever existed in black).

If we are talking about the same thing (-ish flexibility == different behavior in the 10% tolerance zone of the maximum line length as discussed in #265 (comment)) then as far as I can tell, that does not exist at all in black at the moment. It always uses the configured line length to the max, it is never inclined to break earlier when getting near the maximum and it also never goes over it.

But I think I see what you mean: The way most humans would understand "-ish flexibilty" is that there would be room for the human to decide whether to enter that tolerance zone or not, and then not be corrected by black. On the other hand, for black "-ish flexibility" would mean a sophisticated ruleset to decide between entering the tolerance zone or breaking the line earlier, but in the end it would still need to be deterministic and disregard what the human decided. Unless there would be a pre-defined way for the human to influence the desired outcome, like these magic trailing commas, but then that would be detrimental to black's goal of not leaving room for discussion again. So maybe indeed the way most humans would interpret "-ish flexibility" is fundamentally incompatible with the goals of black.

65988af.... what an excellent commit, its message and content! Particularly useful the rejection reasoning for the other tools

Thanks 😄 And thank you for reviewing, btw, I really appreciate it!

Nice decision to split string-quoting in a separate commit.

A pleasant side effect of changing my mind on double quotes so late. 😅

I would run linting early on: it's the most often to fail, and you shouldn't wait for all tests to finish to discover that you forgot it. But don't know if it should be in its own stage - a failed black should not block further tests, BUT should mark the PR as failing. But that's my own experience.

Ok, I tried putting the lint (black) stage first and listing it as allowed_failure to prevent it from blocking the test stage, but that also meant the entire build would be marked as "Passed" even if the lint stage had failed.

So, as you suggest, I moved the job for black to the same stage the other jobs use. Now all jobs run almost simultaneously and black finishes minutes before the test suite jobs, without blocking them.

Here are two examples using a configuration like that:

black fail + test suite pass
https://travis-ci.com/github/peternowee/pydot/builds/228820602
black pass + test suite fail
https://travis-ci.com/github/peternowee/pydot/builds/228820429

Both builds are marked as "Failed" by Travis CI.

The change is in the new commit 9b3d2b7b and I will squash it with the old one de074105 at merge time using the commit message of the new one.

Maybe black --diff spits out too much output to help the developer locate what has been forgotten?

Yeah, I don't know what I can do about that. Dropping the --diff would leave them in the dark completely. Where necessary, I suppose we can help them during the discussion of their PR, or if that does not work just make the formatting changes ourselves in a later commit. Maybe just get some experience with this first and then see what we could change or document better.

Also, code changes filed by beginners are usually quite small, right, so then the diff from black cannot be very large either. It won't be as bad as the huge diffs we are seeing now for Black's initial runs.

ankostis · 2021-06-14T10:57:47Z

Maybe black --diff spits out too much output to help the developer locate what has been forgotten?

Also, code changes filed by beginners are usually quite small, right, so then the diff from black cannot be very large either.

Totally agree.

hrishikeshrt

Overall, I think this is a very well prepared commit, and will (hopefully) put a rest to further formatting concerns/discussions.

hrishikeshrt · 2021-06-15T08:58:30Z

dot_parser.py

@@ -12,14 +12,25 @@
 import sys

 from pyparsing import (
-    nestedExpr, Literal, CaselessLiteral,
-    Word, OneOrMore,
+    nestedExpr,


One change that according to me is not optimal, if we think about "better readability".

Agreed. Black does this to minimize diffs, but when it results in a lot of lines it becomes disturbing. The alternative would be to wrap it in # fmt: off and # fmt: on pragmas, just like I did for the *_ATTRIBUTES constants at the top of pydot.py. However, I also wanted to keep the number of such exceptions low and thought the problem here (from 8 lines to 19 lines) was not big enough to warrant another exception. Just one of the things we have to accept as the price for switching to black, was my conclusion.

hrishikeshrt · 2021-06-15T08:59:37Z

dot_parser.py


        assignment = (ID + equals + righthand_id).setName("assignment")
-        stmt = (assignment | edge_stmt | attr_stmt |
-                subgraph | graph_stmt | node_stmt).setName("stmt")
+        stmt = (


Similar concern, but these cases might be more limited than imports.

Yes. Also not big enough to warrant an exception, though.

hrishikeshrt · 2021-06-15T09:02:38Z

README.md

@@ -50,7 +50,7 @@ start with. Here are 3 common options:
    ```python
    import pydot

-    graphs = pydot.graph_from_dot_file('example.dot')
+    graphs = pydot.graph_from_dot_file("example.dot")


Double quote changes are okay, I don't have a strong opinion about quotes either way and if anything, uniformity in this regard would be better enforced by Black

I remind that uniformity in the quotes reduces commit diffs - nowdays there is no code outside of git.

uniformity in the quotes reduces commit diffs

As long as people do not touch existing quotes when they make code changes, it should not matter, right? Or you mean when someone makes a code change and at the same time decides to "clean up" the quoting style around that area (as a form of opportunistic refactoring) and then combines it all together in just a single commit? In that case, yes, it would lead to larger diffs and black prevents that.

This adds a new job for running `black` to our Travis CI continuous integration configuration. See the previous commits for more details on Black. If `black` finds a problem with the formatting, its job will be marked as "Failed". A diff of the required changes can be found on Travis CI by clicking on job `black` and scrolling down the Job log. You may also run `black` on your local machine to let it make the corrections for you. The new job is added to the default stage (`test`), meaning it will run alongside the regular test suite jobs. `black` is kept separate from the test suite, because it only needs to run once, not on multiple Python versions and architectures. A failure reported by `black` will not stop the test suite jobs from running, but will result in the build as a whole to be marked "Failed" in the end, even if the other jobs all passed. Using a separate Travis CI "stage" (named `lint`) was attempted, but considered inadequate: - Running stage `lint` *after* stage `test` meant long waiting for what was actually the fastest job. - Running stage `lint` *before* stage `test` meant a minor formatting issue could prevent the test suite from running at all. - Defining stage `lint` as an `allowed_failure` meant its outcome would become irrelevant for the outcome of the build as a whole, i.e. the build would pass even if `lint` had failed.

peternowee · 2021-06-21T17:21:15Z

Ok, merged, thanks guys!

peternowee added 4 commits June 9, 2021 22:13

Switch to code formatter Black: Dev dependencies

9377e3b

Usage of `extras_require` suggested by Kostis Anagnostopoulos: - pydot#265 (comment)

peternowee mentioned this pull request Jun 15, 2021

PEP8 compliance in pydot.py #265

Merged

hrishikeshrt reviewed Jun 15, 2021

View reviewed changes

peternowee added enhancement internals labels Jun 15, 2021

peternowee added this to the 2.0.0 (or later) milestone Jun 15, 2021

peternowee force-pushed the feature/black branch from 9b3d2b7 to 479bdca Compare June 21, 2021 15:46

peternowee merged commit 479bdca into pydot:master Jun 21, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Switch to code formatter Black #267

Switch to code formatter Black #267

peternowee commented Jun 12, 2021

ankostis commented Jun 13, 2021

ankostis commented Jun 13, 2021 •

edited

peternowee commented Jun 14, 2021

ankostis commented Jun 14, 2021

hrishikeshrt left a comment

hrishikeshrt Jun 15, 2021

peternowee Jun 15, 2021

hrishikeshrt Jun 15, 2021

peternowee Jun 15, 2021

hrishikeshrt Jun 15, 2021

ankostis Jun 15, 2021

peternowee Jun 15, 2021

peternowee commented Jun 21, 2021

Switch to code formatter Black #267

Switch to code formatter Black #267

Conversation

peternowee commented Jun 12, 2021

ankostis commented Jun 13, 2021

ankostis commented Jun 13, 2021 • edited

peternowee commented Jun 14, 2021

ankostis commented Jun 14, 2021

hrishikeshrt left a comment

Choose a reason for hiding this comment

hrishikeshrt Jun 15, 2021

Choose a reason for hiding this comment

peternowee Jun 15, 2021

Choose a reason for hiding this comment

hrishikeshrt Jun 15, 2021

Choose a reason for hiding this comment

peternowee Jun 15, 2021

Choose a reason for hiding this comment

hrishikeshrt Jun 15, 2021

Choose a reason for hiding this comment

ankostis Jun 15, 2021

Choose a reason for hiding this comment

peternowee Jun 15, 2021

Choose a reason for hiding this comment

peternowee commented Jun 21, 2021

ankostis commented Jun 13, 2021 •

edited