Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to code formatter Black #267

Merged
merged 5 commits into from Jun 21, 2021
Merged

Conversation

peternowee
Copy link
Member

Proposal for the introduction of the Black code formatter, which promises to save the developers of pydot time and mental energy by taking control over code formatting.

See the earlier discussion in #265 and the individual commit messages in this PR pydot/pydot#267.

@ankostis @prmtl @hrishikeshrt Ok, I gave in on the double quotes... but not on the line lengths yet. 馃槃 Hope you can have a look and let me know what you think.

- Add/remove trailing commas in some collections to ensure/prevent
  Black explodes the contents to a single item per line.
- Wrap the `*_ATTRIBUTES` constants at the top of `pydot.py` in
  `# fmt: off/on` pragmas to prevent they get exploded.
- Merge some string literals that Black will put on a single line
  anyway.
- Two unrelated spelling and comment formatting corrections.
[Black, The Uncompromising Code Formatter][1] promises to save the
developers of `pydot` time and mental energy by taking control over
code formatting. With this commit, we will start to find out.

With `black` installed, you can now run `black .` in the top `pydot`
directory and any formatting issues in pydot source files should get
corrected automatically:

    ~/Development/pydot $ black .
    reformatted dot_parser.py
    reformatted setup.py
    reformatted test/pydot_unittest.py
    reformatted pydot.py
    All done!
    4 files reformatted.

Check out the documentation for more on the [Black code style][2] and
options for [integrating it with your editor or git pre-commit][3].

This commit contains:
- New `pyproject.toml`: Configuration of `black`.
- Changed `.py` files: Results of the first run of `black` 21.5b2.

Line length is kept at 79 characters for now. This is my personal
preference because it allows for reasonable font size when using a
phone, side-by-side diffs on a small laptop screen or 3-way diffs on
larger screens.

For easier review of the formatting changes made by `black`, this
commit uses the `--skip-string-normalization` flag to prevent that
single quotes are converted to double quotes already. The next commit
will drop that flag, so the results of that conversion are shown there.

A later commit will take care of the CI integration.

Thanks to Kostis Anagnostopoulos for first suggesting `black` and to
Sebastian Kalinowski and Hrishikesh Terdalkar for giving their reviews
as well.

Alternatives that were rejected:
- `yapf`: Not clearly better formatting than `black`. More
  configuration options, but the cost of reaching agreement on all
  those options might offset the benefits of using a code formatter.
- `autopep8`: Focusses mainly on PEP 8 compliance, meaning that a lot
  of other formatting issues would still need to be decided on
  manually, reducing the benefits of using a code formatter.

As discussed in [pydot#265][265] and [pydot#267][267].

[1]: https://pypi.org/project/black/
[2]: https://black.readthedocs.io/en/latest/the_black_code_style/current_style.html
[3]: https://black.readthedocs.io/en/latest/integrations/index.html
[265]: pydot#265
[267]: pydot#267
Let `black` take care of the quotes as well, meaning a switch from
single quotes to double quotes (except for where it would result in
more backslash escapes). Currently, `black` can only be configured to
either enforce double quotes or skip this altogether, not to enforce
single quotes.

I had some doubts about this switch at first, but noticed that several
other projects and their documentation examples have switched to using
double quotes now. Holding on to a consistent use of single quotes
could become more of a burden, while offering only little benefit.

Changes from single quotes to double quotes that were made manually:
- Examples in `README.md`.
- `pydot.py`, the `*_ATTRIBUTES` constants at the top, because they are
  surrounded by `# fmt: off` and `# fmt: on` pragmas.

Note this is all about quotes around Python strings in the pydot source
code and not at all related to quotes that are part of a DOT string.
Usage of `extras_require` suggested by Kostis Anagnostopoulos:
- pydot#265 (comment)
@ankostis
Copy link

(haven't look at the PR code yet)

As i've written before, double-quotes is like "sealing" the code, guaranteed that no human has touch it since black last run.

I agree the line-length is a choice the developers should be allowed to make, because certain programms may need too frequently longer lines.
I personally like the idea of e -ish flexibilty, though i have not delved into the specifics, when this flexibility kicks-in - not that sometimes when a long line has been split, it will not return back to its short form.

For instance, assuming the next line is too long:

foo(a, b, c, d)

... then it becomes:

foo(
    a,
    b,
    c,
    d,
)

... and the last comma(,) prevents it from folding back, even if b & c are removed:

foo(
    a,
    d,
)

I guess the same behavior may undo some of the -sh flexibility (if it ever existed in black).

@ankostis
Copy link

ankostis commented Jun 13, 2021

  • 65988af.... what an excellent commit, its message and content! Particularly useful the rejection reasoning for the other tools
  • Nice decision to split string-quoting in a separate commit.
  • I would run linting early on: it's the most often to fail, and you shouldn't wait for all tests to finish to discover that you forgot it. But don't know if it should be in its own stage - a failed black should not block further tests, BUT should mark the PR as failing. But that's my own experience.
  • Maybe black --diff spits out too much output to help the developer locate what has been forgotten?

@peternowee
Copy link
Member Author

@ankostis

... and the last comma(,) prevents it from folding back, even if b & c are removed:

foo(
    a,
    d,
)

Yeah, this bothered me as well. There is an option -C/--skip-magic-trailing-comma now (psf/black#1824) that will make it fold back and remove the comma if it all fits on one line again. But then the surprise is that that also affects manually added magic trailing commas, so there will be no way to manually force explosion to one item per line anymore either. Obviously, black cannot tell from a comma if it was placed manually or by black itself, so I guess it is one way or the other. Using -C actually seems more fitting for Black because it removes the room for discussion about manually adding a magic trailing comma or not. Who knows they might make -C the default one day, because things have not been exactly standing still in that area during the last two years already anyway.

I personally like the idea of e -ish flexibilty, though i have not delved into the specifics, when this flexibility kicks-in - not that sometimes when a long line has been split, it will not return back to its short form.
[..]
I guess the same behavior may undo some of the -sh flexibility (if it ever existed in black).

If we are talking about the same thing (-ish flexibility == different behavior in the 10% tolerance zone of the maximum line length as discussed in #265 (comment)) then as far as I can tell, that does not exist at all in black at the moment. It always uses the configured line length to the max, it is never inclined to break earlier when getting near the maximum and it also never goes over it.

But I think I see what you mean: The way most humans would understand "-ish flexibilty" is that there would be room for the human to decide whether to enter that tolerance zone or not, and then not be corrected by black. On the other hand, for black "-ish flexibility" would mean a sophisticated ruleset to decide between entering the tolerance zone or breaking the line earlier, but in the end it would still need to be deterministic and disregard what the human decided. Unless there would be a pre-defined way for the human to influence the desired outcome, like these magic trailing commas, but then that would be detrimental to black's goal of not leaving room for discussion again. So maybe indeed the way most humans would interpret "-ish flexibility" is fundamentally incompatible with the goals of black.

  • 65988af.... what an excellent commit, its message and content! Particularly useful the rejection reasoning for the other tools

Thanks 馃槃 And thank you for reviewing, btw, I really appreciate it!

  • Nice decision to split string-quoting in a separate commit.

A pleasant side effect of changing my mind on double quotes so late. 馃槄

  • I would run linting early on: it's the most often to fail, and you shouldn't wait for all tests to finish to discover that you forgot it. But don't know if it should be in its own stage - a failed black should not block further tests, BUT should mark the PR as failing. But that's my own experience.

Ok, I tried putting the lint (black) stage first and listing it as allowed_failure to prevent it from blocking the test stage, but that also meant the entire build would be marked as "Passed" even if the lint stage had failed.

So, as you suggest, I moved the job for black to the same stage the other jobs use. Now all jobs run almost simultaneously and black finishes minutes before the test suite jobs, without blocking them.

Here are two examples using a configuration like that:

Both builds are marked as "Failed" by Travis CI.

The change is in the new commit 9b3d2b7b and I will squash it with the old one de074105 at merge time using the commit message of the new one.

  • Maybe black --diff spits out too much output to help the developer locate what has been forgotten?

Yeah, I don't know what I can do about that. Dropping the --diff would leave them in the dark completely. Where necessary, I suppose we can help them during the discussion of their PR, or if that does not work just make the formatting changes ourselves in a later commit. Maybe just get some experience with this first and then see what we could change or document better.

Also, code changes filed by beginners are usually quite small, right, so then the diff from black cannot be very large either. It won't be as bad as the huge diffs we are seeing now for Black's initial runs.

@ankostis
Copy link

Maybe black --diff spits out too much output to help the developer locate what has been forgotten?

Also, code changes filed by beginners are usually quite small, right, so then the diff from black cannot be very large either.

Totally agree.

Copy link
Contributor

@hrishikeshrt hrishikeshrt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I think this is a very well prepared commit, and will (hopefully) put a rest to further formatting concerns/discussions.

@@ -12,14 +12,25 @@
import sys

from pyparsing import (
nestedExpr, Literal, CaselessLiteral,
Word, OneOrMore,
nestedExpr,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One change that according to me is not optimal, if we think about "better readability".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Black does this to minimize diffs, but when it results in a lot of lines it becomes disturbing. The alternative would be to wrap it in # fmt: off and # fmt: on pragmas, just like I did for the *_ATTRIBUTES constants at the top of pydot.py. However, I also wanted to keep the number of such exceptions low and thought the problem here (from 8 lines to 19 lines) was not big enough to warrant another exception. Just one of the things we have to accept as the price for switching to black, was my conclusion.


assignment = (ID + equals + righthand_id).setName("assignment")
stmt = (assignment | edge_stmt | attr_stmt |
subgraph | graph_stmt | node_stmt).setName("stmt")
stmt = (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar concern, but these cases might be more limited than imports.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Also not big enough to warrant an exception, though.

@@ -50,7 +50,7 @@ start with. Here are 3 common options:
```python
import pydot

graphs = pydot.graph_from_dot_file('example.dot')
graphs = pydot.graph_from_dot_file("example.dot")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double quote changes are okay, I don't have a strong opinion about quotes either way and if anything, uniformity in this regard would be better enforced by Black

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remind that uniformity in the quotes reduces commit diffs - nowdays there is no code outside of git.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uniformity in the quotes reduces commit diffs

As long as people do not touch existing quotes when they make code changes, it should not matter, right? Or you mean when someone makes a code change and at the same time decides to "clean up" the quoting style around that area (as a form of opportunistic refactoring) and then combines it all together in just a single commit? In that case, yes, it would lead to larger diffs and black prevents that.

This adds a new job for running `black` to our Travis CI continuous
integration configuration. See the previous commits for more details on
Black.

If `black` finds a problem with the formatting, its job will be marked
as "Failed". A diff of the required changes can be found on Travis CI
by clicking on job `black` and scrolling down the Job log. You may also
run `black` on your local machine to let it make the corrections for
you.

The new job is added to the default stage (`test`), meaning it will run
alongside the regular test suite jobs. `black` is kept separate from
the test suite, because it only needs to run once, not on multiple
Python versions and architectures.

A failure reported by `black` will not stop the test suite jobs from
running, but will result in the build as a whole to be marked "Failed"
in the end, even if the other jobs all passed.

Using a separate Travis CI "stage" (named `lint`) was attempted, but
considered inadequate:
- Running stage `lint` *after* stage `test` meant long waiting for
  what was actually the fastest job.
- Running stage `lint` *before* stage `test` meant a minor formatting
  issue could prevent the test suite from running at all.
  - Defining stage `lint` as an `allowed_failure` meant its outcome
    would become irrelevant for the outcome of the build as a whole,
    i.e. the build would pass even if `lint` had failed.
@peternowee peternowee merged commit 479bdca into pydot:master Jun 21, 2021
@peternowee
Copy link
Member Author

Ok, merged, thanks guys!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants