Unified coding (spacing) style for sympy. #17287

stevenleeS0ht · 2019-07-28T08:56:14Z

It would be awesome if there is a unified coding style defined by this community to make all source code in a same style.

The coding style can also be interpreted to .style.yapf file and parsed by yapf.

it would be wonderful to auto-formatting.

oscargus · 2019-07-28T14:46:39Z

There is an .editor_configfile, https://github.com/sympy/sympy/blob/master/.editorconfig, that shows what is expected. Nothing more, nothing less.

Nothing less may not be entirely true, as the code is expected to be somewhat readable. But the things in the .editorconfig is really what is formally expected (four spaces indent, no trailing spaces, a single empty line at end of file). Maybe some more.

A yapf config may be welcome, but primarily to avoid user running with the default yapf settings...

moorepants · 2019-07-28T17:09:18Z

SymPy devs have a long history of resisting any kind of specific style applied to the code. I'm going to close this. Please reopen if there are specific style changes you'd like to propose.

stevenleeS0ht · 2019-07-29T00:39:56Z

@moorepants , Please reconsider to reopen this issue. Because, a explicit coding convention is essential for readability.

If there is a module which is special in its nature, which need to be coding specially. We can add some exclusion for it. We can simply add a empty .style.yapf in its subdirectory.

We don't need to use single convention for whole project. yapf can use -s option to applied style from certain style file. We can add different style file in to different section (subdirectory) of our project.

I think we need a discussion in this community for a project-specific style. We need to get the opinion from majority.

If there is no coding convention in this project, there is no strict config for auto formation. As a result we need to make our code tidy and readable on our bare hand.

@oscargus , The information in .editconfig is not enough.

The implementation can be flexible enough resolve the concern in the time of not implemented.

moorepants · 2019-07-29T00:56:09Z

I recommend that you search the mailing list and github issues for prior discussions about this to learn the history and state of affairs before making this proposal.

oscargus · 2019-07-29T08:27:15Z

The .editorconfig is what is expected/required. So, well, it is enough from that perspective.

The thing is that sometimes one would like to break the conventions for more readable source. Be aware that the style is checked in the PRs, but primarily manually.

Btw, it looks like yapf doesn't have a config file for its' own project...

MartinThoma · 2020-04-26T10:05:34Z

I find it extremely hard to find the "state of affairs" in the mailing list. I found one discussion which seemed to be mainly / only about LaTeX in docstrings and this from 2015. 2015 is now 5 years ago and a lot might have changed. The discussion was about pure formatting PRs and so it seemed to go in another direction.

I have seen that sympy uses flake8 (travis), but it seems as if no autoformatter is used.

In the last year I got used to the black autoformatter. It can fix many formatting issues completely automatic, never introduced a bug and the style is acceptable. In the beginning it was not what I liked most, but it removed any discussions around coding (formatting) style.

Discussions where I have seen this:

Upgrading to Python3 (2) (pyupgrade to files from benchmarks to crypto) #18877 (comment) - not sure if there was ever a follow-up
Some code clean up #1573 - a pure code formatting PR which was accepted
Improved code style in physics units #12970 - a pur formatting PR which was rejected

moorepants · 2020-04-26T15:56:30Z

@MartinThoma

SymPy generally follows PEP8. There are a mixture of PEP8-ish code styles in the various modules and an author updated a given file should follow the conventions of that file. We do accept PRs that address PEP8 style issues, but they should be done on a per-module or per-package basis to make them easier to review.

I doubt you'll get any traction with black. You can make a proposal on the mailing list or an issue to discuss. But maybe you will at this point in time. For the past 10 years I've been involved, strict style formatting has been blocked, mostly due to authors not likely to agree on a style.

That being said, we did just develop this documentation style guide: https://docs.sympy.org/latest/documentation-style-guide.html. This guide is combination of numpydoc style and sympy specific styles. An effort to make a guide for the code could get traction. But someone has to put in the effort to develop the consensus on what the style rules are that differ from or extend PEP8.

I'll reopen this issue. Feel free to discuss. But until a PR is merged that does autoformatting or introduces a guide, then it will stay the way it is. Make a PR with your proposed change and see if it makes it through review.

moorepants · 2020-04-26T16:00:42Z

Here's another mailing list post: https://groups.google.com/forum/?fromgroups#!searchin/sympy/pep8|sort:date/sympy/Gavo1_dhhT4/mQkGySdVdEkJ

moorepants · 2020-04-26T16:06:36Z

More past discussions:

MartinThoma · 2020-04-26T16:56:06Z

Cool, thank you so much @moorepants ! I haven't been involved in the Sympy development at all so finding those would have been super hard. It might take a bit of time for me to read up on those. Once I did, I'll write a post to the mailing list making my point for a code formatter (and code within documentation formatter; blacken-docs).

moorepants · 2020-04-26T17:04:34Z

I personally am in favor of adopting something like black or having to pass stricter lint tests that the CI runs. One benefit is that we no longer have to discuss formatting in almost every PR. But in a large coebase like SymPy there are going to be lots of exceptions to the rules that may have valid reasons to stay in place. That makes any autoformatting hard to manage and introduce.

oscarbenjamin · 2020-04-26T17:35:55Z

Having cleaned up all flake8 and mypy warnings myself (there were thousands of each at the start) I can tell you that it is a long slog. SymPy has 750000 lines of code and counting. I am also in favour of automated ways to make improvements but applying fixes across the codebase needs to be done gradually and carefully. A lot of the work in fixing flake8/mypy errors was actually because the flake8 warnings were picking up genuine bugs/antipatterns in the codebase and it is not easy to fix that.

Style changes are probably easier but also the gain is less. Improving the style of the documentation rather than the code would be more valuable IMO. The code itself can be cleaned up enormously in many places but the big improvements are not ones that any autoformatter I know of can do.

smichr · 2020-04-26T18:29:16Z

Here's an in-code example of irregular spacing

    if reform or M.is_positive_definite is False:
        H         = M.H
        M         = H.multiply(M)
        rhs       = H.multiply(rhs)
        hermitian = not M.is_symmetric()

I don't write like this but I like how it looks. Often a few people end up editing in a given module more than others. And if they like to edit there and such formatting helps them, I'm happy. I wouldn't want to enforce a rule about alignment of =, however.

It is always annoying to me to have to view long lines so I like the line length limit that is mostly enforced, especially in tests. But since that is infrequently edited I don't loose any sleep about how the lines are broken (e.g. within parentheses or with a trailing backslash).

I would always like to know how to make the user-facing docstrings look better. That's where I agree that we should work toward a high quality form.

MartinThoma · 2020-04-28T19:12:49Z

(I'm currently reading up on the comments and will update this post)

Concerns I've read about introducing new rules (less so about an auto formatter so far):

Some people don't want to have more rules.
There would be a massive merge problem in the transition phase
Raises the entry barrier for new contributors

Regarding those concerns:

(2) Not necessarily. If one has an auto-formatter, one can simply run the autoformatter on the new PR. Yes, this would be annoying for a while to remind people to do it. But it would not be a problem.
(3) Adding rules, yes. Adding an auto-formatter actually lowers the barrier. Instead of having to execute bin/strip_whitespace (see development workflow) one could install a pre-commit hook for the trailing whitespace

Some details:

The test of code quality already performs a check of no trailing-space and just 1 new line at the end of the file in sympy's code.

There were some missed; see #19214

asmeurer · 2020-04-28T19:29:00Z

I pretty strongly feel that we should not use an autoformatter (black completely destroys the formatting of mathematical expressions by the way).

(3) Adding rules, yes. Adding an auto-formatter actually lowers the barrier. Instead of having to execute bin/strip_whitespace (see development workflow) one could install a pre-commit hook for the trailing whitespace

That looks like a higher barrier to me.

Forcing contributors to do more steps just to make a PR is just one of the many reasons why I hate black. Actually it doesn't even have to be that way, you could apply black independently of the user commits, but many projects take the pre-commit route that does work like that.

oscarbenjamin · 2020-04-28T19:29:57Z

For improving the docstrings see #17205 which is about a script used in pandas for enforcing numpydoc style.

redeboer · 2021-10-13T13:43:03Z

Hi everyone, here are some thoughts while reading through the discussions here. I'm relatively new to contributing to SymPy, so apologies in advance if I overlook something or am stating obvious things.

But until a PR is merged that does autoformatting or introduces a guide, then it will stay the way it is. Make a PR with your proposed change and see if it makes it through review.
Unified coding (spacing) style for sympy. #17287 (comment)

How about adding a config file for pre-commit? As mentioned earlier, this may make a script like bin/strip_whitespace redudant; see the trailing-whitespace hook.
Some benefits of pre-commit:
- Pre-commit lowers the barrier: just run pre-commit install after you cloned the repo and all staged files will always be checked/autoformatted upon commiting.
- Pre-commit is fast: only staged/modified files are linted/formatted.
- For the same reason, pre-commit is gradual (see Unified coding (spacing) style for sympy. #17287 (comment)). Note that you can also run (all or specific) hooks over all files.
- Pre-commit is easy to extend: just add additional hooks (overview here). An example would be adding flake8 as a pre-commit hook (see also Use flake8-bugbear on Travis and fix warnings #20068), mypy (Look into using type hints #17945), and pylint (pylint checking #22128). This would also make the the 'flake8 step' in the development workflow redundant.
To lower the barrier even further, you may consider using pre-commit.ci. It's much faster than running pre-commit on GitHub Actions or Travis and can also perform autofixes on PRs. That way, even if someone forgets running pre-commit locally, it's still done automatically through the PR. Even better: existing PRs could be auto-fixed as well, removing any conflict if all files under the master branch were autoformatted through the same pre-commit config.
As for choosing a formatting style: if PEP8 is all that is required Unified coding (spacing) style for sympy. #17287 (comment), add autopep8 as a pre-commit hook. Personally, I would go for black though, as it is opinionated (ends the discussions) and an increasing number of acclaimed projects is using it.
Going beyond formatting and linting, consider defining the development workflow as tox and/or nox jobs. That way, you just have to run tox and all checks under the bin folder are run at once. (Could make a separate issue for this.)

Final remark: I'm strongly in favour of configuring and automating coding conventions as much as possible, instead of describing them in words. Style documentation tends to get outdated, they are inconvenient for the developer (reading, or even memorising?), they leave room for error, they can be ambiguous, and they are hard to discuss and update (compare this with a PR to a simple config file for a linter or formatter).

oscarbenjamin · 2021-10-13T19:15:02Z

Personally, I would go for black though, as it is opinionated (ends the discussions)

It doesn't seem to me that we do have significant problems with ongoing debates about style in actual pull requests, mainly because no hard rules are enforced. At least the things that we tend to go back and forth about are to do with semantics rather than style e.g. "don't use simplify in library code", "don't use overly broad exception handling" or "this should be handled in simplification rather than evaluation".

Where we do get PRs that are badly formatted it is usually because an inexperienced contributor has paid absolutely no attention to style and usually that same contributor has paid little attention to other more important things as well. I don't think that an auto-formatter would solve many of those problems: there's really no substitute for having some appreciation of why style and clarity of code is good and for actually thinking about it when writing code.

Ultimately it's about attention to detail and style is just one relatively unimportant part of that.

asmeurer · 2021-10-14T00:31:23Z

At least the things that we tend to go back and forth about are to do with semantics rather than style e.g. "don't use simplify in library code", "don't use overly broad exception handling" or "this should be handled in simplification rather than evaluation".

It might be good to gather all these together in a document somewhere. Some of these are basic Python things which would apply to any codebase, but some others are SymPy specific things, and in particular, things that are perfectly fine for user code (especially interactive user code), but which are more or less banned in library code, like simplify, var, representing expressions as strings, unnecessary floats instead of rationals, and so on. So contributors might not expected to necessarily know that they should avoid these things.

redeboer · 2021-10-14T08:27:41Z

It sounds like the points raised above are more something for linters (semantics) rather than a formatter*. Formatting can easily be automated; semantics cannot. Linting indeed has to be complemented by style guides, particularly when it comes to sympy conventions.

So my question (#17287 (comment)) comes down to this: how about adopting pre-commit and tox/nox to automate the (local) code quality checks that can be automated?

* Formatting is important for diffs (i.e. code review) and readability (recognisability!) of code patterns, not for design and library performance. In black terms: "Blackened code looks the same regardless of the project you’re reading. Formatting becomes transparent after a while and you can focus on the content instead."

oscarbenjamin · 2021-10-14T09:00:40Z

It sounds like the points raised above are more something for linters (semantics) rather than a formatter*.

Yes, but my point was that to me it doesn't seem that we have significant problems with formatting so I don't see how black would bring an improvement. There are references to the idea that black "ends discussions" about formatting but I don't see lots of discussions about this so I don't see why they need to be ended. It seems to me that the only back and forth discussions we have about formatting are threads like this one where some people propose the idea of bringing in some automated formatting tools and others disagree.

oscarbenjamin · 2021-10-14T09:02:14Z

how about adopting pre-commit and tox/nox to automate the (local) code quality checks that can be automated?

How exactly does nox help here?

There's really just two things to run: bin/test quality and flake8 sympy. We could make a single script for running those two things.

redeboer · 2021-10-14T10:06:55Z

#17287 (comment)
Okay sure, hard for me to judge. As an outsider, I just find it curious that sympy seems to have so much opposition against auto-formatting, given that it's such a large, open source codebase.

How exactly does nox help here?

Nox is useful if you want to test over multiple python versions. So judging from the existing pre-PR workflow guide, tox is probably sufficient.

There's really just two things to run: bin/test quality and flake8 sympy. We could make a single script for running those two things.

Seems the requirements are more extensive than that:
https://github.com/sympy/sympy/wiki/Development-workflow#be-sure-that-all-tests-of-sympy-pass
My point is that those checks can be condensed into a single command (tox) that can be easily extended (through tox.ini). In addition, quality checks can be run through pre-commit as well, so that you (can) have faster feedback already while committing (on staged files only to boot).

smichr · 2021-10-22T19:06:46Z

sympy seems to have so much opposition against auto-formatting

I wonder black or similar recognizes an inline directive that would make it skip formatting a certain region of code. As @asmeurer noted, sometimes code formatted with mathematically sensibilities is destroyed when processed by black...but if a directive like

# black}
...literally formatted
# black{

could be used, maybe that would be something worth looking at.

redeboer · 2021-10-22T19:21:37Z

You can use fmt: off, fmt: on, and fmt: skip.
https://black.readthedocs.io/en/stable/the_black_code_style/current_style.html#code-style
In fact I use that so far for code blocks I am working on in SymPy ;)

smichr · 2021-10-24T00:31:24Z

Well, I am more in favor of it. And another place that would benefit from it is the test suite: let the formatting there be standardized. Once the test is written we don't spend a lot of time looking at it (unless we are trying to figure out why tests don't pass). And if we find that it is hard to read something there (and we really wish that object would look better, even after we copy and paste it to make it look less compact than black made it) it would suggest that a better printer is needed for the object.

smichr · 2021-10-24T09:22:30Z

Here's an example of how math gets reformatted:

     # check that we aren't off by 1
-    if (xpos/10**mag_first_dig) >= 1:
-        assert 1 <= (xpos/10**mag_first_dig) < 10
+    if (xpos / 10 ** mag_first_dig) >= 1:
+        assert 1 <= (xpos / 10 ** mag_first_dig) < 10

It's not that bad.

Here's an example of a denested calculation:

-            reps = dict(list(zip(free, [random_complex_number(a, b, c, d, ratio
nal=True)
-                           for zi in free])))
+            reps = dict(
+                list(
+                    zip(
+                        free,
+                        [
+                            random_complex_number(a, b, c, d, rational=True)
+                            for zi in free
+                        ],
+                    )
+                )
+            )

I'm not sure of the value that adds: matching brackets are highlighted by my editor, and reading left to right I can see the order of operations -- cascading it doesn't help me see anything different.

Install of black with pip was a Painless Installation Process.

ThePauliPrinciple · 2021-10-24T10:38:59Z

Is it also possible to make this automated styling optional for the contributor? That way, the people who enjoy automated styling at least can make use of it (without having to set it up themselves from scratch), while the process does not change for those who don't. Perhaps with some more specifics and testing, people can more easily come to an agreement whether this should then be enforced or not.

redeboer · 2021-10-25T19:54:23Z

The main point of formatting is that the coding style becomes more recognisable (this has subtle but large benefits to readability), that it improves diffs (code review, fewer conflicts in merge requests), and that it makes developing easier (given the right setup). All that gets broken if there are exceptions to the rule.

That said, it's not that straightforward to apply formatting to the entire framework. Some thoughts:

Best would be if some bot performs all formatting at once. That way, there is no individual who takes all the code changes as a single "contribution" to the framework.
This will still leave a lot of work: I imagine there are several places where code doesn't look as nice anymore, especially when it comes to math.
Intermediate idea: enforce pre-commit with a formatting ook over the diff of each PR (here, a bot could perform any required fixes as well btw, although it there is still the chance that contributors 'take' the formatting beforehand). That way, formatting becomes enforced gradually and you can discuss whether formatting should be reverted for certain blocks (with those # fmt comments).

As for pre-commit, see #17287 (comment)

asmeurer · 2021-10-27T04:26:32Z

It's not that bad.

But noticeably worse than what it was before (it's actually much worse in general IMO because it wants to put spaces around every operator). The point of a new tool/development process should be that it produces an improvement. I don't see the benefit of adding something if it's "only slightly worse".

Is it also possible to make this automated styling optional for the contributor?

It already is. You can style your code however you want, as long as it's reasonable, i.e., follows basic PEP 8 (although if you put spaces around *, /, and ** we will probably complain about that in review). If you want to do that via some tool that formats only the code that is modified in a specific commit (I believe such tools exist), that's your prerogative.

redeboer · 2021-10-27T08:41:27Z

I'm not sure of the value that adds: matching brackets are highlighted by my editor, and reading left to right I can see the order of operations -- cascading it doesn't help me see anything different.

When code gets formatted with a lot of indentation (due to a max line size), it's almost always an indication that there is too much going on within one line. Ideally, as a reader of the code, you shouldn't have to try to match tons of brackets. Defining some intermediate variables does the trick: it not only improves the formatting, but also the readability of the code. In other words, the formatting helps writing better code as well.

oscarbenjamin · 2021-10-28T00:16:06Z

Defining some intermediate variables does the trick: it not only improves the formatting, but also the readability of the code. In other words, the formatting helps writing better code as well.

Does black introduce intermediate variables for you? Or does it just reflow the code and make it look worse so that you feel an increased need to introduce those intermediate variables?

As I said above most formatting problems that we have come from contributors who seem not to pay any attention to formatting. For those contributors an autoformatter only helps if it actually does everything automatically.

redeboer · 2021-11-05T16:09:50Z

Does black introduce intermediate variables for you? Or does it just reflow the code and make it look worse so that you feel an increased need to introduce those intermediate variables?

The latter. This is not specific to black, it's a side effect of formatting with a certain line width. (And most modern IDEs can help you do the former for you.)

As I said above most formatting problems that we have come from contributors who seem not to pay any attention to formatting. For those contributors an autoformatter only helps if it actually does everything automatically.

Incidentally, there was a talk related to this on GitHub Universe last week:
https://www.githubuniverse.com/2021/session/693865/the-readme-project-presents-code-review-is-too-late-for-code-quality
Redundancy of style guides, preference for formatters over linters, and running your checks as early as possible (locally, or even while editing) were all mentioned there.

smichr · 2021-11-05T19:46:19Z

Could it work to undo, with a post-processor, changes that are not desired from black:

condense denesting
close up *, **, %

I am ambivalent towards quotes: I like single quotes because they appear less busy to me. black likes to use them only when necessary.

I would prefer to enforce strict line length and not allow it to continue past 72.

oscarbenjamin · 2021-11-06T00:32:40Z

One way to take this forward would be to open a demonstration PR for a substantial part of the codebase e.g. sympy/solvers that shows what the effect of applying black is. The discussion here has glossed over this but the main objection is about the problem that autoformatters can cause for mathematical expressions: #17287 (comment).

redeboer · 2021-11-06T12:09:16Z

One way to take this forward would be to open a demonstration PR

Good idea, see #22434.

redeboer · 2021-11-06T12:16:00Z

I am ambivalent towards quotes: I like single quotes because they appear less busy to me. black likes to use them only when necessary.

See ff2c81c and follow-up commit

maciejskorski · 2022-11-14T18:09:36Z

As I said above most formatting problems that we have come from contributors who seem not to pay any attention to formatting

I disagree, because the solution is not to blame junior developers, but rather to take systematic actions on clarifying rules.
To avoid the time wasted due to opinionated comments "your line is too long", I would propose to align on basic rules such as "max line length".

oscarbenjamin · 2022-11-14T18:34:26Z

the solution is not to blame junior developers

The point is not about blaming anyone. Rather I don't think that any automated tool can teach someone to pay attention to style and readability which are not things that can be expressed purely in terms of simple rules. If someone is not actively thinking about the quality of their code then requiring them to conform to the rules of a linter or to use an autoformatter does not mean that their code will become more readable.

jksuom · 2022-11-14T18:48:57Z

https://peps.python.org/pep-0008/#maximum-line-length

oscarbenjamin · 2023-03-13T11:41:01Z

I just tried black sympy and:

$ git diff --stat
...
 1344 files changed, 273853 insertions(+), 135403 deletions(-)
$ git ls-files sympy | xargs cat | wc -l
831363

So there are 700k lines of code in the sympy directory and black wants to change 150k of them adding another 150k lines in the process.

Above it was suggested that we could allow these changes to be applied gradually by only reformatting files that are edited in each PR. This can be done using pre-commit as in gh-24908 however it is a bad idea. The diff between the codebase and what black will generate is too large. Every PR would be full of formatting changes making it hard to see the substance of what is meaningfully changed. If there are going to be 400k lines of formatting diff then that needs to be reviewed separately from PRs that make actual changes to the code.

There are lots of changes in the diff that I would be okay with:

Line spacing. Currently black wants to add (9000) and remove (4000) a bunch of blank lines.
Using double quotes. I really don't care about which quote characters are used. I would be happy to have a formatter normalise these for consistency.
PEP8 style spacing like one space after a comma and spacing before a comment. (Not spacing around operators though because black at least does not handle this well.)
Expanding __all__ to one entry per line.
Putting things that fit on one line just on one line.
Normalising and expanding import statements.

These make up a good chunk of the changes that black would apply. If anyone wants to submit PRs that change some of those things then I would be happy to review them. It is important that such a PR be simple to review though which means that it must only change precisely one kind of thing even if it is applied broadly over the codebase. No one is going to review a PR that has the diff I am currently looking at after running black on the entire codebase and just running black on modules one at a time does not make it easier to review in the end. A PR that only adds and removes blank lines is easy to review though even if the diff does have thousands of lines and touches many files.

The part about what black does to mathematical expressions mostly applies to the test files but perhaps those could be excluded from autoformatting. Really though any formatter would need to be something that accepts mathematical expressions in the way that people would want to write them because there are so many in the codebase and they do need to be readable in so far as possible. The main problem is black's insistence on spaces around *, / and **/ Standard mathematical spacing goes around + and relational operators like =:

p = a*x**3 + b*x**2 + c*x + d

Black would previously change that to:

p = a * x ** 3 + b * x ** 2 + c * x + d

I complained about this in a black issue (psf/black#538 (comment)) that is now considered to be fixed because now black gives:

p = a * x**3 + b * x**2 + c * x + d

This is better then before but still not ideal because spacing around * is often bad for readability.

I would like to enforce something around line length in the codebase but I don't like how black reflows long lines so I don't see it as a good solution. The example highlighted by @smichr is a good one:

         if free:
             from sympy.core.random import random_complex_number
+
             a, c, b, d = re_min, re_max, im_min, im_max
-            reps = dict(list(zip(free, [random_complex_number(a, b, c, d, rational=True)
-                           for zi in free])))
+            a = 2
+            reps = dict(
+                list(
+                    zip(
+                        free,
+                        [
+                            random_complex_number(a, b, c, d, rational=True)
+                            for zi in free
+                        ],
+                    )
+                )
+            )
             try:
                 nmag = abs(self.evalf(2, subs=reps))
             except (ValueError, TypeError):

Maybe this sort of thing won't come up very often but it shows how the output of black cannot be simply accepted without argument. The proper reformat here just needs to introduce a new variable but black won't do that:

        nums = [random_complex_number(a, b, c, d, rational=True) for zi in free]
        reps = dict(list(zip(free, nums)))

Actually the call to list is unneeded and anyway it would be clearer to have a function for the expression and use a dict comprehension:

        randnum = lambda: random_complex_number(a, b, c, d, rational=True) 
        reps = {sym: randnum() for sym in free}

In fact black will also accept this being one line:

        reps = {sym: random_complex_number(a, b, c, d, rational=True) for sym in free}

These forms make it clearer what reps actually is. Maybe black or something else can help with cases like this in the sense that it prompts a contributor to rewrite the code in a better way (by first rewriting it horribly). I expect that if it is applied automatically though then many beginner contributors will have PRs that have some very strangely formatted things in them.

In any case if anyone wants to advance the idea of using an autoformatter then what is needed is for there to be a formatter that does not immedaitely apply a massive diff so the distance between the codebase and what a formatter would accept needs to be reduced. Getting to the point of being able to run flake8 on the codebase required many PRs changing thousands upon thousands of lines and the same is true here.

MartinThoma · 2023-03-13T13:07:03Z

There is a new kid on the block: https://github.com/charliermarsh/ruff

ruff contains a lot of flake8 and flake8 plugin rules: https://beta.ruff.rs/docs/rules/

If possible, they contain auto-fixes (can be applied with --fix / indicated in the list with the 🛠 icon).

I haven't checked, but maybe ruff contains some rules that cover black changes.

oscarbenjamin · 2023-03-13T13:27:33Z

There is another issue discussing usage of ruff: gh-24508

asmeurer · 2023-03-13T19:59:17Z

I would like to enforce something around line length in the codebase but I don't like how black reflows long lines so I don't see it as a good solution. The example highlighted by @smichr is a good one:

In my experience with seeing other codebases formatted with black, it does this a lot. If something doesn't fit on one line, it splits into 12 lines. The most common thing you see it with is function definitions, where you regularly get things like

def f(
    x,
    y,
    ...
):

especially when there are type annotations. The __all__ thing is another example of this. Maybe in that case it actually makes sense to have one per line, although honestly as long as we have flake8 checking the contents of __all__ it doesn't really matter how it is formatted.

@MartinThoma may have a point with ruff. If we can fix individual things we like with separate ruff fixes, that may be better than applying black wholesale. And an automated fixer is less likely to make a mistake in the process.

There are lots of changes in the diff that I would be okay with:

My issue is that most of these things don't really matter. Why is it even worth bothering to normalize quote characters or the number of blank lines? It doesn't really matter if sometimes a blank line is used between functions and sometimes two blank lines, or sometimes a string uses single quotes and sometimes double quotes. Those kinds of things are so trivial that they don't even affect readability.

I still maintain that this whole discussion just feels like a complete waste of time.

oscarbenjamin · 2023-03-13T21:32:47Z

I generally dislike the way that black/yapf reflow long lines with things like list comprehensions or nested data structures but actually this one I find okay:

def f(
    x,
    y,
    ...
):

The example seems ridiculous but that's because if it really was just f(x, y) then it wouldn't be rewritten like this. For this function signature to be reflowed the function must take a lot of parameters and really it is easier to read if those are all on different lines especially when there are defaults and type annotations.

as we have flake8 checking the contents of __all__ it doesn't really matter how it is formatted

Certain things like __all__ and imports are better split out onto separate lines to reduce merge conflicts and make diffs easier to understand.

It doesn't really matter if sometimes a blank line is used between functions and sometimes two blank lines, or sometimes a string uses single quotes and sometimes double quotes. Those kinds of things are so trivial that they don't even affect readability.

The premise of using autoformatters is not necessarily that there is a best way to do things but precisely the fact that many of these things usually don't matter so they should just be normalised automatically without discussion leaving everything in a consistent style. Autoformatting in particular with pre-commit makes it easy for someone to move between different codebases and not need to learn any new formatting rules.

My issue is that most of these things don't really matter. Why is it even worth bothering to normalize quote characters or the number of blank lines?

The value is in getting to the point where you can use an autoformatter so that formatting is then to a large extent automatic. That is useful because aside of consistency it is then easy for everyone to understand formatting rules and to have them applied without manual tweaking or responding to review comments.

oscargus added the Code quality label Jul 28, 2019

moorepants closed this as completed Jul 28, 2019

moorepants reopened this Apr 26, 2020

MartinThoma mentioned this issue Apr 28, 2020

STY: Remove trailing whitespace #19214

Merged

iammosespaulr mentioned this issue May 3, 2020

STY: Remove trailing whitespace #19249

Merged

redeboer mentioned this issue Oct 13, 2021

Added pre-commit config file #22266

Closed

smichr mentioned this issue Nov 6, 2021

black demonstration #22432

Closed

redeboer mentioned this issue Nov 6, 2021

[Demo PR] Apply black formatting to sympy/solvers #22434

Closed

oscarbenjamin mentioned this issue Nov 6, 2021

Allow no space between operator ** psf/black#538

Closed

redeboer mentioned this issue Nov 6, 2021

[Demo PR] Format sympy/solvers with yapf #22437

Closed

oscarbenjamin mentioned this issue Nov 14, 2022

enforce lines formatting #24261

Closed

oscarbenjamin mentioned this issue Mar 13, 2023

Pre commit pipeline #24908

Open

oscarbenjamin mentioned this issue Mar 18, 2023

fast series reversion #24253

Open

Unified coding (spacing) style for sympy. #17287

Unified coding (spacing) style for sympy. #17287

Comments

stevenleeS0ht commented Jul 28, 2019 • edited

oscargus commented Jul 28, 2019

moorepants commented Jul 28, 2019

stevenleeS0ht commented Jul 29, 2019 • edited

moorepants commented Jul 29, 2019

oscargus commented Jul 29, 2019

MartinThoma commented Apr 26, 2020

moorepants commented Apr 26, 2020 • edited

moorepants commented Apr 26, 2020

moorepants commented Apr 26, 2020

MartinThoma commented Apr 26, 2020

moorepants commented Apr 26, 2020

oscarbenjamin commented Apr 26, 2020

smichr commented Apr 26, 2020 • edited

MartinThoma commented Apr 28, 2020

asmeurer commented Apr 28, 2020

oscarbenjamin commented Apr 28, 2020

redeboer commented Oct 13, 2021 • edited

oscarbenjamin commented Oct 13, 2021

asmeurer commented Oct 14, 2021 • edited

redeboer commented Oct 14, 2021

oscarbenjamin commented Oct 14, 2021 • edited

oscarbenjamin commented Oct 14, 2021

redeboer commented Oct 14, 2021

smichr commented Oct 22, 2021

redeboer commented Oct 22, 2021

smichr commented Oct 24, 2021

smichr commented Oct 24, 2021

ThePauliPrinciple commented Oct 24, 2021

redeboer commented Oct 25, 2021 • edited

asmeurer commented Oct 27, 2021

redeboer commented Oct 27, 2021

oscarbenjamin commented Oct 28, 2021

redeboer commented Nov 5, 2021

smichr commented Nov 5, 2021

oscarbenjamin commented Nov 6, 2021

redeboer commented Nov 6, 2021

redeboer commented Nov 6, 2021 • edited

maciejskorski commented Nov 14, 2022 • edited

oscarbenjamin commented Nov 14, 2022

jksuom commented Nov 14, 2022

oscarbenjamin commented Mar 13, 2023

MartinThoma commented Mar 13, 2023

oscarbenjamin commented Mar 13, 2023

asmeurer commented Mar 13, 2023

oscarbenjamin commented Mar 13, 2023

stevenleeS0ht commented Jul 28, 2019 •

edited

stevenleeS0ht commented Jul 29, 2019 •

edited

moorepants commented Apr 26, 2020 •

edited

smichr commented Apr 26, 2020 •

edited

redeboer commented Oct 13, 2021 •

edited

asmeurer commented Oct 14, 2021 •

edited

oscarbenjamin commented Oct 14, 2021 •

edited

redeboer commented Oct 25, 2021 •

edited

redeboer commented Nov 6, 2021 •

edited

maciejskorski commented Nov 14, 2022 •

edited