Draft OEP-33: Code Autoformatting #107

cpennington · 2019-03-04T20:37:24Z

No description provided.

UmanShahzad

Is this just to figure out the tools, or the important configuration pieces of the tools as well? Will we use their defaults? I assume this should be a centralized decision, otherwise how does each repo choose its specific config?

oeps/oep-0031-bp-code-autoformatting.rst

cpennington · 2019-03-05T14:51:11Z

Both of the suggested tools are pretty opinionated already, but if there are decisions to be made, we should standardize on them. I'll add this to the OEP, but any required configuration should be put into the edx-lint repo in order to centralize those decisions.

oeps/oep-0033-bp-code-autoformatting.rst

Co-Authored-By: cpennington <calen.pennington@gmail.com>

OmarIthawi · 2019-03-11T08:49:01Z

@cpennington thanks for sharing this OEP. I'm a huge fan of consistency and autoformatting. black seems like a good choice, and yes, we will save on review time.

However, this will probably hit me hard as a community member. I constantly spend good chunk of time viewing $ git blame on the Open edX project when debugging an issue. Such repo-wide changes will make my life a bit harder because of the sheer amount of the one-time git noise the reformatter produce.

I'd like to know if this is a final decision or not.

cc: @nedbat

@felipemontoya @sambapete what do you think?

cpennington · 2019-03-11T13:12:27Z

Yeah, that's a definite concern. I did a little looking, and there's a tool called git hyper-blame that professes to solve that problem. I'll put a link into the OEP, and a recommendation to add a reference to the commit that bulk-converts the repo into the hyper-blame ignore file.

OmarIthawi · 2019-03-11T13:53:30Z

Thanks @cpennington for pointing out to $ git hyper-blame. Support in GitHub and other tools, is naturally missing as it's a relatively advanced git tool.

But that's a lot better than not having it, so a link would be great.

cpennington · 2019-03-11T14:04:20Z

For github, when you're looking at the blame view, there's a relatively innocuous button that lets you "view blame prior to this change".

OmarIthawi · 2019-03-11T14:18:07Z

Yeah @cpennington I use that extensively, sometimes on a daily basis as a edX Platform fork maintainer.

Now I have to click twice to get what I want, but that's not a huge problem.

My guess that most engineers rely on manual debugging and eyeballing the code to detect and fix issues. I'm a bit on the other extreme of heavily relying on all of $ git blame (and sometimes PyCharm annotate), $ git bisect and of course automated tests to when debugging.

I wanted to add that, so you know why this change specifically affects my work a bit.

That's being said, we have $ git hyper-blame, I think it's a good change and we should go with it.

stvstnfrd · 2019-03-11T20:52:31Z

However, this will probably hit me hard as a community member.

I really can't underscore this enough...

What's the timeline for accepting this proposal?
I'll follow up with a more substantive reply in the next 24 hours :)

cpennington · 2019-03-12T00:36:55Z

The proposed deadline for resolution on the OEP (either accepting or withdrawing it) is April 1 (but no, this is not an April Fools joke). If there's still controversy swirling at that point, we can always extend debate.

cpennington · 2019-03-12T00:39:02Z

I'm surprised to hear that git blame is such an integral part of your everyday experience, to be honest. I wouldn't have expected the question of "who changed this last" to matter quite as much as you're suggesting it did (certainly not enough to outweigh having to wade through stylistic nitpicks).

bradenmacdonald · 2019-03-12T02:19:27Z

I wouldn't have expected the question of "who changed this last" to matter quite as much as you're suggesting it did

It's also an important tool for me (though I mostly use it via GitHub's UI). The term blame is kind of a joke, I think; I'm rarely asking the question "who changed this last." Instead, I use blame when code is confusing/under-commented/under-documented and I need to know what it's doing, or why. In that case, I'm asking "What commit introduced this mysterious line(s) of code, and what pull request was it part of?" The hope is that the commit message, pull request, and/or Jira tickets related to it will shed light.

cpennington · 2019-03-12T13:37:52Z

Ah, fair enough.

I've been trying to think of a way to have our cake and eat it too, but I can't really see a way to get autoformatting and at the same time avoid some level of bulk commit. For me, the trade-off in the direction of autoformatting seems clear, but it sounds as though you folks have a different experience.

OmarIthawi · 2019-03-12T13:52:54Z

@bradenmacdonald thanks for the detailed explanation, I pretty much need blame for the same purpose you do. Certainly the blame is a joke, but a funny one I think!

So it looks there's a shared work pattern between fork maintainers.

nedbat · 2019-03-12T15:31:38Z

oeps/oep-0033-bp-code-autoformatting.rst

+Consequences
+------------
+
+There will be commits to all OpenEdX repositories that cause large-scale


We should know how to spell Open edX by now :)

nedbat · 2019-03-12T15:33:36Z

oeps/oep-0033-bp-code-autoformatting.rst

+history. Users with auto-formatters configured in their editors working
+in a repo that is not yet OEP-33 compliant will need to disable their
+autoformatter, or risk large-scale changes that are unrelated to their
+current tasks. All OpenEdX code will be formatted consistently and


Do editors generally have a way to configure auto-formatting on a per-repo basis? Is that something we can configure centrally? This will help with gradual adoption of auto-formatting across Open edX. Perhaps more importantly, many developers work on non-Open-edX projects also, so universal auto-formatting is not reasonable.

I'm not sure about all editors, but VSCode has per-workspace (aka per-repo) settings that can disable autoformatting (or, conversely, could be used to enable autoformatting only on particular repos).

Many of my concerns can be alleviated if things were to be rolled out on a per-repo basis.

nedbat · 2019-03-12T15:35:48Z

My concern here is that it will add one more barrier to contribution and velocity.

Is the problem being solved here worth the cost of an even pickier linter for formatting? Can we do something to adopt auto-formatting without using it as a gate to contribution? For example, encourage people to use the formatter, but not use the formatter as a linter.

Can any of these formatters be run on a pull request to add a commit that formats the changed lines? That would give us the advantage of the formatting without forcing another cycle on a pull request.

cpennington · 2019-03-12T15:46:31Z

I'm not aware of any automatic way of running them on a PR, but in a repo where they have been set up, it would be as simple as: check out the repo, run make format, and commit the results.

cpennington · 2019-03-12T15:52:08Z

Is the problem being solved here worth the cost of an even pickier linter for formatting?

I think one of the main benefits of this, to me, is that it actually frees the developer from needing to know about picky formatting rules. We get both a) consistent code and b) more dev headspace, because we can say: The tool formats it the right way, as long as you've run the tool, your code is good to go.

cpennington · 2019-03-12T16:03:38Z

For example, encourage people to use the formatter, but not use the formatter as a linter.

One problem with that is that it makes @OmarIthawi, @stvstnfrd, and @bradenmacdonald's problem w/ the linter much more pronounced, because it makes the possibility of accidental drive-by-formatting fairly likely. I can't (right now) have black running in my editor when using edx-platform, because it will make any saved file significantly different, and will obscure my actual changes. Unless we enforce that all new code be autoformatted, we can't really use the autoformatter without drivebys.

davidjoy · 2019-03-12T17:17:27Z

(Update: think I'm reiterating a question from above.)

Just to clarify, is the intention of this OEP merely to prescribe which formatters should be used, but not which linting/formatting rules we want to enforce? Is settling on that list a separate effort?

Anecdotally, for JavaScript, I believe many of Prettier's default rules conflict with our current eslint-airbnb based linting that's used here and there. I'm assuming/hoping we'll adopt a set of rules (elsewhere) that cause as little churn in our existing codebase as we can manage.

(This is not intended as an invitation to have that highly subjective conversation! I'm really hoping this is just about tools and not rules.)

cpennington · 2019-03-12T18:44:03Z

I would like to standardize on tools, and ideally on tools that are as opinionated as possible. If there are specific rules that need to be configured, the configuration should be put into the edx/edx-lint repo (which I think I specified in the OEP).

stvstnfrd · 2019-04-01T20:00:23Z

@cpennington Sorry I haven't circled back here; things have been hectic with the conference :)

If you don't mind, I'd still like to weigh in here, I'll start typing up some follow-up thoughts.

cpennington · 2019-04-02T18:36:39Z

@stvstnfrd, yup, happy to wait for your feedback.

cpennington · 2019-04-08T14:34:41Z

@stvstnfrd any progress on getting your follow-up thoughts in a readable form?

cpennington · 2019-04-09T13:43:55Z

@davidjoy: Does https://github.com/paulolramos/eslint-prettier-airbnb-react seem like the right thing to get prettier, eslint, and the airbnb style guide to play nicely together? I'm also ok w/ leaving that specific combination unspecified for now until we have a repo or two that we're willing to try it on. (Or to try it on a repo or two as a local decision before making it global)

nedbat · 2019-04-09T16:59:41Z

I guess I remain unconvinced that consistent formatting is a serious issue for us. Many of us are working hard to remove barriers to contribution, to ease the on-ramp, and to simplify the process of getting code merged. This will make it harder.

cpennington · 2019-04-09T17:44:34Z

@nedbat: Would your concerns be satisfied/mollified if we had tooling that automatically applied the autoformatters if there were any changes that were needed (and then either suggested it as a commit, or added it as a commit to the PR?)

To be clear, if that's the bar that we'd need to reach, I'm not sure if it would be worth the effort or not, but I'm trying to understand what the level of effort is.

cpennington · 2019-04-09T18:34:31Z

I talked with @estute, and he suggested that a similar jenkins strategy to that being used for automating make upgrade could be used for automating make format on PRs as well.

nasthagiri · 2019-04-10T00:32:57Z

oeps/oep-0033-bp-code-autoformatting.rst

+
+In short, more consistent, less ambiguous standards for code formatting
+make for more productive code reviews and waste less time on inconsequential
+details.


Have there been recent incidents where inconsistent formatting has resulted in longer PR reviews? I have not found this to be a major issue in my own personal experience to-date working on our platform.

Personally, I run into pylint violations that are simply a question of spacing/indentation fairly frequently. I also often refrain from commenting on formatting that would make the code more consistent because it's not something that I want to have another review cycle on. Maybe it's just me that's bothered by inconsistent code formatting?

Ah, right. I've also run into pylint violations, complaining of extra spaces that it doesn't allow. We could also just update our linter to ignore that violation if it's not necessary to enforce.

My view is that there is value in consistency and in the spacing guidelines that pylint suggests, from a readability perspective, and that there isn't much value in making engineers actually enforce it themselves, since tooling can do that job faster and more effectively.

Thanks @nasthagiri, great question. Yes, formatting and CI in general has been a painpoint for me while contributing to Open edX.

Despite its great value, it still puts a lot of unrelated issues to the code that I write -- say I edited line no. 1, it would complain about line no. 100. This is especially true for home-grown linters such as the XSS linter.

I'm not saying that we should loosen and "de-regulate" Open edX quality, but at the same time please don't underestimate the effort that infrequent contributors have to go through.

This is especially true for the edx-platform repository.

I have learned my way through the process, but I find it hard to recruit other open source contributors whether from my company or from the community.

I have a separate OEP (#108) that I want to use to cut off the accidental flagging from the XSS Linter. In general, I'm all for making it so that the only things that actually get flagged on a PR are substantive issues in code that you actually wrote in the PR.

Thanks @cpennington! That would be great. Afterwards, we'd only be left with the inevitable cost of dealing with the CI on such a monolithic piece of code -- the edx-platform. As a long time (sine 2014) but infrequent contributor, I do see the value of the CI, but also don't want to undermine the cost of it to me as a contributor.

Therefore, I think we should look at such initiatives a bit more critically that we're doing now -- which is mostly focusing on the pro's of adding yet another step.

Yes, the impact on time-to-feedback for edx-platform specifically is a good question.

On my machine (not devstack), w/ no black cache, it takes 10 minutes to format all of edx-platform. After reformatting everything, it takes 13s to reformat/check again.

After I reformat everything, and then delete the cache, a full format takes 5 minutes (and again, once cached, takes about 13s).

So, that suggests that a) we want to make sure jenkins keeps that cache around, and b) that devstacks are pre-built with the cache so that running black inside devstack to do a full-format (for someone w/o editor integration) is fast enough to not be a blocker.

Sigh... turns out that black needs python 3.6 to run, which we don't have installed on devstack... So, we'd need some more investment to make edx-platform work.

I also often refrain from commenting on formatting that would make the code more consistent because it's not something that I want to have another review cycle on. Maybe it's just me that's bothered by inconsistent code formatting?

Internally at Stanford, I enforce this fairly rigidly, through sheer force of will :)
That doesn't always scale though...

cpennington · 2019-04-10T12:18:31Z

@nedbat (and others): I'd like to propose making OEP provisional until we've had a chance to try it out on a few repos to see how well it works. In particular, I'd like to start with XBlock and ecommerce (both of which are low-but-not-zero traffic), and also perhaps @davidjoy could suggest a front-end repo to test out prettier integration on that is in a similar vein? Then, based on the experiences in those repos, we can decide on whether to roll this out more broadly?

nasthagiri · 2019-04-10T12:31:30Z

@cpennington I like the idea of keeping this Provisional and initially testing in smaller/less-busy repos. Although it's still unclear whether there's a systemic problem worth solving here.

Including front-end repos may not be necessary at this time, since this has not come up in any recent FEDx meetings that I've attended. We have larger problems around dependency management that may be more worthwhile to focus on.

cpennington · 2019-04-10T15:58:12Z

I've actually heard from the FEDx group that they discussed prettier in the past, but that there were some folks that had reservations because they wanted to learn what proper formatting was, rather than using a tool for that.

stvstnfrd · 2019-04-11T07:58:35Z

oeps/oep-0033-bp-code-autoformatting.rst

+the code. As such, linters that can be run locally should be preferred.
+Similarly, ambiguity in style guidelines should be minimized, so that there
+are fewer places for commentary around what amounts to personal preference
+(which might then instigate unnecessary debate).


stvstnfrd · 2019-04-11T08:06:23Z

oeps/oep-0033-bp-code-autoformatting.rst

+
+The current recommended formatters are:
+
+    - Python: `black`_


Last I looked into it, Black was too opinionated and barely, if at all configurable.
I found it too rigid. This would seem to maximize the amount of the codebase to be re-written,
which I'd like to avoid, per reasoning elsewhere in thread..

So, opinionated is actually the goal, here. Whatever formatter is chosen should format any given code construct identically no matter how it's originally input. I think the pain of the rewrite is going to be there no matter how big or small it is (there's gonna be one big PR against the repo either way).

So, opinionated is actually the goal, here.

My hangup here isn't being opinionated;
it's being "poorly" opinionated.

I can rerun my test cases again if you'd like the specific violations,
but as mentioned, when I tried this on our code, black mandated changes that
I considered "wrong".

There where specific style patterns that we chose intentionally to make our fork more manageable, but we were no longer able to use these patterns under black
That's why I want flexibility/configuration; so we can forgo any overbearing rules.
Otherwise, we're getting hit on both the front and back ends;

the upfront pain of the initial translation

the ongoing cost of not being able to write code as we see fit for our fork

I'm not suggesting flexibility for the sake of endless and ongoing style debates
or a needlessly iterative process, but I don't want to be backed into a corner.
I want options if needed.

In RFC terms, think
MAY reconfigure SOME options
not
WILL reconfigure MOST/ALL options

Whatever formatter is chosen should format any given code construct identically no matter how it's originally input.

Agreed.

I think the pain of the rewrite is going to be there no matter how big or small it is

It's really a matter of perspective...
For edx, yeah, it's going to a pain so a single shot makes it easier.
Downstream, consider the following:

most side repos will be fairly painless for us;
platform is my major concern re:transition

So if you enable it for new directories, there's no pain downstream (or upstream, for that matter)

Now when it comes to running it against existing platform code, consider:

single shot: for you

run the script

commit the code

move on

single shot: for us

merge the code

spend weeks debugging/resolving conflicts because every code file in the platform has been modified

hope we got it all

run the script against our code now

commit the code

move on... to significant regression testing because we didn't just run an autoformating tool, we ran an autoformating tool, reworked virtually all of our custom integrations in every code file, then ran an autoformating tool again. It'll be more than a chore.

versus

staged roll out: for you

run the script against a subset of additional directories

commit the code

repeat each named release until complete

staged roll out: for us

merge the code

spend a much smaller amount of time resolving format-conflicts in the handful of newly-formatted subdirs

feel more confident that we got everything in the now smaller surface area

defer running the script against our code now, so that we can deal w/ it out-of-band of this upgrade

spend significantly less time debugging and regression testing since the whole platform didn't change out from us over night, which allows us to get the named release live and not fall further behind, allowing us to

repeat each named release until complete

I concede, single shot is easiest for you, it's one run of an automated tool.
Just consider how much more it ends up being for anyone with non-trivial platform modifications.
While a staged roll out would yield fewer commits, I'm not sure that's a selling point for us.
One commit Auto format isn't a practical difference for us versus multiple Auto format 1/N.

And a staged roll out let's us absorb those hits over time, ideally over the course of named releases. When there have been sweeping changes to the edx-platform codebase in the past,
having to "eat it" all at once basically drops our integration velocity down to zero for an extended period of time.

My hangup here isn't being opinionated;
it's being "poorly" opinionated.

Yeah, that's fair enough. Like I said, I hadn't run into places where I objected w/ Black's formatting, so it would be helpful to me (while making a recommendation in this OEP) to see your specific counterexamples.

Just consider how much more it ends up being for anyone with non-trivial platform modifications.
While a staged roll out would yield fewer commits, I'm not sure that's a selling point for us.
One commit Auto format isn't a practical difference for us versus multiple Auto format 1/N.

And a staged roll out let's us absorb those hits over time, ideally over the course of named releases. When there have been sweeping changes to the edx-platform codebase in the past,
having to "eat it" all at once basically drops our integration velocity down to zero for an extended period of time.

I'm definitely willing to defer to your experience on this. It shouldn't be too much trouble to set up our integration tooling in a way that we can mark specific directories as having been blackened, and slowly expand that coverage. It's a bit cumbersome w/ editor auto-integration, but perhaps worth it. What pace of conversions is something that you think you and other integrators could manage? You mentioned being over the course of multiple open-edx releases, with puts the timeframe in months or years, which is a bit unfortunate from my POV.

stvstnfrd · 2019-04-11T08:15:18Z

oeps/oep-0033-bp-code-autoformatting.rst

+This will allow the `git hyper-blame`_ tool to ignore them by default,
+to preserve the easy ability to see who changed particular file lines.
+
+.. _git hyper-blame: https://commondatastorage.googleapis.com/chrome-infra-docs/flat/depot_tools/docs/html/git-hyper-blame.html


While this seems interesting, I don't think it helps for many use cases.
Does Github suport it?
I know that my text editor integration (vim-fugitive) explicitly doesn't support it now [1].
I'm worried the impact here would be small.

[1] Feature request: git hyper-blame tpope/vim-fugitive#1058

Yeah, Github doesn't support it. I'm willing to include it if people want it (or even if it would be marginally useful), because it isn't a big lift to add (but I agree, it's also doesn't solve a lot of use-cases).

Yeah, I guess I'm not saying don't mention it, as much as I wanted to be clear that it shouldn't be interpreted as end-all solution.

stvstnfrd · 2019-04-11T08:27:28Z

oeps/oep-0033-bp-code-autoformatting.rst

+history. Users with auto-formatters configured in their editors working
+in a repo that is not yet OEP-33 compliant will need to disable their
+autoformatter, or risk large-scale changes that are unrelated to their
+current tasks. All OpenEdX code will be formatted consistently and


Many of my concerns can be alleviated if things were to be rolled out on a per-repo basis.

stvstnfrd

@cpennington Thanks for waiting :)

tl;dr: need more info

I'd like to see a little more regarding choice of tools, pros/cons,
possible vs proposed impact to codebase, etc.

Background, Stanford

Consistency is king.

I should start by saying,
I'm a huge proponent of consistently formatted code.
Internally, we follow a fairly rigid style.
So I want to emphasize, I support the general idea here;
my concerns lie largely in planning/implementation.

Beyond the normally-touted benefits (ease of readability, onboarding,
etc.), there are a few stylistic decisions that, when enforced
consistently, make our (read: Stanford and/or the community) lives all
the easier (eg: always trailing commas and never singleline docstrings;
these lead to an easier to manage diff).

A desire to autoformat code is one of the motivations behind moving our
custom code into the openedx/stanford namespace [1]; the idea is/was
to run our own, more aggressive/opinionated, autolinter against this
subdirectory. So I started poking around the space, but hadn't yet
implemented anything on our end. Anyway...

Concerns, merge conflicts

Most immediately, I'm concerned with code "churn" and what that will
mean for conflict resolution on upcoming merge(s). If the number of
modified files and/or the number of updated lines per file increase,
this increases the surface area for potential conflicts. Even though our
merges (of named releases) have been getting progressively easier, I
would expect this to be one of our worst to date :\

Pros:

one time cost(?)

Cons:

a very heavy cost, downstream

Concerns, commit history

There's also the issue of lost history/blame.
Yes, I know how to trace back through multiple levels of commit history;
we do it often enough. But it's a pain. And some things are already
multiple levels deep; burying many more another layer deep just seems
frustrating.

And just to point out, we use git blame (and log) rather extensively.
Especially living downstream on a fork, it's often the only insight we
have into changes. Still too often, referenced JIRA tickets are private,
or little public info is available. So having quick access to
history really goes a long way..

This compounds for us during merges.. The history helps balance our
changes with "yours". So then when we merge, we're going to have more
conflicts, which means more "questions" running blame, which requires
an extra blame now... Turtles, all the way down...

Concerns, onboarding

I agree with @nedbat's concerns re:onboarding.
Whatever we do, it should be easy, if not obvious for people.

Would progressively stricter linting requirements cover this?
Then the existing quality tools (diff-quality, etc.) could take effect
and just enforce higher standards.
Even if you didn't roll out automatic formatting, if the Jenkins quality
build fails ("inconsistent quote usage" or whatever), it ought to be
obvious how to format the new code.

Aside: I've not done much w/ git-hooks; are those auto-magic enough?

Concerns, implementation impact

Given the described potential impact (conflicts, history), I'd like to
get a sense of what the impact from the various tools would be,
eg: would black modify 3x as many lines as yapf, etc.
This could weigh into tool decision.

Also, what are the differences in configuration options between the
tools? If there are few options, we risk a single, massive rewrite. If
there are many options, we have the option of going live with a
minimally invasive change. Optionally, this configuration could grow
more strict over time.

Spoiler: I'd like to see us go w/ the most configurable (re:rules) tools.

Specifically, I remember the last I looked into this, the issue wasn't
just that black significantly rewrote things, but that it had undone
intentional changes we had made to improve maintainabilty on our fork :\

Proposals

Whatever decisions are made (linter, rules, etc.) roll it out
progressively, strict for new repos, then start with smaller repos and
work your way in.
Leave edx-platform for last.

When you get to platform, consider enabling it only for new apps/dirs to
start. Hopefully more and more of the existing code will get moved out
of the core anyway...

But I would like to see us use a configurable tool and further discuss
which rules to enforce or not :)

Happily tag me on any WIP implementations;
I'll be faster next time ;)

[1] https://github.com/Stanford-Online/edx-platform/tree/master/openedx/stanford

cpennington · 2019-04-11T14:05:11Z

Concerns, merge conflicts

Most immediately, I'm concerned with code "churn" and what that will mean for conflict resolution on upcoming merge(s). If the number of modified files and/or the number of updated lines per file increase, this increases the surface area for potential conflicts. Even though our merges (of named releases) have been getting progressively easier, I would expect this to be one of our worst to date :\

Yeah, I agree, inflight code is going to be tricky. I think there's probably a strategy out there for larger changes that are amenable to rebasing that goes something like:

Rebase right before the reformat commit, resolving all actual code conflicts
Rebase on top of the reformat commit, taking the rebased code as gospel, and then re-run the formatter.
Rebase onto the tip of master, resolving all actual code conflicts.

Small changes can probably just be rebased onto the tip of master. Large changes that aren't rebasable are likely to be a bit of a one-time headache, though, I agree.

Concerns, commit history

There's also the issue of lost history/blame. Yes, I know how to trace back through multiple levels of commit history; we do it often enough. But it's a pain. And some things are already multiple levels deep; burying many more another layer deep just seems frustrating.

I think this is one argument for doing as much of the changes in a big-bang as possible. The fewer repo-spanning commits we have, the cleaner the log is going to be. If we are incrementally adding more rules, or incrementally covering more subfolders, than git-log is going to have a commit for each and every one of those formatting commits, instead of a single one-and-done.

Concerns, onboarding

I agree with @nedbat's concerns re:onboarding. Whatever we do, it should be easy, if not obvious for people.

Yes, I 100% agree. One of my goals for this is to make it as painless as possible for a new person (so, the autoformatter should be able to run out-of-the-box on devstack, and ideally, we'd be able to automate tooling so that OSPRs (or possibly all PRs) are offered a commit that just autoformats anything that's incorrect.

Would progressively stricter linting requirements cover this? Then the existing quality tools (diff-quality, etc.) could take effect and just enforce higher standards. Even if you didn't roll out automatic formatting, if the Jenkins quality build fails ("inconsistent quote usage" or whatever), it ought to be obvious how to format the new code.

I really want to reduce/eliminate linting for changes that can be automatically applied. From our experience w/ edx-platform, the gradual-reduction strategy really just doesn't seem to work. (I'm actually proposing #108 on a separate front to attempt to make the linting process less bad). As I mentioned elsewhere in this PR, I actually really hate having linters that tell me about minor consistency issues, given that we could be using a tool that simply does the right thing automatically.

Aside: I've not done much w/ git-hooks; are those auto-magic enough?

I'd love to have a pre-commit hook that does this fix (Black has recommendations on how to do that: https://black.readthedocs.io/en/stable/version_control_integration.html). However, there's no way for a git repo to force a user to install commit hooks (for a good reason), so we'd be adding a separate hurdle for new committers. I'm open to the suggestion that we should at least set up the codebase for it, though, to have another avenue for "don't make me think".

Concerns, implementation impact

Given the described potential impact (conflicts, history), I'd like to get a sense of what the impact from the various tools would be, eg: would black modify 3x as many lines as yapf, etc. This could weigh into tool decision.

Also, what are the differences in configuration options between the tools? If there are few options, we risk a single, massive rewrite. If there are many options, we have the option of going live with a minimally invasive change. Optionally, this configuration could grow more strict over time.

As I've mentioned elsewhere, I a) don't want to have options so that we don't have to have arguments about the options, and b) want to have a big-bang per repo as much as possible in order to minimize history issues. I think any autoformatting strategy that is gradual will just produce more overall pain because it will produce multiple repo-spanning commits, which all of their inherent problems.

Specifically, I remember the last I looked into this, the issue wasn't just that black significantly rewrote things, but that it had undone intentional changes we had made to improve maintainabilty on our fork :\

Do you have any examples there? I've been pretty happy w/ the way that black lays out code, so I'd love to hear about possible gotchas.

Proposals

Whatever decisions are made (linter, rules, etc.) roll it out progressively, strict for new repos, then start with smaller repos and work your way in. Leave edx-platform for last.

I agree w/ an incremental rollout strategy, as long as we're rolling out to each repo in one fell swoop.

When you get to platform, consider enabling it only for new apps/dirs to start. Hopefully more and more of the existing code will get moved out of the core anyway...

I think I mentioned elsewhere that this is going to add more commits w/ possible merge implications, so I'd rather do a global change.

cpennington · 2019-04-24T18:30:33Z

One problem that I ran into as I tried to run black on xblock: psf/black#195. In particular, pylint stay at the end of a block, rather than staying with the beginning of a block when the block is split onto multiple lines.

jmbowman · 2019-05-28T15:19:32Z

oeps/oep-0033-bp-code-autoformatting.rst

+    [testenv:format]
+    basepython =
+        python3.6
+    deps=black


From an OEP-18 perspective, this should use a requirements file generated by pip-compile instead (so we get version consistency and regular updates). I realize this is just an example, but I don't want to encourage people to copy this pattern verbatim.

tuchfarber · 2021-01-13T17:14:44Z

Though this is two years old, I wanted to note that we've added automatic code formatting (via black) for the edx/credentials repo.

stvstnfrd · 2021-09-10T19:32:06Z

Given the age, I'm inclined to close this for now (but will leave the branch).
Re-open, if needed (but hopefully not 😉 )

UmanShahzad reviewed Mar 5, 2019

View reviewed changes

oeps/oep-0031-bp-code-autoformatting.rst Outdated Show resolved Hide resolved

Draft OEP-33: Code Autoformatting

ce5df36

cpennington force-pushed the autoformatting branch from 97a8676 to ce5df36 Compare March 5, 2019 14:49

cpennington changed the title ~~Draft OEP-31: Code Autoformatting~~ Draft OEP-33: Code Autoformatting Mar 5, 2019

cpennington added 4 commits March 5, 2019 09:58

fixup! Draft OEP-33: Code Autoformatting

36ecb2e

fixup! Draft OEP-33: Code Autoformatting

7233c05

fixup! Draft OEP-33: Code Autoformatting

2aeee40

fixup! Draft OEP-33: Code Autoformatting

2815c26

matthugs reviewed Mar 6, 2019

View reviewed changes

oeps/oep-0033-bp-code-autoformatting.rst Outdated Show resolved Hide resolved

Update oeps/oep-0033-bp-code-autoformatting.rst

d5bccd6

Co-Authored-By: cpennington <calen.pennington@gmail.com>

fixup! Draft OEP-33: Code Autoformatting

56981d0

nedbat reviewed Mar 12, 2019

View reviewed changes

fixup! Draft OEP-33: Code Autoformatting

4514c89

nasthagiri reviewed Apr 10, 2019

View reviewed changes

stvstnfrd reviewed Apr 11, 2019

View reviewed changes

stvstnfrd suggested changes Apr 11, 2019

View reviewed changes

jmbowman reviewed May 28, 2019

View reviewed changes

xitij2000 mentioned this pull request May 28, 2021

feat: Add REST APIs for course advanced settings and course tabs [BD-38] [TNL-8625] [BB-4132] openedx/edx-platform#27608

Merged

2 tasks

stvstnfrd closed this Sep 10, 2021

Draft OEP-33: Code Autoformatting #107

Draft OEP-33: Code Autoformatting #107

Conversation

cpennington commented Mar 4, 2019

UmanShahzad left a comment

Choose a reason for hiding this comment

cpennington commented Mar 5, 2019

OmarIthawi commented Mar 11, 2019

cpennington commented Mar 11, 2019

OmarIthawi commented Mar 11, 2019

cpennington commented Mar 11, 2019

OmarIthawi commented Mar 11, 2019

stvstnfrd commented Mar 11, 2019

cpennington commented Mar 12, 2019

cpennington commented Mar 12, 2019

bradenmacdonald commented Mar 12, 2019

cpennington commented Mar 12, 2019

OmarIthawi commented Mar 12, 2019 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

nedbat commented Mar 12, 2019

cpennington commented Mar 12, 2019

cpennington commented Mar 12, 2019

cpennington commented Mar 12, 2019

davidjoy commented Mar 12, 2019 • edited

cpennington commented Mar 12, 2019

stvstnfrd commented Apr 1, 2019

cpennington commented Apr 2, 2019

cpennington commented Apr 8, 2019

cpennington commented Apr 9, 2019

nedbat commented Apr 9, 2019

cpennington commented Apr 9, 2019

cpennington commented Apr 9, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cpennington commented Apr 10, 2019

nasthagiri commented Apr 10, 2019

cpennington commented Apr 10, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

stvstnfrd left a comment • edited

Choose a reason for hiding this comment

Background, Stanford

Concerns, merge conflicts

Concerns, commit history

Concerns, onboarding

Concerns, implementation impact

Proposals

cpennington commented Apr 11, 2019

Concerns, merge conflicts

Concerns, commit history

Concerns, onboarding

Concerns, implementation impact

Proposals

cpennington commented Apr 24, 2019

Choose a reason for hiding this comment

tuchfarber commented Jan 13, 2021

stvstnfrd commented Sep 10, 2021

OmarIthawi commented Mar 12, 2019 •

edited

davidjoy commented Mar 12, 2019 •

edited

stvstnfrd left a comment •

edited