Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate pre-commit tool #28

Closed

Conversation

webknjaz
Copy link
Member

@webknjaz webknjaz commented Apr 20, 2018

pre-commit integration distilled from #23

@pradyunsg
Copy link
Member

I'll defer to others on whether we want this. :)

@webknjaz
Copy link
Member Author

@pradyunsg mind tagging them?

2.6/get-pip.py Outdated
@@ -135,7 +135,7 @@ def parse_args(self, args):
for arg in args:
try:
req = InstallRequirement.from_line(arg)
except:
except Exception:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's related, cherry-picked from #23. Some of linters (flake8) was yelling at it. I think, adding linters should go along with cleaned code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If they are fixes for linter issues, can they be put in a separate PR, please?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course, no problem.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pfmoore Done!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#30

3.2/get-pip.py Outdated
@@ -135,7 +135,7 @@ def parse_args(self, args):
for arg in args:
try:
req = InstallRequirement.from_line(arg)
except:
except Exception:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated change?

3.3/get-pip.py Outdated
@@ -135,7 +135,7 @@ def parse_args(self, args):
for arg in args:
try:
req = InstallRequirement.from_line(arg)
except:
except Exception:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated change?

@@ -135,7 +135,7 @@ def parse_args(self, args):
for arg in args:
try:
req = InstallRequirement.from_line(arg)
except:
except Exception:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated change?

@@ -135,7 +135,7 @@ def parse_args(self, args):
for arg in args:
try:
req = InstallRequirement.from_line(arg)
except:
except Exception:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated change?

@pfmoore
Copy link
Member

pfmoore commented Apr 24, 2018

The documentation on pre-commit was a bit hard to find (the PyPI description was "UNKNOWN" and the github readme was pretty minimal, although it did point to a project website, which looks extensive). One thing I did spot that bothers me a lot: "does not work on platforms without symlink support (windows)". I'm a strong -1 on including anything in the workflow that doesn't work on Windows.

Also, I'd want to see some documentation added here that explained how to use this. I don't want to direct project maintainers to go and read the whole of https://pre-commit.com/ to understand what's going on.

Also, I'm not a huge fan of mandatory pre-commit hooks. If I'm just making a trivial change (such as docs) I will often do so in an environment that doesn't have any dev requirements installed, on the expectation that CI will act as the fallback check. So I'd rather we have proper CI here before looking at mandating a certain setup on developer machines.

Overall, I'd need to be convinced of the benefits of this change before approving it.

@webknjaz
Copy link
Member Author

@pfmoore

I'm a strong -1 on including anything in the workflow that doesn't work on Windows.

I guess, it depends on linters/tools you add to config, but it works in general (@asottile, am I right?):

Support: python hooks work without any system-level depedendencies. It has been tested on linux, macOS, windows, and cygwin.


Also, I'm not a huge fan of mandatory

I didn't say it's mandatory. It's more like "use this command if you want to automatically run the same locally." If it's not clear from wording, the explanation should be fixed.


that CI will act as the fallback check.

We were trying to do this in #23, but @pradyunsg asked to have a separate PR for this. (2559f91)


benefits of this change

I personally like that it (1) allows to consolidate several linters in one config and (2) be able to run this in CI. Also (3) I like that there's possibility to automagically install it into your local repo (pre-commit or pre-push hook), which runs in special mode taking into account only changed files (as opposed to --run-all in CI).

After accepting this and #23 I can contribute moving it all into tox for example.


Also, I'd want to see some documentation

I'm going to do this once the general idea of having this tooling is accepted.

@@ -36,6 +36,13 @@ You may want to perform something like:
$ source venv/bin/activate
$ pip install -r requirements.txt

It is also recommended to make linters run to keep up with coding standards
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pfmoore any suggestion on wording improvements?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, not really. As I said, I'm not convinced we need linter checks on this project, so I'm hardly the one to say how you should word a recommendation that we use them 😄

@pfmoore
Copy link
Member

pfmoore commented Apr 24, 2018

I didn't say it's mandatory. It's more like "use this command if you want to automatically run the same locally."

Well, the way we do the test suite in pip (tests in py.test, run using tox, integrated into CI using simple Appveyor and Travis configs that just run tox) have the same benefit, and additionally, the arrangement is familiar to people who work on pip.

I'm not against trying out a new tool, but I personally have no experience with it and I've never seen any other project that uses it - so for me, at least, it has to prove its advantages over more standard approaches.

One other point, which I made in #18, is that we should test building and running get-pip from the pip tests. Changes in get-pip itself are very rare, and typically don't cause problems. The problems arise when changes in pip cause issues with get-pip, so I'd prefer to see the tests happening over there. That's not to say that we can't also have tests in get-pip itself, just that I don't think they really address the need in #18.

I'm going to do this once the general idea of having this tooling is accepted.

Chicken and egg situation here. Without docs on how this would affect our workflow, it's hard to know if this approach is acceptable. But it's your call - I appreciate that writing docs is a pain, particularly if there's a chance the work will be wasted. Maybe you could just give examples of some other Python projects that use this tool, that we could look at for reference?

We were trying to do this in #23, but @pradyunsg asked to have a separate PR for this. (2559f91)

In which case I'm confused, as the checks in here don't seem to be the same as the ones in #23. The fact that I have difficulty confirming that reinforces my feeling that maintaining these tests will be hard to do.

Also, the checks you're adding seem like a very odd list. Why is there a check-yaml test - we don't have any YAML files? Plus, by deferring the actual details of the checks to a 3rd party implementation, we lose control over what we test.

These tests all seem to be "lint" style tests, which aren't really that important (IMO) for get-pip, which is a very small, and unusual style of codebase (most of the code is an opaque binary blob). I'm not against code style checkers, but I'm frankly not that interested with adding them here.

@asottile
Copy link

asottile commented Apr 24, 2018

Ah, nothing like a flurry of github emails bright and early in the morning 😆 -- I'll try and address the comments above

The documentation on pre-commit was a bit hard to find (the PyPI description was "UNKNOWN" and the github readme was pretty minimal, although it did point to a project website, which looks extensive)

Ah yeah, warehouse kind of made this worse for us. From looking at traffic, most people land on the site by just googling "pre-commit". I might put some effort to at least get the README into the warehouse page now that it supports markdown (personally I really dislike rst).

One thing I did spot that bothers me a lot: "does not work on platforms without symlink support (windows)". I'm a strong -1 on including anything in the workflow that doesn't work on Windows.

This is a bit unfair, you've cherry-picked a comment about the non-administrative installation (READ: non-python installation). Not only is this not the suggested way to install pre-commit (pip!) it's our least used approach (based on pypi downloads, brew installation stats, and cloudflare stats for downloads of install-local.py). It'll probably be removed in the future but was mostly aimed at non-python developers who couldn't be bothered to set up pip / virtualenv / etc.

A lot of the point of pre-commit is that it's a cross-platform cross-language tool. It's heavily tested (and automatically tested on windows through appveyor). Its test coverage is exceptionally good. It is actively maintained. It's mature (just passed our 100th release!).

That said, there are a few platform shortfalls, though I don't think they come in to play here (see supported hook languages):

  • currently the ruby support is lacking on windows: Windows: Ruby Support pre-commit/pre-commit#201
  • swift is not supported on windows: (swift doesn't support windows)
  • docker is untested on windows: hasn't been hooked up in appveyor yet, though it should work without code changes

Seems the only hooks added in this patch are written in python and so you're good to go from a platform perspective 👍.

Also, I'd want to see some documentation added here that explained how to use this. I don't want to direct project maintainers to go and read the whole of https://pre-commit.com/ to understand what's going on.

Totally agree here, I usually suggest adding something to CONTRIBUTING.md (or whatever equivalent file) to help out newcomers. Here's a good example from the rtd project: https://docs.readthedocs.io/en/latest/contribute.html#contributing-to-development

Also, I'm not a huge fan of mandatory pre-commit hooks. If I'm just making a trivial change (such as docs) I will often do so in an environment that doesn't have any dev requirements installed, on the expectation that CI will act as the fallback check. So I'd rather we have proper CI here before looking at mandating a certain setup on developer machines.

Another part of the design is that they're entirely optional. If you want to enforce them, here's our suggestion for running in CI: https://pre-commit.com/#usage-in-continuous-integration

Well, the way we do the test suite in pip (tests in py.test, run using tox, integrated into CI using simple Appveyor and Travis configs that just run tox) have the same benefit, and additionally, the arrangement is familiar to people who work on pip.

pre-commit is easy to add to tox and replace other linting steps. Here's our documentation on this: https://pre-commit.com/#usage-with-tox

You can find an example of this here

I'm not against trying out a new tool, but I personally have no experience with it and I've never seen any other project that uses it - so for me, at least, it has to prove its advantages over more standard approaches.

Totally understand the qualms here, using a new tool is always the hardest :)

That said, most of the advantages come from the following:

  • you don't need to set up / configure all of your linters
  • it's all done in user space (no system packages need be installed)
  • pre-commit is better at invoking your linters than the linters are:
    • for example with flake8 you usually need to configure a complex exclude scheme (remove .git, .tox, any spare virtualenvs, etc.) -- since pre-commit is a VCS tool it knows which files are source or not. And if you do want to exclude them, there's good options for that via exclude (either at a global level or at a per-hook level).
    • pre-commit has extensive (and extensible) file type detection (for flake8 you'd typically need to configure additional extensions and such to enforce linting in .wsgi / wscript / etc.)
    • pre-commit knows how to lint extensionless files based on shebang (#!), often useful for scripts which are silently slipping through the cracks

As for the "I've never seen any other project that uses it", there's quite a few (on github: ~2300 repos at the time of writing). From referer logs there's a lot of large python shops using it internally as well and it's difficult to gauge the counts there (I'll respect their privacy and not disclose those!). One other metric I use to gauge adoption is clone count of pre-commit-hooks, while this isn't a great metric because pre-commit does a good job of caching the clones, it still approaches ~15k clones a week on a normal week.

If you want some "larger" projects which use pre-commit, here's a few:

(and these were just from a cursory search)

Note that I've already used pre-commit to contribute to pip itself before: pypa/pip#4921

Also, the checks you're adding seem like a very odd list. Why is there a check-yaml test - we don't have any YAML files?

Yeah, you probably don't need check-yaml (or a few of the other hooks that are added in this PR) :)

Plus, by deferring the actual details of the checks to a 3rd party implementation, we lose control over what we test.

heh, that's not all that different from deferring checks to flake8 or such. That said, most of the hooks are configurable (and if they're missing a check, it's easy to add one! or add a local hook with repo-specific checks. or create a site-specific hook repository to house sharable hooks).

conclusion

hopefully this clears some things up -- happy to clarify any of the points above as well.

@pfmoore
Copy link
Member

pfmoore commented Apr 24, 2018

@asottile Thanks for the detailed response! My apologies for the comment on Windows support - you're right, it was cherry-picked a little, although in my defence I was skimming looking for anything that talked about platform support and explicitly saying "we support Windows" and didn't find anything other than this (and I didn't really understand the specifics of what it was referring to). For background, this came up from a PR that was written using shell scripts for some new tests, so my concern was that the thinking behind the suggestion may not have considered cross-platform use. But I was taking an "assume the worst" approach, which isn't fair.

From what you're saying, I get the impression that pre-commit is essentially about lint style checks. In that case, I have a few comments:

  1. Ideally, we'd want to lint our templates, not our generated code. And our templates contain {variable_name} placeholders, which I guess could confuse linters. No idea how much of an issue that is in practice. And it's not directly a question about pre-commit.
  2. Is there a way to say "only check certain files" so that we could exclude the generated code? It's not that the generated files are OK to have problems, more that lint errors referring to the generated files tend to encourage people to fix the generated files, not fix the template and regenerate...
  3. I'm still far from convinced that we want automated linting on this repository in the first place, so this is likely not going to go anywhere in any case.

@asottile
Copy link

My apologies for the comment on Windows support

No problem :) I've invested an absolute ton of work to support windows, I guess I should call it out a bit more!

Is there a way to say "only check certain files" so that we could exclude the generated code? It's not that the generated files are OK to have problems, more that lint errors referring to the generated files tend to encourage people to fix the generated files, not fix the template and regenerate...

Yep! There's both exclude (for exclusion) and files (for inclusion)

@pfmoore
Copy link
Member

pfmoore commented Apr 24, 2018

My apologies for the comment on Windows support

No problem :) I've invested an absolute ton of work to support windows, I guess I should call it out a bit more!

You should actually just assume your readers aren't paranoid 😄 I suspect it's only refugees from the old OS wars like me that still assume the worst. Nearly every project I encounter these days either is, or is trying to be, cross-platform. You'd think I would have learned by now...

@webknjaz
Copy link
Member Author

webknjaz commented Apr 24, 2018

@pfmoore yes, there's a few hooks which are not applicable and are artifacts of copy-pasting stuff :) (yaml and probably a few of others)

Here's how we invoke it in tox: https://github.com/cherrypy/cheroot/blob/master/tox.ini#L30-L37
There's a separate env for running linters, which currently generate errors (for gradual fixing).

Thanks @asottile for those explanations btw ;)

With regard to whether linters are needed, it's simple: who wants their code to be diverse? Also linters help to prevent accidental errors, so why not?

Oh, and regarding templates: during work on #23 we had flake8 lint those and it didn't seem to cause problems.

@pfmoore
Copy link
Member

pfmoore commented Apr 24, 2018

This is getting a bit philosophical, so I don't intend to debate this much further, but

With regard to whether linters are needed, it's simple: who wants their code to be diverse? Also linters help to prevent accidental errors, so why not?

The "why not" is simple - because it encourages extra "keep the linter happy" commits. On most projects, that's essentially irrelevant, and the benefit of a consistent style (and an impartial arbiter to avoid "I prefer it this way" arguments) over a large codebase is worth it. On this project, where we essentially have only one Python file, and every commit to master is a release, and where most of our commits are direct to master, those extra commits are more of a concern. Not to the extent of being a showstopper, but enough that I'm personally inclined to be super-cautious. The other committers (particularly @dstufft who knows much better than me how this repo works) may feel differently.

@webknjaz
Copy link
Member Author

encourages extra "keep the linter happy" commits

Yes, but it might as well be non-mandatory, Travis CI allows making jobs allowed to fail, which will not fail the whole build, but will present the current status to everyone interested. From this point of view, it comes for no cost. I personally prefer if project I contribute to sets some style expectations and linters are the right tools for this, since it shouldn't be on humans.

I think, the most debatable thing is which linters to use, I've provided some example with this PR, but it doesn't mean that it's the best collection of checks possible.

Anyway, there's other things to contribute, like tox and CI/CD improvements, so I think this PR should not be blocker for those.

@pradyunsg
Copy link
Member

I agree with @pfmoore. This project doesn't need linter checks -- we basically only regenerate get-pip.py scripts and publish them here, and at this point, there's no active development / changes being made here that would benefit from having consistent code style etc. We can revisit this if we add a bunch more code here but I think I'll go ahead and close this for now. :)

Thanks for the PR @webknjaz!

@pradyunsg pradyunsg closed this Jul 29, 2020
@pradyunsg pradyunsg removed the request for review from dstufft July 29, 2020 18:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants