Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAINT: add (optional) pre-commit hook #17812

Merged
merged 19 commits into from
Jan 29, 2023
Merged
Show file tree
Hide file tree
Changes from 12 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
25 changes: 16 additions & 9 deletions HACKING.rst.txt
Original file line number Diff line number Diff line change
Expand Up @@ -82,13 +82,22 @@ tests, benchmarks, and correct code style.
as it will appear online.

4. Code style
Uniformity of style in which code is written is important to others trying
to understand the code. SciPy follows the standard Python guidelines for
code style, `PEP8`_. In order to check that your code conforms to PEP8,
you can use the `pep8 package`_ style checker. Most IDEs and text editors
have settings that can help you follow PEP8, for example by translating
tabs by four spaces. Using `pyflakes`_ to check your code is also a good
idea. More information is available in :ref:`pep8-scipy`.
Uniform code style makes it easier for others to read your code.
SciPy follows the standard Python style guideline, `PEP8`_.

We provide a git pre-commit hook that can check each of your commits
for proper style. Install it (once) by running the following from
the root of the SciPy repository::

cp tools/pre-commit-hook.sh .git/hooks/pre-commit

Alternatively, you may run the linter manually::

python dev.py lint

Most IDEs and text editors also have settings that can help you
follow PEP8, for example by translating tabs by four spaces. More
information is available in :ref:`pep8-scipy`.

A :ref:`checklist<pr-checklist>`, including these and other requirements, is
available at the end of the example :ref:`development-workflow`.
Expand Down Expand Up @@ -259,8 +268,6 @@ improvements, and submit your first PR!

.. _pep8 package: https://pypi.python.org/pypi/pep8

.. _pyflakes: https://pypi.python.org/pypi/pyflakes

.. _Github help pages: https://help.github.com/articles/set-up-git/

.. _issue list: https://github.com/scipy/scipy/issues
Expand Down
5 changes: 2 additions & 3 deletions azure-pipelines.yml
Original file line number Diff line number Diff line change
Expand Up @@ -144,8 +144,7 @@ stages:
addToPath: true
architecture: 'x64'
- script: >-
python -m pip install
"flake8<6"
python -m pip install ruff cython-lint
displayName: 'Install tools'
failOnStderr: false
- script: |
Expand All @@ -156,7 +155,7 @@ stages:
# so does not appear to be needed to fetch maintenance
# branches (which Azure logs show being checked out
# in Checkout stage)
python tools/lint_diff.py --branch origin/$(System.PullRequest.TargetBranch)
python tools/lint.py --diff-against origin/$(System.PullRequest.TargetBranch)
tools/unicode-check.py
tools/check_test_name.py
displayName: 'Run Lint Checks'
Expand Down
43 changes: 11 additions & 32 deletions dev.py
Original file line number Diff line number Diff line change
Expand Up @@ -123,7 +123,6 @@ def task_meta(cls, **kwargs):

import click
from click import Option, Argument
from doit import task_params
from doit.cmd_base import ModuleTaskLoader
from doit.reporter import ZeroReporter
from doit.exceptions import TaskError
Expand Down Expand Up @@ -269,7 +268,7 @@ def cli(ctx, **kwargs):

\b**python dev.py --build-dir my-build test -s stats**

"""
""" # noqa: E501
CLI.update_context(ctx, kwargs)


Expand Down Expand Up @@ -558,7 +557,7 @@ def copy_openblas(cls, dirs):
default `_distributor_init.py` file with the one
we use for wheels uploaded to PyPI so that DLL gets loaded.

Assumes pkg-config is installed and aware of OpenBLAS.
Assumes pkg-config is installed and aware of OpenBLAS.

The "dirs" parameter is typically a "Dirs" object with the
structure as the following, say, if dev.py is run from the
Expand Down Expand Up @@ -652,7 +651,7 @@ class Test(Task):
$ python dev.py test -s cluster -m full --durations 20
$ python dev.py test -s stats -- --tb=line # `--` passes next args to pytest
```
"""
""" # noqa: E501
ctx = CONTEXT

verbose = Option(
Expand Down Expand Up @@ -889,29 +888,14 @@ def emit_cmdstr(cmd):
console.print(f"{EMOJI.cmd} [cmd] {cmd}")


@task_params([{'name': 'output_file', 'long': 'output-file', 'default': None,
'help': 'Redirect report to a file'}])
def task_flake8(output_file):
"""Run flake8 over the code base and benchmarks."""
opts = ''
if output_file:
opts += f'--output-file={output_file}'

cmd = f"flake8 {opts} scipy benchmarks/benchmarks"
emit_cmdstr(f"{cmd}")
return {
'actions': [cmd],
'doc': 'Lint scipy and benchmarks directory',
}


def task_pep8diff():
def task_lint():
# Lint just the diff since branching off of main using a
# stricter configuration.
emit_cmdstr(os.path.join('tools', 'lint_diff.py'))
emit_cmdstr(os.path.join('tools', 'lint.py') + ' --diff-against main')
return {
'basename': 'pep8-diff',
'actions': [str(Dirs().root / 'tools' / 'lint_diff.py')],
'basename': 'lint',
'actions': [str(Dirs().root / 'tools' / 'lint.py') +
' --diff-against=main'],
'doc': 'Lint only files modified since last commit (stricter rules)',
}

Expand All @@ -937,16 +921,11 @@ def task_check_test_name():

@cli.cls_cmd('lint')
class Lint():
""":dash: Run flake8, check PEP 8 compliance on branch diff and check for
""":dash: Run linter on modified files and check for
disallowed Unicode characters and possibly-invalid test names."""
output_file = Option(
['--output-file'], default=None, help='Redirect report to a file')

def run(output_file):
opts = {'output_file': output_file}
def run():
run_doit_task({
'flake8': opts,
'pep8-diff': {},
'lint': {},
'unicode-check': {},
'check-testname': {},
})
Expand Down
37 changes: 18 additions & 19 deletions doc/source/dev/contributor/pep8.rst
Original file line number Diff line number Diff line change
Expand Up @@ -19,30 +19,32 @@ compliance before pushing your code:
Editor |rarr| Advanced Settings. This can help you fix PEP8 issues as you
write your code.

- SciPy makes use of special configuration files for linting with the
|flake8|_ tool.
- Note, however, that SciPy's linting configuration may not match
that of your IDE exactly. See below on how to run the official
checks.

- It is typically recommended to leave any existing style issues alone
unless they are part of the code you're already modifying.
- It is recommended to leave existing style issues alone
unless they exist in files you are already modifying.
This practice ensures that the codebase is gradually cleaned up
without dedicating precious review time to style-only cleanups.
Before sending a Pull Request, we suggest running the lint tests only
for the changes you've made in your feature branch. This will mimic
the continuous integration linting checks setup on GitHub.
After installing ``flake8``, you can run the following check locally
in the SciPy root directory to ensure your Pull Request doesn't
break the Continuous Integration linting tests.

::
- Before sending a Pull Request, run the linter on changes made in
your feature branch. The checks will also be made during
continuous integration, but it's quicker to catch them early.

python tools/lint_diff.py
The easiest way to do so is to install our pre-commit hook (once)::

If you want to run the diff based lint tests only for specific files
or directories, please consider using the ``--files`` option.
cp tools/pre-commit-hook.sh .git/hooks/pre-commit

::
This will run linting checks before each commit is made.

python tools/lint_diff.py --files scipy/odr/models.py scipy/ndimage
Alternatively, you can run the check manually from the SciPy root directory::

python dev.py lint

You can also run the linter on specific files, using the ``--files`` option::

python tools/lint.py --files scipy/odr/models.py scipy/ndimage

- If you have existing code with a lot of PEP8 issues, consider using
|autopep8|_ to automatically fix them before incorporating the code into
Expand All @@ -51,9 +53,6 @@ compliance before pushing your code:
.. _PEP8: https://www.python.org/dev/peps/pep-0008/
.. _enable Real-time code style analysis: https://stackoverflow.com/questions/51463223/how-to-use-pep8-module-using-spyder

.. |flake8| replace:: ``flake8``
.. _flake8: http://flake8.pycqa.org/en/latest/

.. |autopep8| replace:: ``autopep8``
.. _autopep8: https://pypi.org/project/autopep8/0.8/

Expand Down
5 changes: 4 additions & 1 deletion environment.yml
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,6 @@ dependencies:
- pydata-sphinx-theme==0.9.0
- sphinx-design
# For linting
- flake8 < 6.0
# Some optional test dependencies
- mpmath
- gmpy2
Expand All @@ -49,3 +48,7 @@ dependencies:
- click
- doit>=0.36.0
- pydevtool
- pip
- pip:
- ruff
- cython-lint
3 changes: 2 additions & 1 deletion pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -121,7 +121,8 @@ dev = [
"mypy",
"typing_extensions",
"pycodestyle",
"flake8",
"ruff",
"cython-lint",
"rich-click",
"click",
"doit>=0.36.0",
Expand Down
138 changes: 138 additions & 0 deletions tools/lint.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,138 @@
#!/usr/bin/env python
import os
import sys
import subprocess
from argparse import ArgumentParser


CONFIG = os.path.join(
os.path.abspath(os.path.dirname(__file__)),
'lint.ini',
)


def rev_list(branch, num_commits):
"""List commits in reverse chronological order.

Only the first `num_commits` are shown.

"""
res = subprocess.run(
[
'git',
'rev-list',
'--max-count',
f'{num_commits}',
'--first-parent',
branch
],
stdout=subprocess.PIPE,
encoding='utf-8',
)
res.check_returncode()
return res.stdout.rstrip('\n').split('\n')


def find_branch_point(branch):
"""Find when the current branch split off from the given branch.

It is based off of this Stackoverflow post:

https://stackoverflow.com/questions/1527234/finding-a-branch-point-with-git#4991675

"""
branch_commits = rev_list('HEAD', 1000)
main_commits = set(rev_list(branch, 1000))
for branch_commit in branch_commits:
if branch_commit in main_commits:
return branch_commit

# If a branch split off over 1000 commits ago we will fail to find
# the ancestor.
raise RuntimeError(
'Failed to find a common ancestor in the last 1000 commits'
)


def diff_files(sha):
"""Find the diff since the given SHA."""
res = subprocess.run(
['git', 'diff', '--name-only', '--diff-filter=ACMR', '-z', sha, '--',
'*.py', '*.pyx', '*.pxd', '*.pxi'],
stdout=subprocess.PIPE,
encoding='utf-8'
)
res.check_returncode()
return [f for f in res.stdout.split('\0') if f]


def run_ruff(files, fix):
if not files:
return 0, ""
args = ['--fix'] if fix else []
res = subprocess.run(
['ruff'] + args + files,
stdout=subprocess.PIPE,
encoding='utf-8'
)
return res.returncode, res.stdout


def run_cython_lint(files):
if not files:
return 0, ""
res = subprocess.run(
['cython-lint'] + files,
stdout=subprocess.PIPE,
encoding='utf-8'
)
return res.returncode, res.stdout


def main():
parser = ArgumentParser(description="Also see `pre-commit-hook.sh` which "
"lints all files staged in git.")
# In Python 3.9, can use: argparse.BooleanOptionalAction
parser.add_argument("--fix", action='store_true',
help='Attempt to fix linting violations')
parser.add_argument("--diff-against", dest='branch',
type=str, default=None,
help="Diff against "
"this branch and lint modified files. Use either "
"`--diff-against` or `--files`, but not both.")
parser.add_argument("--files", nargs='*',
help="Lint these files or directories; "
"use **/*.py to lint all files")

args = parser.parse_args()

if not ((args.files is None) ^ (args.branch is None)):
print('Specify either `--diff-against` or `--files`. Aborting.')
sys.exit(1)

if args.branch:
branch_point = find_branch_point(args.branch)
files = diff_files(branch_point)
else:
files = args.files

cython_exts = ['.pyx'] # Add '.pxd', '.pxi' once cython-lint supports it
cython_files = [f for f in files if any(f.endswith(ext) for ext in cython_exts)]
other_files = [f for f in files if not f.endswith('.pyx')]

rc_cy, errors = run_cython_lint(cython_files)
if errors:
print(errors)

rc, errors = run_ruff(other_files, fix=args.fix)
if errors:
print(errors)

if rc == 0 and rc_cy != 0:
rc = rc_cy

sys.exit(rc)


if __name__ == '__main__':
main()
24 changes: 24 additions & 0 deletions tools/lint.toml
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
[tool.ruff]
line-length = 88

# Enable Pyflakes `E` and `F` codes by default.
select = ["E", "F"]
ignore = [
"E121", "E122", "E123", "E125", "E126", "E127", "E128", "E226",
"E251", "E265", "E266", "E302", "E402", "E712", "E721", "E731",
"E741", "W291", "W293", "W391", "W503", "W504"
]
excludes = [
'scipy/datasets/_registry.py'
]
per-file-ignores = {'**/__init.py', ['F401', 'F403']}

# Allow unused variables when underscore-prefixed.
dummy-variable-rgx = "^(_+|(_+[a-zA-Z0-9_]*[a-zA-Z0-9]+?))$"

# Assume Python 3.10.
target-version = "py38"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we update the target-version to match the comment/newer Python above?

One more question about this file--where is lint.toml used? Is it just that you can use ruff with any .toml file at appropriate location?

A quick browse of their GitHub page used other filenames so wasn't sure:

As an alternative to pyproject.toml, Ruff will also respect a ruff.toml file, which implements an equivalent schema (though the [tool.ruff] hierarchy can be omitted)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Tyler. lint.py explicitly sets --config=lint.toml.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommended entrypoints for the linter are: python dev.py lint or tools/lint.py.

I think if we wanted it to pick up ruff.toml by default, we'd have to move it to the root of the repo, or include its config in our pyproject.toml.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I didn't catch that in the diff--just to confirm, this is separate from the usage of lint.ini in tools/lint.py? Or does it remap to to the toml file from that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scratch all my comments: ruff expects ruff.toml (even if --config is used), and lint.py didn't specify it correctly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately we do need the file to either be named ruff.toml or pyproject.toml. Otherwise, we have no way of knowing which schema to enforce. (The schemas are the ~same between ruff.toml and pyproject.toml, except that all the Ruff settings are nested under [tool.ruff].)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @charliermarsh. It sounds like you may need two config options then, --config and --config-pyproject or similar. I think it's quite confusing not to honor the --config option just because the file is named incorrectly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an alternative, if the file isn't named pyproject.toml, I could just treat it as in ruff.toml format. That would at least be an improvement, I think, though would still have the potential to cause confusion as you describe.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stefanv - This is changed in the next release (anything not named pyproject.toml is assumed to be in ruff.toml format).


[tool.ruff.mccabe]
# Unlike Flake8, default to a complexity level of 10.
max-complexity = 10