Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH add --negate flag to pygrep #1643

Merged
merged 1 commit into from Oct 17, 2020

Conversation

MarcoGorelli
Copy link
Contributor

@MarcoGorelli MarcoGorelli commented Oct 14, 2020

xref this comment on StackOverflow: https://stackoverflow.com/a/64344519/9988333

A PR may be considered if you can demonstrate a real use case for it

A use-case comes from PyMC3, where there is a pre-commit hook to check that all notebooks have a watermark. Currently, this is done by defining

- repo: local
  hooks:
    - id: watermark
      name: Check notebooks have watermark
      types: [jupyter]
      entry: python scripts/check_watermark.py
      language: python

in https://github.com/pymc-devs/pymc3/blob/master/.pre-commit-config.yaml, and then

"""
Check that given Jupyter notebooks all contain a final watermark cell to facilite reproducibility.
This is intended to be used as a pre-commit hook, see `.pre-commit-config.yaml`.
You can run it manually with `pre-commit run watermark --all`.
"""

import argparse
from pathlib import Path
import re

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("filenames", nargs="*")
    args = parser.parse_args()
    for file_ in args.filenames:
        assert (
            re.search(
                r"%load_ext watermark.*%watermark -n -u -v -iv -w",
                Path(file_).read_text(),
                flags=re.DOTALL,
            )
            is not None
        ), (
            f"Watermark not found in {file_} - please see the PyMC3 Jupyter Notebook Style guide:\n"
            "https://github.com/pymc-devs/pymc3/wiki/PyMC's-Jupyter-Notebook-Style"
        )

in https://github.com/pymc-devs/pymc3/blob/master/scripts/check_watermark.py .


I think the output should be a file which does not contain the pattern...I'll work on this tomorrow

pre_commit/languages/pygrep.py Outdated Show resolved Hide resolved
tests/languages/pygrep_test.py Outdated Show resolved Hide resolved
@MarcoGorelli MarcoGorelli marked this pull request as draft October 14, 2020 17:34
@MarcoGorelli MarcoGorelli marked this pull request as ready for review October 14, 2020 19:06
@MarcoGorelli
Copy link
Contributor Author

Marking as ready for review as I'd like to think this is enough to tell whether it's the right direction / whether the feature would be welcome. Thanks for the 'don't put logic in tests' link!

pre_commit/languages/pygrep.py Outdated Show resolved Hide resolved
for line in f:
if pattern.search(line):
retv = 0
break
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can be return 0

and then below you can

else:
    output.write_line(filename)
    return 1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, much simpler, thanks!

Though here there wouldn't be an else, would there? It's necessary to check through all the lines before deciding to return 1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see, thanks! I see you have a video about this too, will watch now https://youtu.be/8P7lXLXR_3c

retv |= _process_filename_at_once(pattern, filename)
else:
retv |= _process_filename_by_line(pattern, filename)
process_fn = FNS[Choice(multiline=args.multiline, negate=args.negate)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be outside the loop, otherwise you're doing this work every iteration instead of just once

Copy link
Member

@asottile asottile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MarcoGorelli
Copy link
Contributor Author

Thank you so much for the mentorship you provided me here, I really appreciate it, it's very kind of you!

@asottile asottile merged commit 01f1a00 into pre-commit:master Oct 17, 2020
@MarcoGorelli MarcoGorelli deleted the negate-pygrep branch October 17, 2020 18:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants