Scrutineer: integrating opportunistic fault-localisation with PBT #2859
This feature looks amazing! I would love to try it, but I don't have any failing tests at hand 😆
@@ -71,7 +74,7 @@ with each phase corresponding to a value on the :class:`~hypothesis.Phase` enum:
3. ``Phase.generate`` controls whether new examples will be generated.
4. ``Phase.target`` controls whether examples will be mutated for targeting.
5. ``Phase.shrink`` controls whether examples will be shrunk.

6. ``Phase.explain`` controls whether Hypothesis attempts to explain test failures.
I would love to see some examples of what `explain` results look like somewhere in the docs. Would they fit here?
I was planning on showing it off in a blog post - there's nowhere obvious to put it in the docs, and the exact output format is pretty simple:
from hypothesis import Phase, given, strategies as st

@given(st.integers())
def test_reports_branch_in_test(x):
    if x > 10:
        raise AssertionError  # BUG
(Obviously this is a toy example, but it's been useful on real projects too)
_________________________ test_reports_branch_in_test _________________________
Traceback (most recent call last):
...
AssertionError
--------------------------------- Hypothesis ----------------------------------
Falsifying example: test_reports_branch_in_test(
    x=11,
)
Explanation:
    These lines were always and only run by failing examples:
        /path/to/test_file.py:6
One consideration in "what do we report" is that this format (usually) allows you to click on terminal output and have the relevant file open to that line in your preferred editor.
The alternative approach of reporting branches (source/destination pairs of lines) is only rarely more precise in practice, and much more difficult to explain to non-expert users. "report branches if there are no reportable lines" would be a nice trick to explore in future, though.
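To make "always and only run by failing examples" concrete, here is a minimal standard-library sketch of the idea. It is not the implementation in this PR: `trace_lines`, `explain`, and `toy_test` are hypothetical names, and a plain `range` of inputs stands in for generated examples.

```python
import sys


def trace_lines(fn, *args):
    """Run fn(*args); return (passed, set of (filename, lineno) pairs executed)."""
    seen = set()

    def tracer(frame, event, arg):
        if event == "line":
            seen.add((frame.f_code.co_filename, frame.f_lineno))
        return tracer  # keep tracing inside this frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        fn(*args)
        passed = True
    except AssertionError:
        passed = False
    finally:
        sys.settrace(previous)
    return passed, seen


def explain(fn, inputs):
    """Lines run by every failing input and by no passing input."""
    passing, failing = [], []
    for x in inputs:
        passed, seen = trace_lines(fn, x)
        (passing if passed else failing).append(seen)
    if not failing:
        return set()
    always_in_failing = set.intersection(*failing)
    ever_in_passing = set().union(*passing)
    return always_in_failing - ever_in_passing


def toy_test(x):
    if x > 10:
        raise AssertionError  # BUG


suspicious = explain(toy_test, range(20))
for filename, lineno in sorted(suspicious):
    print(f"{filename}:{lineno}")  # clickable "file:line" form
```

On this toy function the only reported location is the `raise` line, since the `if` line is also run by passing inputs.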
assert len(expected) == code.count(BUG_MARKER)
print(pytest_stdout)
for report in expected:
    assert report in pytest_stdout
Maybe https://github.com/syrusakbary/snapshottest will be a good fit here?
It is great for testing the output! ⭐
The catch is that we only want to test parts of the output, i.e. the explanation but not anything about the actual file paths.
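One way to compare only the explanation while ignoring directories could look like the sketch below. It is an illustration, not what the PR's tests do: `explanation_lines` is a hypothetical helper, and it assumes the explanation is the last thing in the captured output.

```python
import re


def explanation_lines(pytest_stdout):
    """Return the lines after 'Explanation:' with file paths cut down to basenames."""
    _, sep, tail = pytest_stdout.partition("Explanation:")
    if not sep:
        return []  # no explanation section in this output
    lines = [ln.strip() for ln in tail.splitlines() if ln.strip()]
    # Turn "/any/dir/test_file.py:6" into "test_file.py:6"; other lines are unchanged.
    return [re.sub(r"\S*[/\\]([^/\\\s]+:\d+)$", r"\1", ln) for ln in lines]


sample = """Falsifying example: test_reports_branch_in_test(
    x=11,
)
Explanation:
    These lines were always and only run by failing examples:
        /path/to/test_file.py:6
"""
print(explanation_lines(sample))
```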
(The other catch is that at present explain mode skips the tracing if `sys.gettrace()` is not None... which makes it compatible with debuggers and also hides it from `coverage`. Hmmm.)

Ugh. The actual implementation of C trace-functions is tricky enough that I think we actually can't reliably swap in the explain-tracer for e.g. the coverage tracer, not least due to decade-old CPython issues.
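The "skip tracing if another tracer is installed" behaviour can be sketched with a small context manager. This is an assumption about the general shape rather than the code in this PR; `maybe_trace` and `noop_tracer` are hypothetical names.

```python
import sys
from contextlib import contextmanager


@contextmanager
def maybe_trace(tracer):
    """Install `tracer` only when no other trace function is active.

    Yields True if tracing was enabled, False if it was skipped so that a
    debugger or coverage tool keeps its own tracer untouched.
    """
    if sys.gettrace() is not None:
        yield False
        return
    sys.settrace(tracer)
    try:
        yield True
    finally:
        sys.settrace(None)


def noop_tracer(frame, event, arg):
    return None  # do not trace inside any frame


with maybe_trace(noop_tracer) as active:
    print("tracing" if active else "another tracer is installed; skipping")
```

Deferring to the existing tracer avoids the swap-and-restore problem described above, at the cost of producing no explanation when running under a debugger or `coverage`.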
Being a basic-but-useful system for fault localisation.
@HypothesisWorks/hypothesis-python-contributors - final call for review! I'd love another set of eyes on this and an approving review, but I'll eventually merge it anyway if there are no objections 🙂
I'm currently working on a paper about integrating fault localisation with property-based testing - TLDR, it's pretty easy for PBT libraries to suggest where to start debugging test failures.
It turns out that my simple baseline version is also reliable, useful, and easy to interpret. On that basis I thought it makes sense to ship it, albeit disabled-by-default. I've tried this on a variety of toy examples and some real bugs in `black`; reviews and/or feedback on how this works for your problems would be most welcome 🙂

Future plans for other PRs: adding fancier techniques, if they're reliable and fast and actually work better; and integration with generalised examples (#2192). As currently planned, all of these would go in a single "explain phase" - they share a purpose and workflow and therefore aren't individually configurable.