New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactoring ConjectureData
to support symbolic execution
#3086
Comments
Come to think of it, this would also work well for Atheris' |
So! I am going to start poking at this myself, starting from Zac's branch. At least in a time-boxed, explorative kind of way. @Zac-HD Any new developments that I should be aware of? Any chance you'd have some time to do incremental reviews along the way? |
I am absolutely up for code reviews; the only other news is that I'm hoping for some volunteers at the PyCon US sprints next week. Integers and bools are a good place to start; floats need some refactoring work which is probably easier beforehnad. |
Oh right, PyCon is happening! Happy to take a look a float refactoring too if you wanna give me some hints on what needs to be done. At first glance, it looks like we won't be able to just replace |
And we can indeed replace it everywhere, because |
It needs some cleanup still, but early feedback welcome in this draft. |
Float refactoring was merged! Sadly, I've already exhausted my timebox for this round, but there's a decent chance I'll pop my head back up and continue to push on this later. |
@DRMacIver and I have been talking about an equivalent refactoring, motivated by the opportunities for reduced redundancy, smarter shrinking and mutation (e.g. for Inquisitor), symbolic/concolic execution support, and reading this recent paper on 'reflective generators' (which are infeasible in general, but could work at the low-level interface). The exact shape of the interface is still undecided, but we're thinking of changing the underlying data structure from a bytearray to a tree, where the leaves are (int, float, bytes, str) instances and also track draw order so that we can mutate as if it was a flat sequence by causal order in addition to e.g. depth-first traversal. The low-level API might look something like: class LowLevelProvider:
# This is the low-level interface which would also be implemented
# by e.g. CrossHair, by an Atheris-hypothesis integration, etc.
# We'd then build the structured tree handling, database and replay
# support, etc. on top of this - so all backends get those for free.
def draw_integer(
self,
*,
forced: int | None = None,
min_value: int | None = None,
max_value: int | None = None,
weights: Sequence[Real] | None = None, # for small bounded ranges
):
if weights is not None:
assert min_value is not None
assert max_value is not None
assert (max_value - min_value) <= 1024 # arbitrary practical limit
if forced is not None:
assert min_value is None or min_value <= forced
assert max_value is None or forced <= max_value
# This is easy to build on top of our existing conjecture utils,
# and it's easy to build sampled_from and weighted_coin on this.
...
def draw_float(
self,
*,
forced: float | None = None,
min_value: float | None = None,
max_value: float | None = None,
allow_nan: bool = True,
allow_infinity: bool = True,
allow_subnormal: bool = True,
width: Literal[16, 32, 64] = 64,
# exclude_min and exclude_max handled higher up
):
...
def draw_string(
self,
*,
forced: str | None = None,
# Should we use `regex: str = ".*"` instead of alphabet and sizes?
alphabet: ... = ...,
min_size: int = 0,
max_size: int | None = None,
):
...
def draw_bytes(
self,
*,
forced: bytes | None = None,
min_size: int = 0,
max_size: int | None = None,
):
... |
I wrote a quick spike: master...Zac-HD:hypothesis:lowlevel-provider - just porting over The bad news is that changing our fundamental data structures for the engine is going to be a huge project; the good news is that it's feasible to both do it and ship it in smaller pieces. For example:
|
Earlier this evening, @pschanely and I found a synthetic bug using a new This was rather exciting, given that @tybug's PR for step 1 isn't quite finished yet, and Phillip had implemented the provider earlier in the day - but our totally untested hackjob of a demonstration spike did in fact find a bug with Next steps: once #3788 is merged, both (2) and also (3) can proceed concurrently. I (or another volunteer!) will define Hypothesis-side hooks for integration and a way to configure which backed to use, so that Crosshair can register as a Hypothesis plugin and automatically set that up. Phillip will clean up the Crosshair side of our demo, and we'll collaborate to ensure that Crosshair can be used on Hypothesis tests even before we finish up this issue and are ready for Hypothesis to use Crosshair as a backend. It's been a fantastic week. Onwards! |
We've merged and released part (1) of the plan above! Next up, I'll work out how to add some nice-ish hooks for Crosshair and similar tools to plug themselves in as backends - and emit warnings until we get the other parts done. @tybug, do you want to continue on to (3)? |
Yup, I'll tackle that next. I expect it will take me some time to wrap my head around the existing tree logic (but this is something I need a solid understanding of regardless, so that's ok!). My current high-level understanding is that I'm not sure if there will be finicky things involving e.g. computing |
That sounds right to me! Note that we'll need to implement forcing for most of the primitives; iirc we currently only have that for bounded ints and bools. Maybe make an independent PR just with that; we'd then be able to assimilate stuff found by crosshair et al even before moving over the shrinker and database format. |
@pschanely: master...Zac-HD:hypothesis:provider-plugins has a draft of the pluggable hooks. I'm still considering whether |
Super exciting. I've been working on a variety of compatibility issues since last week; I'm making good progress, but there are still things to investigate. I see now how you're getting around the bits that want to replay an execution - just skip them! (locally, I noticed another replay spot here that CrossHair dies on) In the meantime, I have another issue ... that I probably should have foreseen. It isn't always safe to operate on symbolics outside the context manager. Integers like the one we tried will mostly work, but other things (strings especially) will blow up unless I have the appropriate interpreter intercepts installed. There are a few different ways that symbolics may outlive the test run:
Annnd, some potential solutions for each of the above:
Thoughts? |
aha - that's exactly the replay spot I also identified, though I still need to wrap it in the context manager. We'll probably just have some more back-and-forth like this leading up to a coordinated release 🙂 Sounds like realizing on the way out of the context manager will solve (1) and (2); IMO simply not solving (3) is reasonable since it's a user error already, and if anything else comes up for (4) I'll post again here. |
Sounds good on not thinking about (3) yet. I took a little time to try and write something that would connect up with your provider-plugins branch, here. Ready to work on these in parallel! In addition to the context managers, I'll need a way to share CrossHair's RootNode state in between runs for the same test, so maybe think about how you'd like that to happen. (maybe jam something onto the runner if I can access that or something?) |
I've added this to the branch and opened a WIP pr - by having a global monkeypatchable function which returns a context manager (to wrap all runs for a test) which yields a function that returns a second context manager (to wrap each test case). Please forgive the other changes there, I'm in 'move fast and break things' mode while we iterate towards a workable design. |
This seems like it calls for a real API design, since there might be multiple backends fighting over the hacky monkeypatchable thing😅 For now, checking whether |
Closing this in favor of #3914, since we've shipped an initial implementation ✨ |
Crosshair, by @pschanely, is a symbolic execution library, which can run
@given()
tests... inefficiently. When it works though, it's much better at finding bugs which only occur for specific or very rare inputs.This issue is about how to refactor the
ConjectureData
class andconjecture.utils
module - the latter containing several functions which should beConjectureData
methods. This is worth doing as a cleanup anyway, but provides the ideal integration point for Crosshair and in future can also help us detect redundant examples (i.e. #1986, seeconjecture.datatree
).The core idea is to have strategies all get their data via type-specific methods, such as:
And then to symbolically execute some test, pass a
SymDataProvider()
instance to a new hook similar to.fuzz_one_input
(see e.g. master...Zac-HD:symex-refactoring). That's it - database support etc. should all be included for free via thewrite_*
methods!The text was updated successfully, but these errors were encountered: