Refactoring `ConjectureData` to support symbolic execution #3086

Zac-HD · 2021-09-08T11:16:04Z

Crosshair, by @pschanely, is a symbolic execution library, which can run @given() tests... inefficiently. When it works though, it's much better at finding bugs which only occur for specific or very rare inputs.

This issue is about how to refactor the ConjectureData class and conjecture.utils module - the latter containing several functions which should be ConjectureData methods. This is worth doing as a cleanup anyway, but provides the ideal integration point for Crosshair and in future can also help us detect redundant examples (i.e. #1986, see conjecture.datatree).

The core idea is to have strategies all get their data via type-specific methods, such as:

class DataProvider:
    # A thin wrapper around ConjectureData, to reduce the API surface

    def draw_integer(self):
        # cu.unbounded_integers()
    def write_integer(self, *, value):
        # extend self.buffer, as if we had passed force= to draw_bits

    def draw_integer_range(self, min_value, max_value):
        # cu.integer_range()
    def write_integer_range(self, min_value, max_value, *, value):
        # extend self.buffer, as if we had passed force= to draw_bits

    def draw_boolean(self, p=0.5):
        # cu.biased_coin()  (cu.boolean is trivial)

    # ... you get the idea.  Later add floats, bytestrings, and maybe even unicode.

class SymDataProvider(DataProvider):
    def draw_integer(self):
        n = a_symbolic_int()
        self.write_integer(n)  # should be OK to add symbolic values to the buffer
        return n               # because bitsize is fixed conditional on values of earlier draws

    def draw_integer_range(self, min_value, max_value):
        n = a_symbolic_int(min_value, max_value)
        self.write_integer(min_value, max_value, value=n)
        return n

    # etc.

And then to symbolically execute some test, pass a SymDataProvider() instance to a new hook similar to .fuzz_one_input (see e.g. master...Zac-HD:symex-refactoring). That's it - database support etc. should all be included for free via the write_* methods!

The text was updated successfully, but these errors were encountered:

Zac-HD · 2022-02-21T04:45:54Z

Come to think of it, this would also work well for Atheris' FuzzDataProvider 🤔

pschanely · 2022-04-28T19:20:00Z

So! I am going to start poking at this myself, starting from Zac's branch. At least in a time-boxed, explorative kind of way.

@Zac-HD Any new developments that I should be aware of? Any chance you'd have some time to do incremental reviews along the way?

Zac-HD · 2022-04-28T19:25:47Z

I am absolutely up for code reviews; the only other news is that I'm hoping for some volunteers at the PyCon US sprints next week. Integers and bools are a good place to start; floats need some refactoring work which is probably easier beforehnad.

pschanely · 2022-04-29T14:35:48Z

Oh right, PyCon is happening! Happy to take a look a float refactoring too if you wanna give me some hints on what needs to be done.

At first glance, it looks like we won't be able to just replace ConjectureData with DataProvider in every SearchStrategy.do_draw (because of 3rd party strategies, at a minimum). Right? We'll introduce a new method perhaps? Your ideas about how to do this incrementally would be great.

Zac-HD · 2022-04-29T15:25:38Z

floats() refactoring: instead of having the floats() function dispatch to either FloatsStrategy or FixedBoundedFloatsStrategy, I'd like to do that dispatch (perhaps via memoized helper functions) inside FloatsStrategy.do_draw(). #2878 is where I did this for integers(). That would in turn make it easy to do filter-rewriting, and provides the obvious place to insert a DataProvider shim 😉

And we can indeed replace it everywhere, because SearchStrategy.do_draw() is private API, and everyone else should just be composing public strategies.

pschanely · 2022-05-02T01:30:57Z

floats() refactoring: instead of having the floats() function dispatch to either FloatsStrategy or FixedBoundedFloatsStrategy, I'd like to do that dispatch (perhaps via memoized helper functions) inside FloatsStrategy.do_draw(). #2878 is where I did this for integers(). That would in turn make it easy to do filter-rewriting, and provides the obvious place to insert a DataProvider shim 😉

It needs some cleanup still, but early feedback welcome in this draft.

pschanely · 2022-05-25T12:44:05Z

It needs some cleanup still, but early feedback welcome in this draft.

Float refactoring was merged! Sadly, I've already exhausted my timebox for this round, but there's a decent chance I'll pop my head back up and continue to push on this later.

Zac-HD · 2023-10-22T23:42:08Z

@DRMacIver and I have been talking about an equivalent refactoring, motivated by the opportunities for reduced redundancy, smarter shrinking and mutation (e.g. for Inquisitor), symbolic/concolic execution support, and reading this recent paper on 'reflective generators' (which are infeasible in general, but could work at the low-level interface).

The exact shape of the interface is still undecided, but we're thinking of changing the underlying data structure from a bytearray to a tree, where the leaves are (int, float, bytes, str) instances and also track draw order so that we can mutate as if it was a flat sequence by causal order in addition to e.g. depth-first traversal. The low-level API might look something like:

class LowLevelProvider:
    # This is the low-level interface which would also be implemented
    # by e.g. CrossHair, by an Atheris-hypothesis integration, etc.
    # We'd then build the structured tree handling, database and replay
    # support, etc. on top of this - so all backends get those for free.

    def draw_integer(
        self,
        *,
        forced: int | None = None,
        min_value: int | None = None,
        max_value: int | None = None,
        weights: Sequence[Real] | None = None,  # for small bounded ranges
    ):
        if weights is not None:
            assert min_value is not None
            assert max_value is not None
            assert (max_value - min_value) <= 1024  # arbitrary practical limit
        if forced is not None:
            assert min_value is None or min_value <= forced
            assert max_value is None or forced <= max_value
        # This is easy to build on top of our existing conjecture utils,
        # and it's easy to build sampled_from and weighted_coin on this.
        ...

    def draw_float(
        self,
        *,
        forced: float | None = None,
        min_value: float | None = None,
        max_value: float | None = None,
        allow_nan: bool = True,
        allow_infinity: bool = True,
        allow_subnormal: bool = True,
        width: Literal[16, 32, 64] = 64,
        # exclude_min and exclude_max handled higher up
    ):
        ...

    def draw_string(
        self,
        *,
        forced: str | None = None,
        # Should we use `regex: str = ".*"` instead of alphabet and sizes?
        alphabet: ... = ...,
        min_size: int = 0,
        max_size: int | None = None,
    ):
        ...

    def draw_bytes(
        self,
        *,
        forced: bytes | None = None,
        min_size: int = 0,
        max_size: int | None = None,
    ):
        ...

Zac-HD · 2023-11-06T03:08:04Z

I wrote a quick spike: master...Zac-HD:hypothesis:lowlevel-provider - just porting over IntegersStrategy.do_draw() for now, but most cover tests already pass (exceptions: some stateful-testing stuff) so I think it's promising.

The bad news is that changing our fundamental data structures for the engine is going to be a huge project; the good news is that it's feasible to both do it and ship it in smaller pieces. For example:

Give ConjectureData a PrimitiveProvider, and make all draws go via methods which delegate to the PrimitiveProvider. Make minimal changes to the internals at this point; we just want to get interface worked out and move the utils to be methods on PrimitiveProvider.
Ensure that PrimitiveProvider can in fact be swapped out for an alternative backend - Provider should make no assumptions about the implementation. At this point we should be able to find a bug using a CrosshairPrimitiveProvider, although replay using Hypothesis itself (which includes shrinking and database support) will come in the last step below.
Port TreeObserver, generate_novel_prefix() et al to use the new IR, reducing redundancy (e.g. generate_novel_prefix interacts poorly with biased_coin (and lists) #1574)... if it is in fact easier to do this rather than a big-bang migration of everything at once.
Change our database format and shrink logic to use a sequence of primitive types instead of a bytestring (e.g. Change the core draw primitive to be selecting an integer from 0 to n #1986). The final stage buys us complete interoperability with other backends, and probably also some large performance benefits.

Zac-HD · 2023-11-17T05:16:47Z

Earlier this evening, @pschanely and I found a synthetic bug using a new CrosshairPrimitiveProvider.

This was rather exciting, given that @tybug's PR for step 1 isn't quite finished yet, and Phillip had implemented the provider earlier in the day - but our totally untested hackjob of a demonstration spike did in fact find a bug with pytest test_demo.py using Crosshair as a backend to Hypothesis' own internals.

Next steps: once #3788 is merged, both (2) and also (3) can proceed concurrently. I (or another volunteer!) will define Hypothesis-side hooks for integration and a way to configure which backed to use, so that Crosshair can register as a Hypothesis plugin and automatically set that up. Phillip will clean up the Crosshair side of our demo, and we'll collaborate to ensure that Crosshair can be used on Hypothesis tests even before we finish up this issue and are ready for Hypothesis to use Crosshair as a backend.

It's been a fantastic week. Onwards!

Zac-HD · 2023-11-20T04:52:25Z

We've merged and released part (1) of the plan above! Next up, I'll work out how to add some nice-ish hooks for Crosshair and similar tools to plug themselves in as backends - and emit warnings until we get the other parts done. @tybug, do you want to continue on to (3)?

tybug · 2023-11-20T05:13:45Z

Yup, I'll tackle that next. I expect it will take me some time to wrap my head around the existing tree logic (but this is something I need a solid understanding of regardless, so that's ok!).

My current high-level understanding is that generate_novel_prefix actually deals with a tree right now as well, with the exception that the nodes are individual calls to draw_bits. I'm expecting to change this representation to have 5 intermediate node types instead. Generating a novel prefix would involve traversing the tree and forcing as normal, and drawing a new value from that particular node type when reaching a node that is not forced and not exhausted.

I'm not sure if there will be finicky things involving e.g. computing is_exhausted with this new IR yet - a task for future me 🙂

Zac-HD · 2023-11-20T19:19:53Z

That sounds right to me!

Note that we'll need to implement forcing for most of the primitives; iirc we currently only have that for bounded ints and bools. Maybe make an independent PR just with that; we'd then be able to assimilate stuff found by crosshair et al even before moving over the shrinker and database format.

Zac-HD · 2023-11-21T08:21:17Z

@pschanely: master...Zac-HD:hypothesis:provider-plugins has a draft of the pluggable hooks. I'm still considering whether register_new_backend() should live directly in the hypothesis namespace (probably yes, and just make it emit a warning for now?) and the docs are... rough... but this should be enough that you can test the crosshair changes.

pschanely · 2023-11-21T22:31:40Z

Super exciting. I've been working on a variety of compatibility issues since last week; I'm making good progress, but there are still things to investigate. I see now how you're getting around the bits that want to replay an execution - just skip them! (locally, I noticed another replay spot here that CrossHair dies on)

In the meantime, I have another issue ... that I probably should have foreseen. It isn't always safe to operate on symbolics outside the context manager. Integers like the one we tried will mostly work, but other things (strings especially) will blow up unless I have the appropriate interpreter intercepts installed. There are a few different ways that symbolics may outlive the test run:

The primitives that hypothesis will record in this tree structure that you're designing right now.
In an exception message or attribute.
In the return value of the test. (yes, tests should return None, but hypothesis may explode when trying to complain about the return value if it is symbolic)
(other places you can think of?)

Annnd, some potential solutions for each of the above:

It looks like the provider can already take the conjecture data. So I could peek in there when my context manager is about to exit and walk/replace the primitive tree with realized values.
I think hypothesis error handling happens outside my context manager, so I can probably realize the exception on its way out.
Ignore this for now? Alternatively, I could solve (2) & (3) by somehow wrapping the test function.

Thoughts?

Zac-HD · 2023-11-22T01:41:56Z

aha - that's exactly the replay spot I also identified, though I still need to wrap it in the context manager. We'll probably just have some more back-and-forth like this leading up to a coordinated release 🙂

Sounds like realizing on the way out of the context manager will solve (1) and (2); IMO simply not solving (3) is reasonable since it's a user error already, and if anything else comes up for (4) I'll post again here.

pschanely · 2023-12-03T00:34:06Z

Sounds good on not thinking about (3) yet. I took a little time to try and write something that would connect up with your provider-plugins branch, here. Ready to work on these in parallel!

In addition to the context managers, I'll need a way to share CrossHair's RootNode state in between runs for the same test, so maybe think about how you'd like that to happen. (maybe jam something onto the runner if I can access that or something?)

Zac-HD · 2023-12-03T03:41:50Z

In addition to the context managers, I'll need a way to share CrossHair's RootNode state in between runs for the same test, so maybe think about how you'd like that to happen. (maybe jam something onto the runner if I can access that or something?)

I've added this to the branch and opened a WIP pr - by having a global monkeypatchable function which returns a context manager (to wrap all runs for a test) which yields a function that returns a second context manager (to wrap each test case). Please forgive the other changes there, I'm in 'move fast and break things' mode while we iterate towards a workable design.

pschanely · 2023-12-20T22:05:56Z

Some small updates on my end:

Updated my intergration code to line up with this branch.
Updated the string generation to no longer pre-maturely realize the length of the string (using some hooks I put in CrossHair v0.0.46).

Zac-HD · 2023-12-22T09:58:22Z

TODO: detect whether this specific test is supposed to use the crosshair backend (somehow) and return nullcontext if it isn't. (link)

This seems like it calls for a real API design, since there might be multiple backends fighting over the hacky monkeypatchable thing😅

For now, checking whether hypothesis.settings.default.backend == "crosshair" should do the trick.

Zac-HD · 2024-03-10T05:33:52Z

Closing this in favor of #3914, since we've shipped an initial implementation ✨

Zac-HD added interop how to play nicely with other packages internals Stuff that only Hypothesis devs should ever see labels Sep 8, 2021

pschanely mentioned this issue Oct 8, 2021

Add support for checking hypothesis tests pschanely/CrossHair#45

Open

Zac-HD mentioned this issue Oct 25, 2021

Better heuristics for generating weird unicode strings #3127

Open

Zac-HD mentioned this issue Nov 3, 2021

Hypothesis tests are not deterministic pytorch/pytorch#25248

Closed

This was referenced Feb 1, 2023

Bypass small examples in stateful testing #3562

Closed

Construct and use a 'fuzzing dictionary' Zac-HD/hypofuzz#8

Open

Zac-HD mentioned this issue Oct 22, 2023

Change the core draw primitive to be selecting an integer from 0 to n #1986

Closed

tybug mentioned this issue Nov 12, 2023

Move float, integer, string, byte, and boolean drawing logic into PrimitiveProvider #3788

Merged

tybug mentioned this issue Nov 26, 2023

Support forcing all primitives #3801

Merged

Zac-HD mentioned this issue Dec 3, 2023

Add support for crosshair backend #3806

Merged

tybug mentioned this issue Dec 19, 2023

Migrate DataTree to the new IR #3818

Merged

5 tasks

pschanely mentioned this issue Mar 9, 2024

Heuristics for lengths of sequence-like symbolics pschanely/CrossHair#247

Open

Zac-HD mentioned this issue Mar 10, 2024

Stable support for symbolic execution #3914

Open

12 tasks

Zac-HD closed this as completed Mar 10, 2024

Zac-HD mentioned this issue Mar 14, 2024

migrate Float shrinker to the ir #3899

Merged

tybug mentioned this issue Mar 15, 2024

Migrate our core representation to an IR layer #3921

Open

8 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactoring `ConjectureData` to support symbolic execution #3086

Refactoring `ConjectureData` to support symbolic execution #3086

Zac-HD commented Sep 8, 2021 •

edited

Zac-HD commented Feb 21, 2022

pschanely commented Apr 28, 2022

Zac-HD commented Apr 28, 2022

pschanely commented Apr 29, 2022

Zac-HD commented Apr 29, 2022 •

edited

pschanely commented May 2, 2022

pschanely commented May 25, 2022

Zac-HD commented Oct 22, 2023

Zac-HD commented Nov 6, 2023 •

edited

Zac-HD commented Nov 17, 2023

Zac-HD commented Nov 20, 2023

tybug commented Nov 20, 2023

Zac-HD commented Nov 20, 2023

Zac-HD commented Nov 21, 2023

pschanely commented Nov 21, 2023 •

edited

Zac-HD commented Nov 22, 2023

pschanely commented Dec 3, 2023 •

edited

Zac-HD commented Dec 3, 2023

pschanely commented Dec 20, 2023

Zac-HD commented Dec 22, 2023

Zac-HD commented Mar 10, 2024

Refactoring ConjectureData to support symbolic execution #3086

Refactoring ConjectureData to support symbolic execution #3086

Comments

Zac-HD commented Sep 8, 2021 • edited

Zac-HD commented Feb 21, 2022

pschanely commented Apr 28, 2022

Zac-HD commented Apr 28, 2022

pschanely commented Apr 29, 2022

Zac-HD commented Apr 29, 2022 • edited

pschanely commented May 2, 2022

pschanely commented May 25, 2022

Zac-HD commented Oct 22, 2023

Zac-HD commented Nov 6, 2023 • edited

Zac-HD commented Nov 17, 2023

Zac-HD commented Nov 20, 2023

tybug commented Nov 20, 2023

Zac-HD commented Nov 20, 2023

Zac-HD commented Nov 21, 2023

pschanely commented Nov 21, 2023 • edited

Zac-HD commented Nov 22, 2023

pschanely commented Dec 3, 2023 • edited

Zac-HD commented Dec 3, 2023

pschanely commented Dec 20, 2023

Zac-HD commented Dec 22, 2023

Zac-HD commented Mar 10, 2024

Refactoring `ConjectureData` to support symbolic execution #3086

Refactoring `ConjectureData` to support symbolic execution #3086

Zac-HD commented Sep 8, 2021 •

edited

Zac-HD commented Apr 29, 2022 •

edited

Zac-HD commented Nov 6, 2023 •

edited

pschanely commented Nov 21, 2023 •

edited

pschanely commented Dec 3, 2023 •

edited