Add support for checking hypothesis tests #45

Open
pschanely opened this issue Jan 26, 2021 · 9 comments

@pschanely (Owner)

You know what would be amazing? Symbolically executing tests written for Hypothesis! For example:

from hypothesis import given, strategies as st

@given(st.integers(min_value=0))
def test(x):
    assert x ** 2 >= x

# could be equivalent to

def test(x: int):
    """pre: x >= 0"""
    assert x ** 2 >= x

this is particularly interesting to me because (a) I'm a core dev of Hypothesis 😜, (b) Hypothesis tests have to precisely describe the domain of valid inputs, and (c) they have a high density of assertions about error-prone behaviour!

I don't have time to write this myself (at least in the next few months), but would be delighted to help anyone who wants to have a go.

Originally posted by @Zac-HD in #21 (comment)

@Zac-HD (Contributor) commented Jan 28, 2021

See also this set of notes on the topic; and FYI I have a prospective Honours student looking at working on this 😁

@Zac-HD (Contributor) commented Mar 25, 2021

@JinghanCode is looking at this now, with the work-in-progress over at JinghanCode#1 🎉

Overall plan: start by translating a subset of Hypothesis strategies into CrossHair constraints, in order to get a baseline for the more general "choice sequence"-based approach (see notes, above). It's unclear at this stage whether we'd want to ship this as part of Hypothesis or CrossHair - it's pretty sensitive to unstable/internal implementation details, and there are many many tests it can't handle - but we do expect to learn something useful 😁

@pschanely (Owner, Author)

This is so cool. Let me know how I can help.
Starting out with it integrated on the CrossHair side makes sense to me, as it's far more experimental anyway. CrossHair will never detect 100% of cases, so even if we only handle basic cases, I'd be inclined to merge this kind of work :)

@Zac-HD (Contributor) commented Mar 27, 2021

> This is so cool. Let me know how I can help.

Will do! That's likely to consist of me pinging you for code review + advice on how to evaluate it in a few weeks-months, once the basics are working.

> Starting out with it integrated on the CrossHair side makes sense to me, as it's far more experimental anyway. CrossHair will never detect 100% of cases, so even if we only handle basic cases, I'd be inclined to merge this kind of work :)

My hesitation is because it depends on actually-for-real unstable internal implementation details of Hypothesis, which would make for a poor user experience over time. But I'm also keen to ship handling for the cases we can handle somewhere, and with your support a hypothesis[crosshair] extra could be pretty cool - right up to "we proved this always passes, skipping input generation" 😁

@pschanely (Owner, Author)

> That's likely to consist of me pinging you for code review + advice on how to evaluate it in a few weeks-months, once the basics are working.

Great; reach out anytime, even if it's just for some help along the way. I've also set up regular CrossHair discussion time that's open to the public, here, on Calendly. (But don't feel constrained to those days/hours.)

> My hesitation is because it depends on actually-for-real unstable internal implementation details of Hypothesis, which would make for a poor user experience over time. But I'm also keen to ship handling for the cases we can handle somewhere, and with your support a hypothesis[crosshair] extra could be pretty cool - right up to "we proved this always passes, skipping input generation" 😁

Got it. Yeah, it would be amazing to see this as a hypothesis extra. FWIW, if we do end up deciding to put this on the CrossHair side, I'm happy to make sure CrossHair stays compatible.

It makes sense to think about this as an experiment for now, but eventually it might be worth thinking about structuring the bridge logic in a way that could plausibly become part of the strategy interface. For instance, CrossHair would be perfectly happy with an ability to query arbitrary strategies for (1) a (possibly parameterized) type and (2) a post-processing function for mapping/filtering. (though I'm sure my mental model of a strategy is overly naive!)
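
To make that concrete, here is a purely hypothetical sketch of such a query interface. None of these names exist in Hypothesis or CrossHair today; they are invented for illustration only:

from dataclasses import dataclass
from typing import Any, Callable, Optional

@dataclass
class SymbolicRecipe:
    """What a strategy-to-CrossHair bridge might ask an arbitrary strategy for."""
    # (1) a (possibly parameterized) type that CrossHair already knows how
    #     to make symbolic, e.g. int or list[int]
    symbolic_type: Any
    # (2) optional post-processing: a transformation covering .map() ...
    map_fn: Optional[Callable[[Any], Any]] = None
    # ... and a predicate covering .filter() and bounds like min_value=0
    precondition: Optional[Callable[[Any], bool]] = None

# e.g. what st.integers(min_value=0) could report about itself
def recipe_for_nonnegative_integers() -> SymbolicRecipe:
    return SymbolicRecipe(symbolic_type=int, precondition=lambda x: x >= 0)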

@pschanely (Owner, Author)

@JinghanCode and @Zac-HD: how did this investigation go? Looks like you got somewhere with it, and I am very curious about your overall findings!

@Zac-HD (Contributor) commented Jul 13, 2021

https://github.com/JinghanCode/Crosshair-Hypothesis/pull/1/files shows how far we got: based on some runtime introspection, we can analyse @given(...) tests with st.integers(...) - as well as one_of(), .map(), and .filter()!

The challenge for future work is to investigate how this scales out to cover more strategies (e.g. st.lists()), or to try plugging in at the lower-level conjecture (choice-sequence) interface as we discussed here.
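
To show the shape of that translation, here is a hand-written illustration (not code from the linked PR) of how a composed strategy could be rewritten into the contract style shown at the top of this issue:

from hypothesis import given, strategies as st

@given(st.integers(min_value=0).map(lambda n: 2 * n).filter(lambda x: x != 10))
def test_hypothesis_style(x):
    assert x % 2 == 0

# An equivalent contract-style function: the base strategy and the .filter()
# call become preconditions, and the .map() call becomes an explicit
# transformation at the start of the body.
def test_crosshair_style(n: int) -> None:
    """
    pre: n >= 0
    pre: 2 * n != 10
    """
    x = 2 * n
    assert x % 2 == 0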

@pschanely (Owner, Author)

Note: a very basic and slow integration now exists (see this post about it)!
Try it out with crosshair watch --analysis_kind=hypothesis <file.py>
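
For example, given a file like the following (the file name and the property are illustrative):

# example_hypothesis_test.py
from hypothesis import given, strategies as st

@given(st.integers(min_value=0))
def test_square_is_at_least_its_input(x: int) -> None:
    assert x * x >= x

running crosshair watch --analysis_kind=hypothesis example_hypothesis_test.py will look for counterexamples to the @given test symbolically.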

The next steps to make this more effective include some refactoring on the hypothesis side; this issue in particular.

@pschanely (Owner, Author)

Now when CrossHair finds a hypothesis counterexample, it's saved to the hypothesis database. That means even regular runs of hypothesis will try inputs that CrossHair has found! (via #185; thank you @Zac-HD !!)
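
For anyone curious what that means in practice: Hypothesis keeps failing examples in a local example database (a .hypothesis/examples directory by default), and inputs recorded there are replayed before new ones are generated. A minimal sketch, with the database pinned explicitly just to make the mechanism visible (the test itself is illustrative):

from hypothesis import given, settings, strategies as st
from hypothesis.database import DirectoryBasedExampleDatabase

# This is Hypothesis's default database location, so normally no
# configuration is needed at all.
@settings(database=DirectoryBasedExampleDatabase(".hypothesis/examples"))
@given(st.integers())
def test_replays_saved_inputs(x: int) -> None:
    # Inputs previously recorded for this test - whether saved by Hypothesis
    # itself or, per the comment above, written by CrossHair - are tried
    # before freshly generated ones.
    assert x + 1 > x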
