Add support for checking hypothesis tests #45

Open
pschanely opened this issue Jan 26, 2021 · 9 comments

@pschanely (Owner)

You know what would be amazing? Symbolically executing tests written for Hypothesis! For example:

from hypothesis import given, strategies as st

@given(st.integers(min_value=0))
def test(x):
    assert x ** 2 >= x

# could be equivalent to

def test(x: int):
    """pre: x >= 0"""
    assert x ** 2 >= x

this is particularly interesting to me because (a) I'm a core dev of Hypothesis 😜, (b) Hypothesis tests have to precisely describe the domain of valid inputs, and (c) they have a high density of assertions about error-prone behaviour!

I don't have time to write this myself (at least in the next few months), but would be delighted to help anyone who wants to have a go.

Originally posted by @Zac-HD in #21 (comment)

@Zac-HD (Contributor) commented Jan 28, 2021

See also this set of notes on the topic; and FYI I have a prospective Honours student looking at working on this 😁

@Zac-HD (Contributor) commented Mar 25, 2021

@JinghanCode is looking at this now, with the work-in-progress over at JinghanCode#1 🎉

Overall plan: start by translating a subset of Hypothesis strategies into CrossHair constraints, in order to get a baseline for the more general "choice sequence"-based approach (see notes, above). It's unclear at this stage whether we'd want to ship this as part of Hypothesis or CrossHair - it's pretty sensitive to unstable/internal implementation details, and there are many many tests it can't handle - but we do expect to learn something useful 😁

@pschanely (Owner, Author)

This is so cool. Let me know how I can help.
Starting out with it integrated on the CrossHair side makes sense to me, as it's far more experimental anyway. CrossHair will never detect 100% of cases, so even if we only handle basic cases, I'd be inclined to merge this kind of work :)

@Zac-HD (Contributor) commented Mar 27, 2021

> This is so cool. Let me know how I can help.

Will do! That's likely to consist of me pinging you for code review + advice on how to evaluate it in a few weeks-months, once the basics are working.

> Starting out with it integrated on the CrossHair side makes sense to me, as it's far more experimental anyway. CrossHair will never detect 100% of cases, so even if we only handle basic cases, I'd be inclined to merge this kind of work :)

My hesitation is because it depends on actually-for-real unstable internal implementation details of Hypothesis, which would make for a poor user experience over time. But I'm also keen to ship handling for the cases we can handle somewhere, and with your support a hypothesis[crosshair] extra could be pretty cool - right up to "we proved this always passes, skipping input generation" 😁

@pschanely (Owner, Author)

> That's likely to consist of me pinging you for code review + advice on how to evaluate it in a few weeks-months, once the basics are working.

Great; reach out anytime, even if it's just for some help along the way. I've also set up regular CrossHair discussion time that's open to the public, here, on Calendly. (But don't feel constrained to those days/hours.)

> My hesitation is because it depends on actually-for-real unstable internal implementation details of Hypothesis, which would make for a poor user experience over time. But I'm also keen to ship handling for the cases we can handle somewhere, and with your support a hypothesis[crosshair] extra could be pretty cool - right up to "we proved this always passes, skipping input generation" 😁

Got it. Yeah, it would be amazing to see this as a hypothesis extra. FWIW, if we do end up deciding to put this on the CrossHair side, I'm happy to make sure CrossHair stays compatible.

It makes sense to think about this as an experiment for now, but eventually it might be worth thinking about structuring the bridge logic in a way that could plausibly become part of the strategy interface. For instance, CrossHair would be perfectly happy with an ability to query arbitrary strategies for (1) a (possibly parameterized) type and (2) a post-processing function for mapping/filtering. (though I'm sure my mental model of a strategy is overly naive!)
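
To make that concrete, here is a purely hypothetical sketch of such a query interface. None of these names exist in Hypothesis or CrossHair today; they are invented for illustration only:

from dataclasses import dataclass
from typing import Any, Callable, Optional

@dataclass
class SymbolicRecipe:
    """What a strategy-to-CrossHair bridge might ask an arbitrary strategy for."""
    # (1) a (possibly parameterized) type that CrossHair already knows how
    #     to make symbolic, e.g. int or list[int]
    symbolic_type: Any
    # (2) optional post-processing: a transformation covering .map() ...
    map_fn: Optional[Callable[[Any], Any]] = None
    # ... and a predicate covering .filter() and bounds like min_value=0
    precondition: Optional[Callable[[Any], bool]] = None

# e.g. what st.integers(min_value=0) could report about itself
def recipe_for_nonnegative_integers() -> SymbolicRecipe:
    return SymbolicRecipe(symbolic_type=int, precondition=lambda x: x >= 0)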

@pschanely (Owner, Author)

@JinghanCode and @Zac-HD: how did this investigation go? Looks like you got somewhere with it, and I am very curious about your overall findings!

@Zac-HD (Contributor) commented Jul 13, 2021

https://github.com/JinghanCode/Crosshair-Hypothesis/pull/1/files shows how far we got: based on some runtime introspection, we can analyse @given(...) tests with st.integers(...) - as well as one_of(), .map(), and .filter()!

The challenge for future work is to investigate how this scales out to cover more strategies (e.g. st.lists()), or to try plugging in at the lower-level conjecture (choice-sequence) interface as we discussed here.
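
To show the shape of that translation, here is a hand-written illustration (not code from the linked PR) of how a composed strategy could be rewritten into the contract style shown at the top of this issue:

from hypothesis import given, strategies as st

@given(st.integers(min_value=0).map(lambda n: 2 * n).filter(lambda x: x != 10))
def test_hypothesis_style(x):
    assert x % 2 == 0

# An equivalent contract-style function: the base strategy and the .filter()
# call become preconditions, and the .map() call becomes an explicit
# transformation at the start of the body.
def test_crosshair_style(n: int) -> None:
    """
    pre: n >= 0
    pre: 2 * n != 10
    """
    x = 2 * n
    assert x % 2 == 0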

@pschanely (Owner, Author)

Note: a very basic and slow integration now exists (see this post about it)!
Try it out with crosshair watch --analysis_kind=hypothesis <file.py>
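
For example, given a file like the following (the file name and the property are illustrative):

# example_hypothesis_test.py
from hypothesis import given, strategies as st

@given(st.integers(min_value=0))
def test_square_is_at_least_its_input(x: int) -> None:
    assert x * x >= x

running crosshair watch --analysis_kind=hypothesis example_hypothesis_test.py will look for counterexamples to the @given test symbolically.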

The next steps to make this more effective include some refactoring on the hypothesis side; this issue in particular.

@pschanely (Owner, Author)

Now when CrossHair finds a hypothesis counterexample, it's saved to the hypothesis database. That means even regular runs of hypothesis will try inputs that CrossHair has found! (via #185; thank you @Zac-HD !!)
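
For anyone curious what that means in practice: Hypothesis keeps failing examples in a local example database (a .hypothesis/examples directory by default), and inputs recorded there are replayed before new ones are generated. A minimal sketch, with the database pinned explicitly just to make the mechanism visible (the test itself is illustrative):

from hypothesis import given, settings, strategies as st
from hypothesis.database import DirectoryBasedExampleDatabase

# This is Hypothesis's default database location, so normally no
# configuration is needed at all.
@settings(database=DirectoryBasedExampleDatabase(".hypothesis/examples"))
@given(st.integers())
def test_replays_saved_inputs(x: int) -> None:
    # Inputs previously recorded for this test - whether saved by Hypothesis
    # itself or, per the comment above, written by CrossHair - are tried
    # before freshly generated ones.
    assert x + 1 > x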
