FailedHealthCheck and slow performance with just 400 examples of builtin strategies #2641
Can you share a complete reproducing example (i.e. including the test function), and run it with statistics enabled? FWIW I agree that "hundreds of elements" is not a large dataset, but it's on the large side for Hypothesis to generate, since we're working with about 8K of entropy and might be doing a lot of work to find a minimal example if the test ever fails. I think my advice will end up being to consider the tradeoff between few large/slow examples and many small/faster examples: which would find the most bugs per minute of runtime? If the large ones, disabling the health check is probably the way to go 😕
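The tradeoff described above can be sketched as two configurations of the same test. The strategy and all names below are illustrative stand-ins, not code from this issue:

```python
from hypothesis import given, settings, strategies as st

# Hypothetical stand-in for the issue's profile strategy.
profiles = st.fixed_dictionaries(
    {"name": st.text(max_size=10), "age": st.integers(0, 120)}
)

# Option A: many small, fast examples -- often more bugs found per minute.
@settings(max_examples=200)
@given(st.lists(profiles, max_size=5))
def test_many_small(profile_list):
    assert all(0 <= p["age"] <= 120 for p in profile_list)

# Option B: a few large examples -- slower per run, and large enough
# inputs may trip the data_too_large health check.
@settings(max_examples=10)
@given(st.lists(profiles, max_size=100))
def test_few_large(profile_list):
    assert isinstance(profile_list, list)
```

Either shape is valid; the question is only which spends the test-time budget better for the bugs you expect.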
What exactly do you need? The test is the simplest one for a list of a RESTful resource:

```python
class TestProfileList(object):
    """Test reading the profile list."""

    @given(
        profile_data_lists,
    )
    def test_ok(
        self,
        api_client,
        profile_data_list_example,
    ):
        """Test ok path."""
        with mock_v0_response(profile_data_list_example):
            res = api_client.get(reverse('api_v1:profile-list'))
            assert res.status_code == status.HTTP_200_OK, res.data
```

Thank you for the general trade-off idea. In my case smaller sizes should work better; I wanted to have long-data testing "by the way". :)
Reduced the lists' max_size to 50, still failing.
Along with the error, the statistics flag changed nothing:
I'll try to reduce size/examples slightly to get the statistics out.
Reduced max_size to 30, here are the stats:
The last interesting observation to mention is the non-linear growth of the running time:
+25% examples vs +186% time
+20% examples vs +261% time
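Reading "+186% time" as a 2.86x factor, a quick calculation makes the super-linearity concrete — the cost per generated example is itself growing:

```python
# Two data points reported above: (example growth factor, time growth factor).
growth = [
    (1.25, 2.86),  # +25% examples -> +186% time
    (1.20, 3.61),  # +20% examples -> +261% time
]

for ex, t in growth:
    # If generation cost were linear in example count, t / ex would be ~1.0.
    print(f"time grew {t / ex:.2f}x faster than the example count")
```

Both ratios are well above 1, so each extra example is costing more than the last.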
Well, things have got worse. I don't know if I should create a new issue for this, but the source data and test are the same. Eventually I went down to max_size = 20 x max_examples = 20. Even that spawned "data too large" from time to time.
But that wasn't the end! Now it complains:
@Zac-HD, why does hypothesis decide for the developer which example size is too big? Should we add a property to set it per project? Or maybe drop the max size filter when the health check is suppressed? I'm feeling kinda frustrated: after I suppressed the data size limit, hypothesis still effectively says "OK, now I'll just become slower for your big data anyway!".
Having looked into this again, I don't think there's much that we can do, sorry 😥 The basic problem arises from the combination of a few facts:
So I'd recommend suppressing the health check. I wish I had a better solution, but here we are 😕
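For reference, suppressing just this one check looks like the following. The strategy shown is a hypothetical stand-in, not the reporter's:

```python
from hypothesis import HealthCheck, given, settings, strategies as st

# Suppress only data_too_large, leaving the other health checks active.
@settings(suppress_health_check=[HealthCheck.data_too_large])
@given(st.lists(st.integers(), max_size=200))
def test_big_lists(xs):
    assert len(xs) <= 200
```

Scoping the suppression to the one failing check (rather than all of `HealthCheck`) keeps the other safety nets in place.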
We're doing an integration test on a pipeline which requires that the input dataset be at least a minimum size. Since this integration test is rather slow, we are happy testing it on a single sample (we also unit test separately):

```python
@given(df=pandera_schema.strategy(size=9))
@settings(
    max_examples=1,
    deadline=3000,  # 3s in ms
    suppress_health_check=[
        hypothesis.HealthCheck.function_scoped_fixture,
        hypothesis.HealthCheck.large_base_example,
        hypothesis.HealthCheck.data_too_large,
    ],
)
def long_test(df): ...
```

As you can see, we've disabled the relevant health checks. We are disabling this maximum size check on our (tiny, IMO) dataframes with settings like the above. In short, I agree with @theoden-dd that having to disable the check like this is unfortunate.
Unfortunately the error is telling you that Hypothesis literally ran out of random bytes to parse into your dataframe, and there's not really anything that we can do about that. I suspect that in your case the underlying problem is actually an inefficient implementation in Pandera. I suggest changing the pipeline to allow fewer rows and/or columns, or opening an issue on Pandera about this performance problem. FWIW, while this is too niche for us to fix for free, we also do consulting contracts if this is actually important to your company (and e.g. Stripe was very happy last time we worked with dataframes).
Ah! If pandera is using rejection sampling then this problem makes total sense.
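To see why rejection sampling is so costly here: a generator that draws candidates and discards those failing a constraint multiplies the entropy and time spent per accepted value by the inverse of the acceptance rate. A minimal, library-free sketch of the idea (not Pandera's actual code):

```python
import random

def rejection_sample(constraint, draw, rng, max_tries=10_000):
    """Draw candidates until one satisfies `constraint`; count the waste."""
    tries = 0
    while tries < max_tries:
        tries += 1
        x = draw(rng)
        if constraint(x):
            return x, tries
    raise RuntimeError("gave up after max_tries draws")

rng = random.Random(0)
# Accepting only multiples of 50 from 0..999 gives a ~2% acceptance rate,
# so on average ~50 draws (and ~50x the random bytes) per accepted value.
value, tries = rejection_sample(
    lambda n: n % 50 == 0, lambda r: r.randrange(1000), rng
)
print(value, tries)
```

In Hypothesis, every one of those wasted draws still consumes the fixed entropy budget, which is exactly how you "run out of random bytes".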
Exactly 😁 "Improve pandera performance" is definitely on my todo list (albeit a long way from the top), and #2701 should also help.
Thank you a lot for an excellent property-based testing library 🙏 Is there any way to bind
Context: I decided to write this message here instead of opening a new issue, because I suspect there are good reasons not to have such bindings.
Maybe the answer is simply "disable the health check then", as for #2195. I just want to report a performance reproducer.
I've got a simple profile strategy:
where `ImageFileType` and `IdolStatus` are some enums.
With 10 and 100 examples it works just fine, but starting from 400 it generates errors like:
I wonder why 100 elements in a list became a "large data set".
As a workaround I reduced the lists' `max_size` to 50; still, a single test takes up to 80-90 seconds. The test is simple, the lag is all in this data generation.
E.g. with 10 examples it takes 1.45 seconds.
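The strategy itself didn't survive in the thread, but from the description it presumably looked something like the sketch below. Every name and field here is a guess for illustration only; only `ImageFileType`, `IdolStatus`, and the list shape come from the report:

```python
import enum

from hypothesis import given, settings, strategies as st

# Hypothetical enums standing in for the issue's ImageFileType / IdolStatus.
class ImageFileType(enum.Enum):
    JPEG = "jpeg"
    PNG = "png"

class IdolStatus(enum.Enum):
    ACTIVE = "active"
    RETIRED = "retired"

# Each profile draws several fields, and the list draws up to 100 profiles,
# so entropy per example multiplies quickly -- the likely root of the
# "large data set" complaint.
profile_data = st.fixed_dictionaries({
    "name": st.text(min_size=1, max_size=20),
    "image_type": st.sampled_from(ImageFileType),
    "status": st.sampled_from(IdolStatus),
})
profile_data_lists = st.lists(profile_data, max_size=100)
```

With a shape like this, shrinking `max_size` (or the per-field sizes) reduces entropy per example far more effectively than shrinking `max_examples`.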