Efficient Hypothesis strategies #1503

Merged
merged 2 commits into from Feb 22, 2024

Zac-HD
Contributor

@Zac-HD Zac-HD commented Feb 21, 2024

This pull request fixes #404, which I opened a few years ago to address performance problems caused by rejection sampling in your Hypothesis strategies; that issue was prompted by this Stack Overflow question.

Recent Hypothesis versions can usually rewrite filters expressed as partial(operator.xxx, bound), so this style is considerably more efficient in most cases. The only downside is that the partial() calls read "backwards" and take a few minutes to get used to: lambda x: x < y becomes partial(op.gt, y), which is equivalent to lambda x: y > x.
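To make the "backwards" argument order concrete, here is a minimal stdlib-only sketch. The bound `y`, the sample values, and the `REWRITES` table are illustrative assumptions, not code from this PR; the Hypothesis call in the comment only shows where such a predicate would be passed.

```python
import operator as op
from functools import partial

y = 10  # hypothetical bound, chosen for illustration

# partial(op.gt, y)(x) calls op.gt(y, x), i.e. y > x, which is x < y.
less_than_y_lambda = lambda x: x < y
less_than_y_partial = partial(op.gt, y)

# Each lambda comparison maps to the "mirrored" operator in partial form.
REWRITES = {
    "x < y": (lambda x: x < y, partial(op.gt, y)),
    "x <= y": (lambda x: x <= y, partial(op.ge, y)),
    "x > y": (lambda x: x > y, partial(op.lt, y)),
    "x >= y": (lambda x: x >= y, partial(op.le, y)),
}

# Check that both forms agree below, at, and above the bound.
for name, (as_lambda, as_partial) in REWRITES.items():
    for x in (9, 10, 11):
        assert as_lambda(x) == as_partial(x), name

# With Hypothesis installed, the partial form would be used as, for example:
#   st.integers().filter(partial(op.gt, y))
```

The partial form is an ordinary callable, so it can be swapped in wherever the lambda was used without changing which values pass the filter.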

In the process, I also fixed two regex-related bugs where you'd see different behavior between the first and subsequent filters:

  • str_matches_strategy used fullmatch for the first filter, but match for subsequent filters, allowing generation of data with a disallowed suffix
  • for the first filter, str_startswith_strategy and str_endswith_strategy prepended/appended a regex boundary to the pattern. However, if the pattern includes alternation (e.g. a|b), this boundary would only be applied to the first/last branch, and thus invalid data could be generated. Placing the user's pattern inside a group resolves this problem.
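Both bugs can be reproduced with the standard `re` module alone. The sketch below is illustrative only: the helper names (`naive_startswith`, `grouped_startswith`, and so on) are hypothetical, and the `^`/`$` anchors stand in for whatever boundary the strategies actually add, which may differ in detail.

```python
import re

# Bug 1: match only anchors at the start, so a trailing suffix slips through.
assert re.match("ab", "abXYZ") is not None   # accepted despite the suffix
assert re.fullmatch("ab", "abXYZ") is None   # rejected, as intended


# Bug 2: an anchor added next to an alternation binds to one branch only.
def naive_startswith(pattern: str) -> str:
    return f"^{pattern}"        # "^a|b" means "^a" OR "b" anywhere


def grouped_startswith(pattern: str) -> str:
    return f"^(?:{pattern})"    # the anchor now applies to both branches


def naive_endswith(pattern: str) -> str:
    return f"{pattern}$"        # "a|b$" means "a" anywhere OR "b$"


def grouped_endswith(pattern: str) -> str:
    return f"(?:{pattern})$"


# "xb" does not start with "a" or "b", yet the naive pattern finds a match.
assert re.search(naive_startswith("a|b"), "xb") is not None
assert re.search(grouped_startswith("a|b"), "xb") is None

# "ax" does not end with "a" or "b", yet the naive pattern finds a match.
assert re.search(naive_endswith("a|b"), "ax") is not None
assert re.search(grouped_endswith("a|b"), "ax") is None
```

A non-capturing group `(?:...)` is used here so that wrapping the pattern does not shift the numbering of any capture groups the user wrote.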

Finally, I've updated the minimum Hypothesis version to the one required for efficient length filtering, and included some regex patterns for which the corresponding Hypothesis issue is still open, so that they'll become efficient for your users as soon as we ship that.

Zac-HD and others added 2 commits February 22, 2024 12:00
Signed-off-by: Zac Hatfield-Dodds <zac.hatfield.dodds@gmail.com>
Signed-off-by: cosmicBboy <niels.bantilan@gmail.com>

codecov bot commented Feb 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.29%. Comparing base (4df61da) to head (3df43d4).
Report is 15 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1503   +/-   ##
=======================================
  Coverage   94.29%   94.29%           
=======================================
  Files          91       91           
  Lines        7024     7029    +5     
=======================================
+ Hits         6623     6628    +5     
  Misses        401      401           

Collaborator

@cosmicBboy cosmicBboy left a comment

Thanks @Zac-HD !

@cosmicBboy cosmicBboy merged commit 10cac40 into unionai-oss:main Feb 22, 2024
74 checks passed
@Zac-HD Zac-HD deleted the bugfix/hypothesis-strategies branch February 23, 2024 00:26
@Zac-HD
Contributor Author

Zac-HD commented Feb 23, 2024

Woohoo! I wonder if we'll get user reports of their tests suddenly working much faster 😁

Development

Successfully merging this pull request may close these issues.

Make Hypothesis strategies more efficient with statistics resolver and reducing use of .filter()
2 participants