Fix a swarm-testing footgun #3894

Zac-HD · 2024-02-23T08:46:21Z

So, you know how we used swarm testing to choose a random subset of rules to enable on a RuleBasedStateMachine? Turns out that we could choose the empty set, which promptly fails the test because it's impossible to choose a rule from the empty set. Oops? This was still pretty rare even with a single-rule machine, but also a totally self-inflicted problem.

Also improves some reprs so that the status_reason in observability output looks nicer for stateful testing, closing #3845.

Also also fixes #3892, without docs because it's just improving the error message for something that already didn't work.
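For readers unfamiliar with the failure mode, the empty-subset problem described above can be sketched roughly as follows. This is a simplified illustration, not Hypothesis's actual implementation: `choose_enabled_rules`, `choose_enabled_rules_nonempty`, and the explicit `p_disabled` parameter are hypothetical names, and the real failure rate depends on details the sketch ignores.

```python
import random


def choose_enabled_rules(rules, p_disabled, rng):
    """Naive swarm selection: each rule is disabled independently
    with probability p_disabled (hypothetical helper)."""
    return [r for r in rules if rng.random() >= p_disabled]


def choose_enabled_rules_nonempty(rules, p_disabled, rng):
    """Same selection, but never returns the empty set."""
    enabled = choose_enabled_rules(rules, p_disabled, rng)
    if not enabled:
        # Assumption: fall back to a single randomly chosen rule so that
        # there is always something to run.
        enabled = [rng.choice(rules)]
    return enabled


rng = random.Random(0)
rules = ["push", "pop"]

# With p_disabled == 1.0 the naive version disables everything.
print(choose_enabled_rules(rules, 1.0, rng))           # []
# The guarded version always keeps at least one rule.
print(len(choose_enabled_rules_nonempty(rules, 1.0, rng)))  # 1
```

The guard here is one plausible way to avoid the empty set; the actual fix in this PR may work differently.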

tybug · 2024-02-24T04:23:36Z

The following patch reproduces (some of?) the CI failures consistently, e.g. with -k test_settings_decorator_applies_to_rule_based_state_machine_class:

diff --git a/hypothesis-python/src/hypothesis/strategies/_internal/featureflags.py b/hypothesis-python/src/hypothesis/strategies/_internal/featureflags.py
index 1e321e744..4b1e90e5b 100644
--- a/hypothesis-python/src/hypothesis/strategies/_internal/featureflags.py
+++ b/hypothesis-python/src/hypothesis/strategies/_internal/featureflags.py
@@ -53,6 +53,7 @@ class FeatureFlags:
         # of more features being enabled.
         if self.__data is not None:
             self.__p_disabled = data.draw_integer(0, 255) / 255.0
+            self.__p_disabled = 1
         else:
             # If data is None we're in example mode so all that matters is the
             # enabled/disabled lists above. We set this up so that everything

since we're forcing false but drawing true with probability 1. Possibly __p_disabled should be 254 / 255?
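A minimal sketch of why forcing False conflicts with a probability of exactly 1, and why capping at 254 / 255 would avoid it. The `draw_boolean` and `is_disabled` functions below are hypothetical stand-ins written for illustration; they are not Hypothesis's real API, and raising `ValueError` is an assumption about how such a conflict might surface.

```python
import random


def draw_boolean(p_true, forced=None, rng=random):
    """Hypothetical forced boolean draw; not Hypothesis's real API."""
    # A forced value must be one the distribution could actually produce.
    if forced is True and p_true <= 0.0:
        raise ValueError("cannot force True when p_true == 0")
    if forced is False and p_true >= 1.0:
        raise ValueError("cannot force False when p_true == 1")
    if forced is not None:
        return forced
    return rng.random() < p_true


def is_disabled(p_disabled, must_enable):
    # When a rule must stay enabled, force the "disabled" draw to False.
    return draw_boolean(p_disabled, forced=False if must_enable else None)


# 254 / 255 leaves a nonzero chance of "enabled", so forcing False is fine.
print(is_disabled(254 / 255, must_enable=True))  # False

# 255 / 255 == 1.0 makes "enabled" impossible, so forcing False is rejected.
try:
    is_disabled(255 / 255, must_enable=True)
except ValueError as err:
    print("rejected:", err)
```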

tybug · 2024-02-24T04:47:45Z

Found a weird one while debugging this. If your stateful test sets a step_count attribute, things can flake / go haywire, because hypothesis increments that in _repr_step while incorrectly assuming it's an internal attribute. I think this is an unused vestige of an old implementation, so I've just removed it.

reproducer

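import unittest

from hypothesis.stateful import RuleBasedStateMachine, invariant, rule


class NumberModifier(RuleBasedStateMachine):
    # User-defined attribute that happens to share its name with the
    # attribute hypothesis increments in _repr_step.
    step_count = 0

    @rule()
    def count_step(self):
        self.step_count += 1

    @invariant()
    def divide_with_one(self):
        # Checked after every step, so the outcome depends on who else
        # has been incrementing step_count.
        assert self.step_count % 2 == 0


TestCase = NumberModifier.TestCase

if __name__ == "__main__":
    unittest.main()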
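The name collision behind this reproducer can be sketched without Hypothesis at all. `MiniMachine` below is a hypothetical stand-in for a stateful base class, written only to show the mechanism; it is not how `RuleBasedStateMachine` is implemented beyond the one detail described above (incrementing `step_count` in `_repr_step`).

```python
class MiniMachine:
    """Hypothetical stand-in for a stateful-test base class."""

    def _repr_step(self, rule_name):
        # The framework treats step_count as private bookkeeping, but the
        # name is not reserved, so a subclass may be using it too.
        self.step_count = getattr(self, "step_count", 0) + 1
        return f"state.{rule_name}()"

    def run_rule(self, rule_name):
        self._repr_step(rule_name)
        getattr(self, rule_name)()


class NumberModifier(MiniMachine):
    step_count = 0  # user attribute with the same name

    def count_step(self):
        self.step_count += 1


machine = NumberModifier()
machine.run_rule("count_step")
# One rule ran, but the counter was bumped twice: once by the framework
# and once by the user's rule.
print(machine.step_count)  # 2
```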

Zac-HD · 2024-02-24T05:10:42Z

Nice, I accidentally force-pushed over your commit (wish I could alias -f to --force-with-lease...) but that was very easy to fix from your description.

https://github.com/HypothesisWorks/hypothesis/actions/runs/8028323348/job/21933383800?pr=3894#step:7:98 is unrelated, but looks like it might be an IR flake? I think I've seen similar things once or twice before.

tybug · 2024-02-24T05:17:20Z

Interesting, I'll take a look. May turn out to need the same treatment as test_can_generate_hard_floats of "assert expected value, not expected buffer".

Zac-HD added the "performance" label (go faster! use less memory!) Feb 23, 2024

Zac-HD requested a review from tybug February 23, 2024 08:46

Zac-HD force-pushed the efficient-stateful branch 2 times, most recently from 0d72399 to 16d0ee7 Compare February 24, 2024 04:50

Zac-HD added 5 commits February 23, 2024 20:51

Code movement for a later PR

367a98f

Better error for too-large min_size

6a73e7e

Avoid disabling every rule

3ae6e44

Better status_reason obs on abort

efd38d0

Remove .step_count

77f596f

Zac-HD force-pushed the efficient-stateful branch from 16d0ee7 to 77f596f Compare February 24, 2024 05:14

Zac-HD merged commit 202d6af into HypothesisWorks:master Feb 24, 2024
48 checks passed

Zac-HD deleted the efficient-stateful branch February 24, 2024 05:56

Zac-HD mentioned this pull request Mar 12, 2024

RuleBasedStateMachine is prone to Unsatisfiable errors #3618

Closed
