Improve performance of small tests that use rejection sampling #2030

Merged: 3 commits, Jul 4, 2019
9 changes: 9 additions & 0 deletions hypothesis-python/RELEASE.rst
@@ -0,0 +1,9 @@
RELEASE_TYPE: patch

This release fixes :issue:`1864`, where some simple tests would run very slowly
because they were rerun many times, with each subsequent run progressively
slower than the last. Such tests will now stop after a more reasonable number
of runs without hitting this problem.

Unless you are hitting exactly this issue, this release is unlikely to have
any effect, but certain classes of custom generators that are currently very
slow may become somewhat faster, or may start to trigger health check failures.
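The failure mode described above can be sketched in plain Python. This is a hypothetical illustration, not code from the PR: a rejection sampler over a tiny state space, with a cap on consecutive discards analogous to the limit of 20 this PR introduces (`rejection_sample` and its parameters are invented names for illustration).

```python
import random

def rejection_sample(rng, draw, accept, max_discards=20):
    """Draw values, discarding rejects; give up after max_discards
    consecutive discards, mirroring the cap this PR adds."""
    discards = 0
    while True:
        value = draw(rng)
        if accept(value):
            return value
        discards += 1
        if discards > max_discards:
            raise RuntimeError("too many consecutive discards; giving up")

rng = random.Random(0)
# Tiny actual state space: accept only even values drawn from range(4).
value = rejection_sample(rng, lambda r: r.randrange(4), lambda v: v % 2 == 0)
assert value % 2 == 0
```

Without the cap, an adversarial or unlucky bitstream can produce arbitrarily long runs of discarded draws, which is exactly the slowdown the release note describes.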
20 changes: 20 additions & 0 deletions hypothesis-python/src/hypothesis/internal/conjecture/data.py
@@ -752,6 +752,7 @@ def __init__(self, max_length, draw_bytes, observer=None):
        self.draw_times = []
        self.max_depth = 0
        self.has_discards = False
        self.consecutive_discard_counts = []

        self.__result = None
@@ -862,15 +863,34 @@ def start_example(self, label):
        if self.depth > self.max_depth:
            self.max_depth = self.depth
        self.__example_record.start_example(label)
        self.consecutive_discard_counts.append(0)

    def stop_example(self, discard=False):
        if self.frozen:
            return
        self.consecutive_discard_counts.pop()
        if discard:
            self.has_discards = True
        self.depth -= 1
        assert self.depth >= -1
        self.__example_record.stop_example(discard)
        if self.consecutive_discard_counts:
            # We block long sequences of discards. This helps us avoid performance
            # problems where there is rejection sampling. In particular tests which
            # have a very small actual state space but use rejection sampling will
            # play badly with generate_novel_prefix() in DataTree, and will end up
            # generating very long tests with long runs of the rejection sample.
            if discard:
                self.consecutive_discard_counts[-1] += 1
                # 20 is a fairly arbitrary limit chosen mostly so that all of the
                # existing tests passed under it. Essentially no reasonable
                # generation should hit this limit when running in purely random
                # mode, but unreasonable generation is fairly widespread, and our
                # manipulation of the bitstream can make it more likely.
                if self.consecutive_discard_counts[-1] > 20:
                    self.mark_invalid()
            else:
                self.consecutive_discard_counts[-1] = 0

    def note_event(self, event):
        self.events.add(event)
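The per-depth counter logic in `stop_example` can be modelled in isolation. The following toy class (a hypothetical sketch with invented names, not Hypothesis's actual API) shows how the stack of counters tracks consecutive discards at the enclosing depth, resets on any accepted example, and flags the run once the cap is exceeded:

```python
class DiscardTracker:
    """Toy model of the per-depth consecutive-discard counters."""

    LIMIT = 20  # same cap as the PR

    def __init__(self):
        self.counts = []      # one counter per currently open example
        self.invalid = False  # stands in for mark_invalid()

    def start_example(self):
        self.counts.append(0)

    def stop_example(self, discard=False):
        self.counts.pop()
        if self.counts:
            if discard:
                # This closed example was discarded: count it against
                # the enclosing example's consecutive-discard run.
                self.counts[-1] += 1
                if self.counts[-1] > self.LIMIT:
                    self.invalid = True
            else:
                # An accepted example breaks the run of discards.
                self.counts[-1] = 0

tracker = DiscardTracker()
tracker.start_example()  # enclosing example
for _ in range(25):
    tracker.start_example()
    tracker.stop_example(discard=True)
assert tracker.invalid  # 25 consecutive discards exceed the limit of 20
```

Note that the counter resets to zero whenever a non-discarded example closes, so only genuinely consecutive discards at the same depth can trip the limit.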
13 changes: 13 additions & 0 deletions hypothesis-python/tests/cover/test_conjecture_test_data.py
@@ -470,3 +470,16 @@ def test_example_equality():

    assert not (ex == "hello")
    assert ex != "hello"


def test_discarded_data_is_eventually_terminated():
    data = ConjectureData.for_buffer(hbytes(100))

    with pytest.raises(StopTest):
        for _ in hrange(100):
            data.start_example(1)
            data.draw_bits(1)
            data.stop_example(discard=True)

    assert data.status == Status.INVALID
1 change: 0 additions & 1 deletion hypothesis-python/tests/cover/test_simple_characters.py
@@ -130,7 +130,6 @@ def test_whitelisted_characters_override():
    assert_no_examples(st, lambda c: c not in good_characters + "0123456789")


@pytest.mark.skip  # temporary skip due to 560 second (!) perf regression; see #1864
def test_blacklisted_characters():
    bad_chars = u"te02тест49st"
    st = characters(