
Massive performance drop in test_blacklisted_characters #1864

Closed
Zalathar opened this issue Mar 12, 2019 · 6 comments · Fixed by #2030
Labels
performance (go faster! use less memory!)

Comments

@Zalathar
Contributor

While testing some other changes, I stumbled across the fact that test_blacklisted_characters now takes a very long time to run.

This can be seen in https://travis-ci.org/HypothesisWorks/hypothesis/jobs/504749908, and seems to be linked to #1846 in some way.

@Zalathar added the tests/build/CI (about testing or deployment *of* Hypothesis) and performance (go faster! use less memory!) labels Mar 12, 2019
@Zalathar
Contributor Author

Sadly I haven't had a chance to track this down more precisely, or to figure out whether it's a real bug or a test bug.

Zac-HD added a commit that referenced this issue Mar 13, 2019
See issue #1864; but I'm simply disabling it for now due to the huge slowdown.
@Zalathar
Contributor Author

I've determined that the slowdown is in the assert_no_examples part of the test.

Looking through the --hypothesis-verbosity=debug -s output, it is doing more or less what it was doing before: it quickly finds all of the unique output values, and then spends the rest of its exploration budget generating increasingly elaborate rejection-sampler sequences.

The difference is that instead of finishing in an instant, this process becomes slower and slower (inconsistently) as the test continues to run.

@Zalathar
Contributor Author

I traced the problem to generate_novel_prefix in DataTree:

    def generate_novel_prefix(self, random):
        """Generate a short random string that (after rewriting) is not
        a prefix of any buffer previously added to the tree.

        This is logically equivalent to generating the test case uniformly
        at random and returning the first point at which we hit unknown
        territory, but with an optimisation for the only common case where
        that would be inefficient.
        """
        assert not self.is_exhausted
        initial = self.find_necessary_prefix_for_novelty()
        while True:

            def draw_bytes(data, n):
                i = data.index
                if i < len(initial):
                    return initial[i : i + n]
                else:
                    return uniform(random, n)

            data = ConjectureData(draw_bytes=draw_bytes, max_length=float("inf"))
            try:
                self.simulate_test_function(data)
            except PreviouslyUnseenBehaviour:
                return hbytes(data.buffer)

As the test continues to run, the while True loop takes more and more iterations to complete on average. It often requires hundreds or thousands of retries before it can find a novel prefix by chance.

The underlying cause is that this novelty-generator has no way to detect and avoid exhausted parts of the tree. So when novel prefixes are rare, it spins over and over until it gets lucky enough to stumble upon one.
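To see why the spinning gets so bad, here is a minimal standalone model of the while True loop above (not Hypothesis code): if a fraction p of uniformly drawn prefixes is still novel, the number of iterations needed is geometrically distributed with mean 1/p, so retries explode as the tree fills up and p shrinks toward zero.

```python
import random

def mean_retries(p, trials=20_000, rng=random.Random(0)):
    """Empirically estimate the mean number of loop iterations needed
    before a uniform draw hits a novel prefix, where each independent
    draw is novel with probability p (a geometric distribution)."""
    total = 0
    for _ in range(trials):
        while True:
            total += 1
            if rng.random() < p:
                break
    return total / trials
```

For example, mean_retries(0.5) comes out close to the theoretical mean of 2, and as p falls to 0.001 the expected cost rises to roughly 1000 iterations per novel prefix, which matches the "hundreds or thousands of retries" observed.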

@Zalathar
Contributor Author

This effect isn't specific to characters. I can reproduce it with this:

def test_integers():
    assert_no_examples(st.integers(0, 5), lambda x: False)

Range sizes that are (2 ** n) - 2 are the most severely affected.
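One plausible way to see why sizes just below a power of two hurt (an illustrative model, not Hypothesis's exact drawing scheme): suppose a value in [0, size) is drawn by rejection-sampling n = ceil(log2(size)) bits. Once all size values have been seen, novelty can only come from a longer-than-ever chain of rejected draws, and each individual rejection has probability (2**n - size) / 2**n, which is tiny when size is close to 2**n.

```python
def rejection_probability(size):
    """Probability that a single n-bit draw is rejected, assuming a
    value in [0, size) is produced by rejection-sampling the smallest
    number of bits that can cover the range (illustrative model only)."""
    n = max(1, (size - 1).bit_length())
    return (2 ** n - size) / 2 ** n
```

For st.integers(0, 5) there are 6 values, i.e. size 2**3 - 2, so each rejection happens with probability only 2/8 = 0.25, and chaining several rare rejections together to reach unseen territory becomes correspondingly unlikely.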

@DRMacIver
Member

The underlying cause is that this novelty-generator has no way to detect and avoid exhausted parts of the tree. So when novel prefixes are rare, it spins over and over until it gets lucky enough to stumble upon one.

This isn't quite true. Note that it detects exhausted parts of the tree up until the point where the first possible branch occurs. This means that when there is a long forced prefix it finds that without spinning. I had (apparently mistakenly) assumed that that would be enough, and that cases where there were multiple possible branches to take but we still got high rejection rates would be relatively uncommon.

I think a fix that is probably sufficient for this case, and is equivalent in result to the current behaviour, is to always have the first non-forced block chosen uniformly at random from the set of available possibilities. Maybe something more sophisticated is needed in the general case.
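A hypothetical sketch of that idea (the names and node shape here are assumptions for illustration, not Hypothesis's actual DataTree API): instead of drawing the first non-forced block blindly and retrying on collision, pick it uniformly from the child branches that are not yet exhausted, so the loop cannot keep landing in dead subtrees.

```python
import random

def choose_first_branch(children, rng=random.Random()):
    """children: mapping from block values to node-like objects exposing
    an `is_exhausted` flag. Returns a value whose subtree may still
    contain novelty, chosen uniformly at random among the live ones."""
    live = [value for value, node in children.items() if not node.is_exhausted]
    if not live:
        raise ValueError("all branches exhausted; caller should not descend here")
    return rng.choice(live)
```

Under this scheme the expected number of draws at the first branch point is exactly one, regardless of how many siblings are already exhausted, which is what removes the geometric blowup for this case.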

@Zac-HD removed the tests/build/CI (about testing or deployment *of* Hypothesis) label Apr 19, 2019
@Zac-HD
Member

Zac-HD commented Apr 19, 2019

@pytest.mark.skip was added in 354258d, so this isn't killing our CI, but the underlying problem is still there.
