Explain how Length.run_step works #1608

Zalathar · 2018-09-29T12:04:42Z

This PR (hopefully) makes it easier to understand how Length.run_step works.

First it replaces j and i + 1 with more meaningful variable names, based on my interpretation of what's happening.

Then it adds an overall description of the shrinking process, step-by-step comments, and a diagram to put each variable and slice into context.

This is a stand-alone change that doesn't fix or change any behaviour, so there's no particular hurry to get it merged.

I recommend looking the variable-renaming commit on its own, because the combined diff makes it a bit harder to see that the new version is equivalent.

DRMacIver · 2018-10-03T15:03:55Z

I definitely like making this clearer! I've got two small comments:

The name skipped isn't actually very obvious to me. It's certainly not one I would have chosen. Can you elaborate a little on why you chose that one?
I think the explanation misses a crucial detail: Many of these elements past the current index could be ones that we can now delete, because running this pass once doesn't necessarily reach a fixed point. For example imagine a case where you started from [0, 1, 2] and the the interesting sequences were that, [0, 2], and [0]. When the first pass deletes the 1 the 2 is now deletable despite being to the right of the current index

Zalathar · 2018-10-03T23:00:25Z

The idea is that we are repeatedly trying to delete find_integer items from the end of a list.

But before doing so, we first “skip over” all of the rightmost items that we have previously failed to delete during this pass, delete items from the end of whatever remains, and then append all of the “skipped” items after the deleted items before considering the new candidate list.

Reusing your example, we would eventually skip over the stubborn [2] and try to delete items from the end of [0, 1]. After deleting the 1 and appending the skipped 2, our next candidate sequence is [0, 2], which we submit for consideration.

Eventually we find that 0 is also stubborn, so our skipped region contains 0 and 2. That’s now the entire sequence, so the pass ends.

Zalathar · 2018-10-03T23:05:19Z

(I’m not 100% on the name, because it doesn’t quite capture that we tried to delete each stubborn element before giving up and deciding to skip it for the remainder of the pass.)

Zalathar · 2018-10-04T03:45:50Z

Another way of looking at it is that in the original code, i is a cursor that moves backwards through the sequence, and j is an offset from the end of the sequence that helps keep the cursor in the right place after elements are deleted.

In the renamed code, j becomes skipped, which is the length of the subsequence to the right of the cursor. And i+1 becomes remaining, which is the length of the subsequence to the left of the cursor (plus the cursor itself).

These two lengths sum to the total length of the current sequence. When we take the slice start[:remaining - k], it should contain all of the elements eligible for deletion, minus the k elements actually being deleted. And start[remaining:] should contain all the elements that weren't eligible for deletion, i.e. the skipped elements that we've previously failed to delete.

(Ideally that latter slice would be start[-skipped:], but that doesn't work when skipped == 0, which I learned the hard way.)

Zalathar · 2018-10-09T08:52:21Z

hypothesis-python/src/hypothesis/internal/conjecture/shrinking/length.py

                )
            )
-            j += 1
+            # Skip over the element we couldn't delete, and continue.
+            # (If we deleted everything, this will put j > len(self.current),


Whoops, this reference to j no longer makes sense now that I've renamed the variables.

Zalathar · 2018-10-09T10:33:12Z

I tightened up some of the text, and moved the diagram into a “footnote” at the bottom of the file, so there's less of a wall-of-text effect when reading through the implementation from top to bottom.

I also changed remaining to candidates, indicating that it covers the elements that are still eligible candidates for deletion.

Zac-HD

Fantastic explanation - LGTM.

Zalathar force-pushed the length-shrinker branch from 6eba5e3 to 6f37801 Compare September 29, 2018 12:07

DRMacIver approved these changes Oct 9, 2018

View reviewed changes

Zalathar commented Oct 9, 2018

View reviewed changes

This comment has been minimized.

Sign in to view

Zalathar added 4 commits October 9, 2018 20:05

Test that the Length shrinker deletes all easily-deletable elements

ea63609

Use more meaningful variable names in Length.run_step

ca1de28

Explain how Length.run_step works

7aa3881

Add RELEASE.rst

acad9ca

Zalathar force-pushed the length-shrinker branch from 6f37801 to acad9ca Compare October 9, 2018 10:21

Zac-HD approved these changes Oct 10, 2018

View reviewed changes

Zac-HD merged commit 1652781 into HypothesisWorks:master Oct 10, 2018

Zalathar deleted the length-shrinker branch October 10, 2018 10:18

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Explain how Length.run_step works #1608

Explain how Length.run_step works #1608

Zalathar commented Sep 29, 2018

DRMacIver commented Oct 3, 2018

Zalathar commented Oct 3, 2018

Zalathar commented Oct 3, 2018

Zalathar commented Oct 4, 2018 •

edited

Zalathar Oct 9, 2018

This comment has been minimized.

Zalathar commented Oct 9, 2018

Zac-HD left a comment

Explain how Length.run_step works #1608

Explain how Length.run_step works #1608

Conversation

Zalathar commented Sep 29, 2018

DRMacIver commented Oct 3, 2018

Zalathar commented Oct 3, 2018

Zalathar commented Oct 3, 2018

Zalathar commented Oct 4, 2018 • edited

Zalathar Oct 9, 2018

Choose a reason for hiding this comment

This comment has been minimized.

Zalathar commented Oct 9, 2018

Zac-HD left a comment

Choose a reason for hiding this comment

Zalathar commented Oct 4, 2018 •

edited