Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explain how Length.run_step works #1608

Merged
merged 4 commits into from Oct 10, 2018

Conversation

Zalathar
Copy link
Contributor

This PR (hopefully) makes it easier to understand how Length.run_step works.

First it replaces j and i + 1 with more meaningful variable names, based on my interpretation of what's happening.

Then it adds an overall description of the shrinking process, step-by-step comments, and a diagram to put each variable and slice into context.


This is a stand-alone change that doesn't fix or change any behaviour, so there's no particular hurry to get it merged.

I recommend looking the variable-renaming commit on its own, because the combined diff makes it a bit harder to see that the new version is equivalent.

@DRMacIver
Copy link
Member

I definitely like making this clearer! I've got two small comments:

  • The name skipped isn't actually very obvious to me. It's certainly not one I would have chosen. Can you elaborate a little on why you chose that one?
  • I think the explanation misses a crucial detail: Many of these elements past the current index could be ones that we can now delete, because running this pass once doesn't necessarily reach a fixed point. For example imagine a case where you started from [0, 1, 2] and the the interesting sequences were that, [0, 2], and [0]. When the first pass deletes the 1 the 2 is now deletable despite being to the right of the current index

@Zalathar
Copy link
Contributor Author

Zalathar commented Oct 3, 2018

The idea is that we are repeatedly trying to delete find_integer items from the end of a list.

But before doing so, we first “skip over” all of the rightmost items that we have previously failed to delete during this pass, delete items from the end of whatever remains, and then append all of the “skipped” items after the deleted items before considering the new candidate list.

Reusing your example, we would eventually skip over the stubborn [2] and try to delete items from the end of [0, 1]. After deleting the 1 and appending the skipped 2, our next candidate sequence is [0, 2], which we submit for consideration.

Eventually we find that 0 is also stubborn, so our skipped region contains 0 and 2. That’s now the entire sequence, so the pass ends.

@Zalathar
Copy link
Contributor Author

Zalathar commented Oct 3, 2018

(I’m not 100% on the name, because it doesn’t quite capture that we tried to delete each stubborn element before giving up and deciding to skip it for the remainder of the pass.)

@Zalathar
Copy link
Contributor Author

Zalathar commented Oct 4, 2018

Another way of looking at it is that in the original code, i is a cursor that moves backwards through the sequence, and j is an offset from the end of the sequence that helps keep the cursor in the right place after elements are deleted.

In the renamed code, j becomes skipped, which is the length of the subsequence to the right of the cursor. And i+1 becomes remaining, which is the length of the subsequence to the left of the cursor (plus the cursor itself).

These two lengths sum to the total length of the current sequence. When we take the slice start[:remaining - k], it should contain all of the elements eligible for deletion, minus the k elements actually being deleted. And start[remaining:] should contain all the elements that weren't eligible for deletion, i.e. the skipped elements that we've previously failed to delete.

(Ideally that latter slice would be start[-skipped:], but that doesn't work when skipped == 0, which I learned the hard way.)

)
)
j += 1
# Skip over the element we couldn't delete, and continue.
# (If we deleted everything, this will put j > len(self.current),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, this reference to j no longer makes sense now that I've renamed the variables.

@Zalathar

This comment has been minimized.

@Zalathar
Copy link
Contributor Author

Zalathar commented Oct 9, 2018

I tightened up some of the text, and moved the diagram into a “footnote” at the bottom of the file, so there's less of a wall-of-text effect when reading through the implementation from top to bottom.

I also changed remaining to candidates, indicating that it covers the elements that are still eligible candidates for deletion.

Copy link
Member

@Zac-HD Zac-HD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic explanation - LGTM.

@Zac-HD Zac-HD merged commit 1652781 into HypothesisWorks:master Oct 10, 2018
@Zalathar Zalathar deleted the length-shrinker branch October 10, 2018 10:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants