Tail and some others could work better if they used length hints (PEP 424) #510
Comments
Please, note, that this is not a PR, as |
Thanks for the suggestion. I'm open to the "if it's a Sequence, slice it" optimization. Can you make a PR for that? But because the "length hint" isn't necessarily true, I'd rather not rely on it. |
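To illustrate why a length hint may not be trustworthy, here is a minimal sketch using `operator.length_hint` (the function PEP 424 added). The `Overpromising` class is hypothetical and exists only for this illustration.

```python
from operator import length_hint


class Overpromising:
    """Hypothetical iterator whose hint overstates its real length."""

    def __init__(self):
        self._it = iter("abc")

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def __length_hint__(self):
        # Claims 10 items, but only 3 will ever be produced.
        return 10


# List iterators report how many items remain.
exact = length_hint(iter([1, 2, 3]))
# Generators provide no hint, so the default (0) comes back.
unknown = length_hint(x for x in range(5))
# The hint and the real item count disagree.
wrong = length_hint(Overpromising())
actual = len(list(Overpromising()))
```

A `tail` that skipped `hint - n` items based on the value in `wrong` would run past the end of the data, which is the kind of failure that makes relying on the hint risky.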
How about using |
Another approach is to wrap the deque call in Anyway, this can be a breaking change, as previously the data was effectively copied and the original iterator was exhausted before the next item was picked from the |
To be clear, the
|
    if isinstance(iterable, collections.abc.Sequence):
        return iter(iterable[-n:])

In this implementation, memory will be immediately allocated for all elements of the tail if we work with lists, strings or tuples. But maybe we're not going to use all the tail elements. For example, if then we plan to use

I'm inclined to do this:

    def tail(n, iterable):
        if isinstance(iterable, collections.abc.Sized):
            length = len(iterable)
            yield from islice(iterable, max(0, length - n), length)
        else:
            yield from deque(iterable, maxlen=n)

In this case, we will get the laziest version of |
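A self-contained sketch of the `Sized`-based variant proposed above, with the imports it needs. The name `lazy_tail` is an assumption made here to avoid clashing with the existing recipe.

```python
from collections import deque
from collections.abc import Sized
from itertools import islice


def lazy_tail(n, iterable):
    """Yield the last n items; nothing runs until the first next()."""
    if isinstance(iterable, Sized):
        length = len(iterable)
        # Skip everything before the tail instead of copying a slice.
        yield from islice(iterable, max(0, length - n), length)
    else:
        # Unknown length: fall back to buffering the last n items.
        yield from deque(iterable, maxlen=n)


from_list = list(lazy_tail(3, [1, 2, 3, 4, 5]))
from_gen = list(lazy_tail(3, (x for x in range(10))))
short = list(lazy_tail(3, "ab"))

# Because the body is deferred, a change made between creating the
# iterator and consuming it is visible in the result.
data = [1, 2, 3]
pending = lazy_tail(2, data)
data.append(4)
after_mutation = list(pending)
```

Two things are worth noting. First, `islice` still steps past the skipped items one at a time, so this avoids the copy but not the traversal. Second, the `after_mutation` case shows the behaviour change discussed in this thread: the deque-based recipe would have captured `[2, 3]` at call time.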
This is a good decision. I see people misimplement iterators with a
This is a great idea and will greatly improve performance for short tails of long sequences (which is a typical use case).
It is not clear to me that this is always (or even sometimes) better. The iterator protocol is split into two phases, iterator creation and iterator consumption, a setup phase and a take care of business phase. If the two steps are done together, creating the iterator and consuming it immediately, then lazy evaluation has no effect on the user experience. Code in the following form will see zero benefit:
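A hypothetical illustration of such create-and-consume code (a sketch, not the commenter's own snippet; `eager_tail` and `lazy_tail` are names assumed here):

```python
from collections import deque


def eager_tail(n, iterable):
    # All the work happens at call time.
    return iter(deque(iterable, maxlen=n))


def lazy_tail(n, iterable):
    # Generator: the work is deferred until the first next().
    yield from deque(iterable, maxlen=n)


data = range(1_000)

# Creation and consumption happen in one expression, so deferring
# the work changes nothing the caller can observe.
eager_result = list(eager_tail(3, data))
lazy_result = list(lazy_tail(3, data))
```

Both calls do the same total work and produce the same list; laziness only matters when something happens between the two phases.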
If the two steps are not done together, a setup phase and later execution phase, it is reasonable to want to shift as much work as possible to the setup phase so that the execution phase is maximally responsive for the user. This is the same reason that we precompile regexes in templating engines — we want the pattern search to run as quickly as possible when requested. Likewise, lazy evaluation precludes timely closing of the underlying resource:
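A minimal sketch of the resource-closing concern, assuming an in-memory `io.StringIO` as a stand-in for a real file. The helper names `eager_tail`, `lazy_tail`, and `read_tail` are hypothetical.

```python
import io
from collections import deque


def eager_tail(n, iterable):
    # Reads the stream immediately, while it is still open.
    return iter(deque(iterable, maxlen=n))


def lazy_tail(n, iterable):
    # Reads the stream only when the result is first consumed.
    yield from deque(iterable, maxlen=n)


def read_tail(tail_func):
    with io.StringIO("a\nb\nc\nd\n") as stream:
        it = tail_func(2, stream)
    # The stream is closed by the time the iterator is consumed.
    return list(it)


eager_lines = read_tail(eager_tail)

try:
    read_tail(lazy_tail)
    lazy_error = None
except ValueError as exc:
    # Iterating a closed stream raises ValueError.
    lazy_error = exc
```

With the eager version the last two lines are already buffered when the `with` block exits. The lazy version first touches the stream after it has been closed, so code written against the eager behaviour would start failing.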
So, this would be a breaking change. The mitigation would involve |
One other thought: The
|
I understand that the recipes were ported directly from the docs, but things like tail do not perform great in some situations.
Note that when deque is constructed, it consumes almost everything from the iterator, so creation of the tail iterator is not lazy.
A better possible implementation is:
This will make things like
really fast, things like
somewhat faster (but at least lazy).
And hopefully it will "automagically" improve in the future for some other cases like enumerate and map.