Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generalize the various chunk methods to produce any collection that implements Default + Extend #2498

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

khuey
Copy link
Contributor

@khuey khuey commented Sep 15, 2021

Fixes #2492

@khuey khuey requested a review from taiki-e as a code owner September 15, 2021 18:39
@qm3ster
Copy link

qm3ster commented Sep 20, 2021

If a _Set or _Map is used as the collection, duplicate keys overwrite each other, and the len-like of the actual data structure doesn't increase.
Now that not only Vec can be used, we should examine what we mean when we want to set a cap/capacity for the adapter.
Both "how many items to consume" and "how many items to yield" are valid options.
The current implementation does the former.
The latter is missing a trait afaik, but may be less surprising.
Or more surprising, such as causing a soft deadlock with chunks(N) by not doing something necessary for more elements to be produced until a chunk is yielded because there were duplicate keys. (Where each "poke" of the source produces exactly N items.) (Could be fixed by the user with .take(N).chunks(N), but at that point they might as well collect because it won't be reusable.) 🐴

@qm3ster
Copy link

qm3ster commented Sep 20, 2021

Also, since there is no with_capacity trait, should this be earmarked for when https://doc.rust-lang.org/std/iter/trait.Extend.html#method.extend_reserve is stabilized?

@khuey
Copy link
Contributor Author

khuey commented Sep 20, 2021

Only the former is implementable today. The latter would require a (stablized) trait for collection lengths. Since the caller already has to cope with undersized collections (at least for the final item with chunks, and potentially much more often for ready_chunks or try_chunks) I don't think that's a big problem.

I don't think there's any need to gate this on extend_reserve being stabilized because the implementation can be changed to use that some day with no externally visible changes.

@qm3ster
Copy link

qm3ster commented Sep 21, 2021

Regarding extend_reserve, that's what I meant.

Perhaps something like "Depending on collection type, collections of less than capacity items may be yielded, as long as capacity elements were consumed." (but better worded) with or without an example in the doc comment. Probably only chunks needs this.

@khuey
Copy link
Contributor Author

khuey commented Sep 21, 2021

Yes I think we would document that N items are consumed and if your collection's Extend impl can result in replacement rather than addition (like *Set/Map can) then the yielded collection will have fewer than N items.

If you do want collections of N items no matter what you could implement chunks yourself with scan. It's a bit yucky but it's doable. e.g.

my_stream.map(Some).chain(None).scan(HashMap::new(), |map, item| {
    match item {
        // End of the stream, yield whatever partial collection is left.
        None => Some(Some(mem::take(map))),
        Some(item) => {
            map.extend(item);
            if map.len() >= DESIRED_N {
                return Some(Some(mem::take(map)));
            }
            Some(None)
        }
     }
}).filter_map(|x| x)

@taiki-e taiki-e added the S-needs-decision Status: A decision on whether or not to do this is needed. label Jul 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking change S-needs-decision Status: A decision on whether or not to do this is needed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Generalize chunks/ready_chunks/try_chunks/etc to collect into T: Extend, not just Vec
3 participants