Skip to content

Commit

Permalink
Merge #465
Browse files Browse the repository at this point in the history
465: Add into_grouping_map for efficient group-and-fold operations r=jswrenn a=SkiFire13

Adds two functions on the `Itertools` trait, `into_grouping_map` and `into_grouping_map_by`. 

`into_grouping_map` expects an iter of `(K, V)` where the `K` will be used as key and `V` as value. 
`into_grouping_map_by` expect only an iter of `V` as values, the keys will be calculated using the provided functions. 

Both of them return a `GroupingMap`, which is just a wrapper on an iterator. Since it introduces a lot of related methods I thought it would be better to separate them from the `Itertools` trait. This also prevents duplicating every method for the `_by` version.

All of these functions have in common the fact they perform efficient group-and-fold operations without allocating temporary vecs like you would normally do if you used `into_group_map` + `into_iter` + `map` + `collect::<HashMap<_, _>>()`.

Here's the possible issues I can see, I would like to hear some feedback before trying to fix any of them:
- name: I initially though of `grouping_by` which is the java/kotlin equivalent, but later changed it to `into_grouping_map` to match the already existing `into_group_map` and to differentiate from `group_by`;
- `fold_first`: the equivalent function in the `Itertools` trait is `fold1` but there's an unstable stdlib function that does the same thing and is called `fold_first`. I decided to be consistent with the stdlib;
- `minmax` return type is the already existing `MinMaxResult`, but the `NoElements` variant is never returned. I didn't want to duplicate that struct thinking it could cause confusion (and if I did what name could I have chosen?);
- `sum` and `product`: They dont' use the `Sum` and `Product` traits but instead require `V: Add<V, Output=V>` and `V: Mul<V, Output=V>`. They're pretty much a wrapper around `fold_first`. I don't really know if they're meaningful or if I should just remove them;
- almost every closure takes as one of the parameters a reference to the key. It bloats them a bit, but could be useful if computing the key is a relatively expensive operation.
- no `scan` function. Even though it is an iterator adapter I could sort of "collect" it into a `Vec` (more like extend) but I don't really see an use for this;
- ~~no integration tests for the `_by` and `_by_key` versions of `min`, `max` and `minmax`. To be fair I was a bit lazy, but I also couldn't find any integration test for the normal `minmax_by` and `minmax_by_key` so I though it was fine;~~ added;
- no benchmark (I don't think it is required but probably would be cool to compare this with `into_group_map`;
- I didn't want to touch the rest of the library, but I guess we could implement `into_group_map` in terms of `into_grouping_map`?

Related issues: #457, #309
Related PR: #406 (in particular relates to the `into_group_map_by_fold` function that was proposed but later removed)

Co-authored-by: SkiFire13 <skifire13.1@gmail.com>
  • Loading branch information
bors[bot] and SkiFire13 committed Dec 20, 2020
2 parents 5ad3e1a + bb5f624 commit 8d27ab5
Show file tree
Hide file tree
Showing 3 changed files with 912 additions and 2 deletions.

0 comments on commit 8d27ab5

Please sign in to comment.