Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More generic maps in GroupingMap (v2) #906

Draft
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

Philippe-Cholet
Copy link
Member

@Philippe-Cholet Philippe-Cholet commented Mar 19, 2024

This an alternative to #901 that I learnt from, it would close #901, close #588 and eventually solve most generic-container Generic vector/hasher/map issues.

  • I reuse Map polished earlier.
  • Transform GroupingMap[By] into GroupingGenericMap[By]
    This is a huge but quite straightforward transformation:
    • GroupingMap[By] becomes GroupingGenericMap[By] with a generic type M
    • K: Hash becomes M: Map<Key = K>
    • HashMap<K, ...> becomes M with a M: Map<Value = ...> bound
    • Remove Hash and HashMap
    • use_std becomes use_alloc
  • Reintroduce GroupingMap[By] with one more generic parameter
    This is a breaking change! However, this is mostly inferred.
    Note that it breaks (only) laziness tests because the GroupingMap object is not used (I could write any type but u8 is one of the possible returned types). If it was used then it would have been inferred.
    I copied/pasted the code of old into_grouping_map[_by] methods, added the generic R and updated each function body with _in(.., HashMap::new()).
    The old types are aliases of the new ones.

The main interest of this PR over #901 is that there is not _in versions for each GroupingMap method.


More than the additional parameter, the following code (weird but currently working) would not compile anymore.

let g = (0..10).into_grouping_map_by(|n| n % 2); // the map would now lives here so the values type is fixed earlier.
if true {
    println!("{:?}", g.collect::<Vec<_>>()); // HashMap<i32, Vec<i32>>
} else {
    println!("{:?}", g.sum());               // HashMap<i32, i32>
}

However, it's easily solvable here, .into_grouping_map_by(|n| n % 2) should be moved to both branches for the code to work as before.

Now, in other cases, it might be not that easily solvable! Is it a too big breaking change?

Note: `BTreeMap::remove<Q>` and `BTreeMap::remove<Q>` both accept a more general `Q` rather than just `Self::Key`.
If ever needed, `Map<Q>` would be possible (with the same bound on `Q` than `Self::Key`: `Ord` or `Eq + Hash`).

Implement `Map` for simple unordered `Vec<(_, _)>`. Note it uses `swap_remove`.
This is a huge but quite straightforward transformation:
- `GroupingMap[By]` becomes `GroupingGenericMap[By]` with a generic type `M`
- `K: Hash` becomes `M: Map<Key = K>`
- `HashMap<K, ...>` becomes `M` with a `M: Map<Value = ...>` bound
- Remove `Hash` and `HashMap`
- `use_std` becomes `use_alloc`
This is a breaking change! However, this is mostly inferred.
Note that it breaks (only) laziness tests because the `GroupingMap` object is not used (I could write any type but `u8` is one of the possible returned types). If it was used then it would have been inferred.

I copied/pasted the code of old `into_grouping_map[_by]` methods, added the generic `R` and updated each function body with `_in(.., HashMap::new())`.
The old types are aliases of the new ones.
@Philippe-Cholet Philippe-Cholet changed the title More generic maps in GroupingMap More generic maps in GroupingMap (v2) Mar 19, 2024
rust-itertools#901 (comment)
SkiFire13 had the idea this could be optimized for vectors.
Copy link

codecov bot commented Mar 21, 2024

Codecov Report

Attention: Patch coverage is 66.66667% with 44 lines in your changes are missing coverage. Please review.

Project coverage is 94.00%. Comparing base (6814180) to head (2203cb6).
Report is 41 commits behind head on master.

Files Patch % Lines
src/generic_containers.rs 33.33% 42 Missing ⚠️
src/grouping_map.rs 95.34% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #906      +/-   ##
==========================================
- Coverage   94.38%   94.00%   -0.39%     
==========================================
  Files          48       49       +1     
  Lines        6665     6905     +240     
==========================================
+ Hits         6291     6491     +200     
- Misses        374      414      +40     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Philippe-Cholet
Copy link
Member Author

To generalize the "quick tests", I would need to generalize into_group_map as well.
To avoid a very huge PR, I think that I should implement Map only for HashMap first then BTreeMap and only then Vec in following PRs.

@SkiFire13 I was able to make it work with a single additional generic parameter.

@phimuemue @jswrenn Do you think such breaking change is acceptable? I worry it might be too nasty for very few people.
I therefore searched on github and eventually found only one place this would break: they would be able to easily fix this by moving .into_grouping_map() into their two use cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking-change generic-container Generic vector/hasher/map
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Is there any way to make the into_grouping_map family of functions support custom hashers
1 participant