Add next_array and collect_array #560
Note that this has already been the case since 83c0f04, because that commit uses saturating_pow, which was only stabilized in 1.34.
This also allows us to automatically detect support for min const generics.
A possible enhancement might be to return
trait FromArray<T, const N: usize> {
    fn from_array(array: [T; N]) -> Self;
}
impl<T, const N: usize> FromArray<T, N> for [T; N] { /* .. */ }
impl<T, const N: usize> FromArray<Option<T>, N> for Option<[T; N]> { /* .. */ }
impl<T, E, const N: usize> FromArray<Result<T, E>, N> for Result<[T; N], E> { /* .. */ }
In fact, I think this is highly useful because it allows things like
let ints = line.split_whitespace().map(|n| n.parse());
if let Ok([x, y, z]) = ints.collect_array() {
    ...
}
This would be completely in line with
So I have a working implementation of the above idea here: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=9dba690b0dfc362971635e21647a4c19. It makes this compile:
fn main() {
    let line = "32 -12 24";
    let nums = line.split_whitespace().map(|n| n.parse::<i32>());
    if let Some(Ok([x, y, z])) = nums.collect_array() {
        println!("x: {} y: {} z: {}", x, y, z);
    }
}
It would change the interface to:
trait ArrayCollectible<T>: Sized {
    fn array_from_iter<I: IntoIterator<Item = T>>(iterable: I) -> Option<Self>;
}
trait Itertools: Iterator {
    fn collect_array<A>(self) -> Option<A>
    where
        Self: Sized,
        A: ArrayCollectible<Self::Item>;
}
where
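For readers who cannot open the playground link, here is a minimal sketch of how the ArrayCollectible interface quoted above might be implemented. It is not the playground code: it is a Vec-backed stand-in that allocates, chosen only to keep the example free of unsafe. The free function collect_array and the helper parse_three are hypothetical names introduced for illustration.

```rust
use std::convert::TryInto;
use std::num::ParseIntError;

/// Mirrors the trait quoted in the comment above.
trait ArrayCollectible<T>: Sized {
    fn array_from_iter<I: IntoIterator<Item = T>>(iterable: I) -> Option<Self>;
}

impl<T, const N: usize> ArrayCollectible<T> for [T; N] {
    fn array_from_iter<I: IntoIterator<Item = T>>(iterable: I) -> Option<Self> {
        // Assumption: collect into a Vec first; the conversion fails
        // unless the iterator yielded exactly N items.
        let items: Vec<T> = iterable.into_iter().collect();
        items.try_into().ok()
    }
}

impl<T, E, const N: usize> ArrayCollectible<Result<T, E>> for Result<[T; N], E> {
    fn array_from_iter<I: IntoIterator<Item = Result<T, E>>>(iterable: I) -> Option<Self> {
        // The first Err short-circuits and is reported as Some(Err(..)).
        match iterable.into_iter().collect::<Result<Vec<T>, E>>() {
            Ok(items) => {
                let arr: Option<[T; N]> = items.try_into().ok();
                arr.map(Ok)
            }
            Err(e) => Some(Err(e)),
        }
    }
}

/// Hypothetical free-function form of the proposed `Itertools::collect_array`.
fn collect_array<I, A>(iter: I) -> Option<A>
where
    I: Iterator,
    A: ArrayCollectible<I::Item>,
{
    A::array_from_iter(iter)
}

/// Hypothetical helper: parse exactly three whitespace-separated integers.
fn parse_three(line: &str) -> Option<Result<[i32; 3], ParseIntError>> {
    collect_array(line.split_whitespace().map(|n| n.parse::<i32>()))
}
```

With this shape, the outer Option reports a wrong element count and the inner Result reports a parse failure, which is what makes the `Some(Ok([x, y, z]))` pattern from the comment work.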
Hi there! Thanks for this. I particularly like that you thought about a way of enabling const-generic stuff without raising the minimum required Rust version (even if I would imagine something else, as I have an aversion to depending on other crates too much).
There has been some discussion recently about basically supporting not only tuples, but also arrays. I just want to make sure that we do not lose input from these discussions when actually settling on your solution:
- implement arrays and next_array #549
- Array combinations #546
- Const generics iterator candidates #547
On top of that, I think there are some changes in there that are not directly related to this issue. If you'd like to have them merged, could you possibly factor them out into separate PRs/commits?
fn main() {
    let is_nightly = version_check::is_feature_flaggable() == Some(true);
    let is_at_least_1_34 = version_check::is_min_version("1.34.0").unwrap_or(false);
    let is_at_least_1_51 = version_check::is_min_version("1.51.0").unwrap_or(false);

    if !is_at_least_1_34 && !is_nightly {
        println!("cargo:warning=itertools requires rustc => 1.34.0");
    }

    if is_at_least_1_51 || is_nightly {
        println!("cargo:rustc-cfg=has_min_const_generics");
    }
}
Usually, I like the idea of having everything automated, but I am not sure if we should go with a build.rs and an additional dependency. My first idea was to use a feature flag (that would probably be off by default) that the user can enable if desired.
My first idea was to use a feature flag (that would probably be off by default) that the user can enable if desired.
I think feature flags to enable things that are already available in the latest stable Rust and have no further compile-time or dependency drawbacks make no sense.
- You could end up with dozens of feature flags for minor features, or hold back progress because a feature that would otherwise be merged would now be too minor to accept due to introducing a new feature flag.
- You end up with useless features that you can't remove without a breaking change as your MSRV goes up.
- It's un-ergonomic as it adds an extra step for the user.
These drawbacks are unacceptable in my opinion, given that this could be done completely and correctly automatically. If you are hesitant regarding the version-check dependency, I'd just like to note that it's tiny, has no further downstream dependencies, and is already relied on by crates such as time, nom, rocket, and fd-find, among others.
@Philippe-Cholet, @phimuemue, perhaps it's time we updated our MSRV to 1.51 (which is two years old at this point).
While I don't mind the version-detection approach, I would like us to adopt it in tandem with changes to our CI that ensure we are testing on all detected versions. I'd also like to perhaps avoid taking the dependency on rust_version. This would all be a substantial change, and outside the scope of this PR.
My vote is that we increase our MSRV. We can always decrease it in the future.
@jswrenn I sure don't mind increasing the MSRV, but I would suggest we release 0.13.0 first and then increase the MSRV in 0.14.0 so that the build script is not required (in which case orlp will have enough time to work on this).
- //! This version of itertools requires Rust 1.32 or later.
+ //! This version of itertools requires Rust 1.34 or later.
If your assessment is correct, we could possibly increment the minimum rust version in a separate commit?
Do you mean a pull request? It's already a separate commit.
- match self.next_tuple() {
-     elt @ Some(_) => match self.next() {
-         Some(_) => None,
-         None => elt,
-     },
-     _ => None
- }
+ self.next_tuple().filter(|_| self.next().is_none())
Is this really relevant to this PR? If not, could we separate it into another PR?
It was mostly to be consistent with the other implementation. As it's just a stylistic change I don't think it's worth a pull request by itself to be honest.
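To make the stylistic change under discussion concrete, here is a small sketch comparing the two forms outside of itertools. The helper next_pair is a hypothetical stand-in for next_tuple, specialised to pairs so the example only needs the standard library; both collect functions are illustrative names, not itertools API.

```rust
/// Hypothetical stand-in for `next_tuple`, limited to pairs.
fn next_pair<I: Iterator>(it: &mut I) -> Option<(I::Item, I::Item)> {
    let a = it.next()?;
    let b = it.next()?;
    Some((a, b))
}

/// The nested `match` form: succeed only if exactly two items exist.
fn collect_pair_match<I: Iterator>(mut it: I) -> Option<(I::Item, I::Item)> {
    match next_pair(&mut it) {
        elt @ Some(_) => match it.next() {
            Some(_) => None, // a third item means "too many"
            None => elt,
        },
        _ => None,
    }
}

/// The `Option::filter` form: keep the pair only if the iterator is exhausted.
fn collect_pair_filter<I: Iterator>(mut it: I) -> Option<(I::Item, I::Item)> {
    next_pair(&mut it).filter(|_| it.next().is_none())
}
```

Both functions pull at most one extra element after the pair, so they should behave the same on every input; the difference is purely one of readability.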
@@ -0,0 +1,80 @@
use core::mem::MaybeUninit;
I think there was some discussion about building arrays:
@phimuemue Any update on this?
I appreciate your effort, but unfortunately nothing substantial from my side: I changed my mind regarding
@phimuemue Just for posterity's sake,
@phimuemue Just checking in what the status is, I feel very strongly about the usefulness of
Note that if you want I'll also mention
This is a very useful feature. Today there was a thread on Reddit where the author basically asks if there's a crate that provides
@Expurple
I sometimes think about adding Another option I just saw: Crates can offer "nightly-only experimental API" (see https://docs.rs/arrayvec/latest/arrayvec/struct.ArrayVec.html#method.first_chunk for an example) - maybe this would help some users. I personally would lean towards
I'm definitely not opposed to the idea but the EDIT: EDIT: Well I have some. With (My idea would be that @scottmcm Small discussion about temporarily adding
For I can allocate some time to this next week.
@jswrenn Please don't forget that we are discussing this on a PR that already has a working implementation without adding dependencies...
fn drop(&mut self) {
    unsafe {
        // SAFETY: we only loop over the initialized portion.
        for el in &mut self.arr[..self.i] {
Shouldn't need a loop here -- it's generally better to drop-in-place a whole slice rather than items individually.
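As an illustration of the suggestion above, here is a minimal sketch of a Drop impl that drops the initialized prefix as one slice. PartialArray is a hypothetical simplification of the PR's ArrayBuilder (the field names, the checked try_push, and the use of std::array::from_fn are assumptions made for the example), so treat it as a sketch rather than the code under review.

```rust
use std::mem::MaybeUninit;
use std::ptr;

/// Hypothetical partially filled array.
/// Invariant: `len <= N` and `arr[..len]` is fully initialized.
struct PartialArray<T, const N: usize> {
    arr: [MaybeUninit<T>; N],
    len: usize,
}

impl<T, const N: usize> PartialArray<T, N> {
    fn new() -> Self {
        Self { arr: std::array::from_fn(|_| MaybeUninit::uninit()), len: 0 }
    }

    /// Checked push: hands the element back if the array is already full.
    fn try_push(&mut self, x: T) -> Result<(), T> {
        match self.arr.get_mut(self.len) {
            Some(slot) => {
                *slot = MaybeUninit::new(x);
                self.len += 1;
                Ok(())
            }
            None => Err(x),
        }
    }
}

impl<T, const N: usize> Drop for PartialArray<T, N> {
    fn drop(&mut self) {
        let init: *mut [MaybeUninit<T>] = &mut self.arr[..self.len];
        // SAFETY: by the struct invariant, every element of `arr[..len]` is
        // initialized. `MaybeUninit<T>` has the same size and alignment as
        // `T`, so the slice pointer can be reinterpreted as `*mut [T]`. The
        // value is being destroyed, so each element is dropped exactly once.
        unsafe { ptr::drop_in_place(init as *mut [T]) }
    }
}
```

Dropping the whole slice in one call also means that if one element's destructor panics, the remaining elements are still dropped, which a hand-written loop would not do.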
// SAFETY: the take(N) guarantees we never go out of bounds.
builder.push_unchecked(el);
I'm not sure this is sound -- there might be a way for me to override take (or one of the things it calls) in safe code such that this can return more than N things.
Maybe have it be something like
it.try_for_each(|x| builder.try_push(x));
with try_push returning an Option?
That's a nasty one, I think you're right.
There should still be a take in there though, when using try_for_each.
It's right on the edge of soundness. There's no easy demo that I can come up with -- if you try to override take you'll find that that doesn't actually work, for example, because you can't make something of the right type without unsafe.
So it's possible that it's actually sound today, but there's so many nuances to that argument that I think it's probably better to consider it unsound. For example, if Rust one day added a way to "call super" -- which seems like an entirely plausible feature -- then it'd immediately be obviously-unsound as someone could implement take as super.take(N+1).
There should still be a take in there though, when using try_for_each.
Oh, right, because otherwise you'll consume an extra element. Good catch.
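The control flow agreed on in this thread (a checked try_push driven by try_for_each, with take(N) kept in place) can be sketched without any unsafe code. SlotBuilder below is a hypothetical stand-in that stores Option<T> slots instead of MaybeUninit<T>, which costs extra space but lets the example focus on the iteration logic; next_array and collect_array are written as free functions for illustration only.

```rust
/// Hypothetical safe builder: `Option` slots replace `MaybeUninit`.
struct SlotBuilder<T, const N: usize> {
    slots: [Option<T>; N],
    len: usize,
}

impl<T, const N: usize> SlotBuilder<T, N> {
    fn new() -> Self {
        Self { slots: std::array::from_fn(|_| None), len: 0 }
    }

    /// Bounds-checked push: returns `None` once the builder is full.
    fn try_push(&mut self, x: T) -> Option<()> {
        let slot = self.slots.get_mut(self.len)?;
        *slot = Some(x);
        self.len += 1;
        Some(())
    }

    /// Yields the array only if all N slots were filled.
    fn take(self) -> Option<[T; N]> {
        if self.len == N {
            Some(self.slots.map(|s| s.expect("all slots filled when len == N")))
        } else {
            None
        }
    }
}

fn next_array<I: Iterator, const N: usize>(it: &mut I) -> Option<[I::Item; N]> {
    let mut builder = SlotBuilder::new();
    // `take(N)` stops `try_for_each` from pulling an (N+1)-th element out of
    // the iterator just to discover that the builder is full.
    it.by_ref().take(N).try_for_each(|x| builder.try_push(x))?;
    builder.take()
}

fn collect_array<I: Iterator, const N: usize>(mut it: I) -> Option<[I::Item; N]> {
    // Succeeds only if the iterator yields exactly N elements.
    next_array(&mut it).filter(|_| it.next().is_none())
}
```

Because try_push checks the bound itself, correctness no longer depends on take behaving as expected; take only controls how many elements are consumed.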
Seconding @scottmcm's comment: For our MVP, does push_unchecked really need to be _unchecked?
@orlp, thanks, I had forgotten that this was a PR and not an issue when I made my reply. Still, we're talking about adding some extremely subtle unsafe code to Itertools. I'd like us to take extreme care to avoid accidentally introducing UB. A PR adding
If you can update this PR to do those things, I can see a path forward to merging it.
Thanks for this PR! I like the ArrayBuilder abstraction quite a bit. As I mentioned, this will need additional documentation and testing before it can be merged. See the recent safety comments in my other project, zerocopy, for a sense of the rigor I'd like these safety comments to have.
/// Helper struct to build up an array element by element.
struct ArrayBuilder<T, const N: usize> {
    arr: [MaybeUninit<T>; N],
    i: usize
What is the safety invariant of i with relation to arr?
        Self { arr: maybe_uninit::uninit_array(), i: 0 }
    }

    pub unsafe fn push_unchecked(&mut self, x: T) {
Needs a safety comment, in the format of:
- pub unsafe fn push_unchecked(&mut self, x: T) {
+ /// Does XYZ.
+ ///
+ /// # Safety
+ ///
+ /// Callers promise that blah blah blah.
+ ///
+ /// # Panics
+ ///
+ /// This method does (or does not) panic.
+ pub unsafe fn push_unchecked(&mut self, x: T) {
    pub unsafe fn push_unchecked(&mut self, x: T) {
        debug_assert!(self.i < N);
        *self.arr.get_unchecked_mut(self.i) = MaybeUninit::new(x);
Needs a safety comment in the form:
- *self.arr.get_unchecked_mut(self.i) = MaybeUninit::new(x);
+ // SAFETY: By contract on the caller, the safety condition on `get_unchecked_mut` that BLAH BLAH BLAH is satisfied.
+ *self.arr.get_unchecked_mut(self.i) = MaybeUninit::new(x);
unsafe {
    // SAFETY: prevent double drop.
    self.i = 0;
    // SAFETY: [MaybeUninit<T>; N] and [T; N] have the same layout.
Could this safety comment cite the standard library documentation? While it's true that these two types have the same size and alignment, it's not true that they have the same bit validity.
pub fn take(mut self) -> Option<[T; N]> {
    if self.i == N {
        unsafe {
Could you scope this unsafe { ... } block down to just the ptr::read?
self.i = 0;
// SAFETY: [MaybeUninit<T>; N] and [T; N] have the same layout.
let init_arr_ptr = &self.arr as *const _ as *const [T; N];
Some(core::ptr::read(init_arr_ptr))
This needs a SAFETY comment citing what the preconditions of ptr::read are, and proving why they are satisfied.
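To show roughly what the review comments on take are asking for, here is a sketch of the conversion as a standalone function, with the unsafe block scoped to the ptr::read and a SAFETY comment that walks through its preconditions. The function name array_assume_init and its by-value signature are assumptions for the example; the PR keeps this logic inside ArrayBuilder::take.

```rust
use std::mem::MaybeUninit;
use std::ptr;

/// Reinterprets a fully initialized `[MaybeUninit<T>; N]` as `[T; N]`.
///
/// # Safety
///
/// The caller promises that every element of `arr` is initialized.
unsafe fn array_assume_init<T, const N: usize>(arr: [MaybeUninit<T>; N]) -> [T; N] {
    let src = &arr as *const [MaybeUninit<T>; N] as *const [T; N];
    // SAFETY: `ptr::read` requires `src` to be valid for reads, properly
    // aligned, and to point to a properly initialized `[T; N]`.
    // - `src` is derived from a reference to the live local `arr`, so it is
    //   valid for reads and aligned for `[MaybeUninit<T>; N]`. The standard
    //   library documents that `MaybeUninit<T>` has the same size and
    //   alignment as `T`, so the same holds for `[T; N]`.
    // - Every element is initialized by the caller's promise above.
    // `arr` is not read again, and dropping a `[MaybeUninit<T>; N]` never
    // drops a `T`, so no element is dropped twice.
    unsafe { ptr::read(src) }
}
```

Taking the array by value sidesteps the `self.i = 0` bookkeeping in the PR, since there is no Drop impl left to run on the source; inside a builder with its own Drop impl that reset would still be needed.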
    unsafe { MaybeUninit::<[MaybeUninit<T>; N]>::uninit().assume_init() }
}

pub unsafe fn assume_init_drop<T>(u: &mut MaybeUninit<T>) {
You don't need to replicate the entire stdlib doc comment here, but could you document the safety preconditions of assume_init_drop?
@jswrenn I will be busy the upcoming week but I'm willing to bring this up to standards after that. If before then you could decide on whether or not to bump the MSRV to 1.51 I could include that in the rewrite.
With this pull request I add two new functions to the Itertools trait: next_array and collect_array.
These behave exactly like next_tuple and collect_tuple; however, they return arrays instead. Since these functions require min_const_generics, I added a tiny build script that checks whether Rust's version is 1.51 or higher and, if so, sets the has_min_const_generics config variable. This means that Itertools does not suddenly require 1.51 or higher; only these two functions do.
In order to facilitate this I did have to bump the minimum required Rust version to 1.34 from the (documented) 1.32, since Rust 1.32 and 1.33 have trouble parsing the file even if stuff is conditionally compiled. However, this should not result in any (new) breakage, because Itertools has actually required Rust 1.34 for 9+ months already, since 83c0f04 uses saturating_pow, which wasn't stabilized until 1.34.
As for rationale, I think these functions are useful, especially for pattern matching and parsing. I don't think there's a high probability they get added to the standard library either, so that's why I directly make a pull request here. When/if TryFromIterator stabilizes we can simplify the implementation, but even then I believe these functions remain a good addition, similar to how collect_vec is nice to have despite .collect::<Vec<_>> existing.