Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add next_array and collect_array #560

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
3 changes: 3 additions & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,9 @@ test = false
[dependencies]
either = { version = "1.0", default-features = false }

[build-dependencies]
version_check = "0.9"

[dev-dependencies]
rand = "0.7"
criterion = "=0" # TODO how could this work with our minimum supported Rust version?
Expand Down
13 changes: 13 additions & 0 deletions build.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
fn main() {
let is_nightly = version_check::is_feature_flaggable() == Some(true);
let is_at_least_1_34 = version_check::is_min_version("1.34.0").unwrap_or(false);
let is_at_least_1_51 = version_check::is_min_version("1.51.0").unwrap_or(false);

if !is_at_least_1_34 && !is_nightly {
println!("cargo:warning=itertools requires rustc => 1.34.0");
}

if is_at_least_1_51 || is_nightly {
println!("cargo:rustc-cfg=has_min_const_generics");
}
}
Comment on lines +1 to +13
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually, I like the idea of having everything automated, but I am not sure if we should go with a build.rs and an additional dependency. My first idea was to use a feature flag (that would probably be off by default) that the user can enable if desired.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My first idea was to use a feature flag (that would probably be off by default) that the user can enable if desired.

I think feature flags to enable things that are already available in the latest stable Rust and have no further compile time or dependency drawbacks makes no sense.

  1. You could end up with dozens of feature flags for minor features, or hold back progress because a feature that would otherwise be merged would now be too minor to accept due to introducing a new feature flag.
  2. You end up with useless features that you can't remove without a breaking change as your MSRV goes up.
  3. It's un-ergonomic as it adds an extra step for the user.

These drawbacks while it could be done completely and correctly automatically are unacceptable in my opinion. If you are hesitant regarding the version-check dependency, I'd just like to note that it's tiny, has no further downstream dependencies, and is already relied on by crates such as time, nom, rocket, fd-find among others.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Philippe-Cholet, @phimuemue, perhaps it's time we updated our MSRV to 1.51 (which is two years old at this point).

While I don't mind the version-detection approach, I would like us to adopt it in tandem with changes to our CI that ensure we are testing on all detected versions. I'd also like to perhaps avoid taking the dependency on rust_version. This would all be a substantial change, and outside the scope of this PR.

My vote is that we increase our MSRV. We can aways decrease it in the future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jswrenn I sure don't mind increasing the MSRV but I would suggest we release the 0.13.0 first, and then increase the MSRV in 0.14.0 to not require the build script (in which case orlp will have enough time to work on this).

65 changes: 57 additions & 8 deletions src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,7 @@
//!
//! ## Rust Version
//!
//! This version of itertools requires Rust 1.32 or later.
//! This version of itertools requires Rust 1.34 or later.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If your assessment is correct, we could possibly increment the minimum rust version in a separate commit?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean a pull request? It's already a separate commit.

#![doc(html_root_url="https://docs.rs/itertools/0.8/")]

#[cfg(not(feature = "use_std"))]
Expand Down Expand Up @@ -214,6 +214,8 @@ mod merge_join;
mod minmax;
#[cfg(feature = "use_alloc")]
mod multipeek_impl;
#[cfg(has_min_const_generics)]
mod next_array;
mod pad_tail;
#[cfg(feature = "use_alloc")]
mod peek_nth;
Expand Down Expand Up @@ -1679,6 +1681,58 @@ pub trait Itertools : Iterator {
}

// non-adaptor methods
/// Advances the iterator and returns the next items grouped in an array of
/// a specific size.
///
/// If there are enough elements to be grouped in an array, then the array
/// is returned inside `Some`, otherwise `None` is returned.
///
/// Requires Rust version 1.51 or later.
///
/// ```
/// use itertools::Itertools;
///
/// let mut iter = 1..5;
///
/// assert_eq!(Some([1, 2]), iter.next_array());
/// ```
#[cfg(has_min_const_generics)]
fn next_array<T, const N: usize>(&mut self) -> Option<[T; N]>
where
Self: Sized + Iterator<Item = T>,
{
next_array::next_array(self)
}


/// Collects all items from the iterator into an array of a specific size.
///
/// If the number of elements inside the iterator is **exactly** equal to
/// the array size, then the array is returned inside `Some`, otherwise
/// `None` is returned.
///
/// Requires Rust version 1.51 or later.
///
/// ```
/// use itertools::Itertools;
///
/// let iter = 1..3;
///
/// if let Some([x, y]) = iter.collect_array() {
/// assert_eq!([x, y], [1, 2])
/// } else {
/// panic!("Expected two elements")
/// }
/// ```
#[cfg(has_min_const_generics)]
fn collect_array<T, const N: usize>(mut self) -> Option<[T; N]>
where
Self: Sized + Iterator<Item = T>,
{
self.next_array().filter(|_| self.next().is_none())
}


/// Advances the iterator and returns the next items grouped in a tuple of
/// a specific size (up to 12).
///
Expand All @@ -1699,6 +1753,7 @@ pub trait Itertools : Iterator {
T::collect_from_iter_no_buf(self)
}


/// Collects all items from the iterator into a tuple of a specific size
/// (up to 12).
///
Expand All @@ -1721,13 +1776,7 @@ pub trait Itertools : Iterator {
where Self: Sized + Iterator<Item = T::Item>,
T: traits::HomogeneousTuple
{
match self.next_tuple() {
elt @ Some(_) => match self.next() {
Some(_) => None,
None => elt,
},
_ => None
}
self.next_tuple().filter(|_| self.next().is_none())
Comment on lines -1724 to +1779
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really relevant to this PR? If not, could we separate it into another PR?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was mostly to be consistent with the other implementation. As it's just a stylistic change I don't think it's worth a pull request by itself to be honest.

}


Expand Down
80 changes: 80 additions & 0 deletions src/next_array.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,80 @@
use core::mem::MaybeUninit;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


/// Helper struct to build up an array element by element.
struct ArrayBuilder<T, const N: usize> {
arr: [MaybeUninit<T>; N],
i: usize
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the safety invariant of i with relation to arr?

}

impl<T, const N: usize> ArrayBuilder<T, N> {
pub fn new() -> Self {
Self { arr: maybe_uninit::uninit_array(), i: 0 }
}

pub unsafe fn push_unchecked(&mut self, x: T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a safety comment, in the format of:

Suggested change
pub unsafe fn push_unchecked(&mut self, x: T) {
/// Does XYZ.
///
/// # Safety
///
/// Callers promises that blah blah blah.
///
/// # Panics
///
/// This method does (or does not) panic.
pub unsafe fn push_unchecked(&mut self, x: T) {

debug_assert!(self.i < N);
*self.arr.get_unchecked_mut(self.i) = MaybeUninit::new(x);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a safety comment in the form:

Suggested change
*self.arr.get_unchecked_mut(self.i) = MaybeUninit::new(x);
// SAFETY: By contract on the caller, the safety condition on `get_unchecked_mut` that BLAH BLAH BLAH is satisfied.
*self.arr.get_unchecked_mut(self.i) = MaybeUninit::new(x);

self.i += 1;
}

pub fn take(mut self) -> Option<[T; N]> {
if self.i == N {
unsafe {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you scope this unsafe { ... } block down to just the ptr::read?

// SAFETY: prevent double drop.
self.i = 0;
// SAFETY: [MaybeUninit<T>; N] and [T; N] have the same layout.
Copy link
Member

@jswrenn jswrenn Mar 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this safety comment cite the standard library documentation? While it's true that these two types have the same size and alignment, it's not true that they have the same bit validity.

let init_arr_ptr = &self.arr as *const _ as *const [T; N];
Some(core::ptr::read(init_arr_ptr))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a SAFETY comment citing what the preconditions of ptr::read are, and proving why they are satisfied.

}
} else {
None
}
}
}

impl<T, const N: usize> Drop for ArrayBuilder<T, N> {
fn drop(&mut self) {
unsafe {
// SAFETY: we only loop over the initialized portion.
for el in &mut self.arr[..self.i] {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't need a loop here -- it's generally better to drop-in-place a whole slice rather than items individually.

maybe_uninit::assume_init_drop(el)
}
}
}
}



/// Equivalent to `it.next_array()`.
pub fn next_array<I, T, const N: usize>(it: &mut I) -> Option<[T; N]>
where
I: Iterator<Item = T>,
{
let mut builder = ArrayBuilder::new();
for el in it.take(N) {
unsafe {
// SAFETY: the take(N) guarantees we never go out of bounds.
builder.push_unchecked(el);
Comment on lines +56 to +57
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is sound -- there might be a way for me to override take (or one of the things it calls) in safe code such that this can return more than N things.

Maybe have it be something like

it.try_for_each(|x| builder.try_push(x));

with try_push returning an Option?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a nasty one, I think you're right.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should still be a take in there though, when using try_for_each.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's right on the edge of soundness. There's no easy demo that I can come up with -- if you try to override take you'll find that that doesn't actually work, for example, because you can't make something of the right type without unsafe.

So it's possible that it's actually sound today, but there's so many nuances to that argument that I think it's probably better to consider it unsound. For example, if Rust one day added a way to "call super" -- which seems like an entirely plausible feature -- then it'd immediately be obviously-unsound as someone could implement take as super.take(N+1).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should still be a take in there though, when using try_for_each.

Oh, right, because otherwise you'll consume an extra element. Good catch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seconding @scottmcm's comment: For our MVP, does push_unchecked really need to be _unchecked?

}
}
builder.take()
}



/// Replacements for unstable core methods, copied from stdlib.
mod maybe_uninit {
use core::mem::MaybeUninit;

pub fn uninit_array<T, const N: usize>() -> [MaybeUninit<T>; N] {
// SAFETY: an uninitialized `[MaybeUninit<_>; N]` is valid.
unsafe { MaybeUninit::<[MaybeUninit<T>; N]>::uninit().assume_init() }
}

pub unsafe fn assume_init_drop<T>(u: &mut MaybeUninit<T>) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to replicate the entire stdlib doc comment here, but could you document the safety preconditions of assume_init_drop?

// SAFETY: the caller must guarantee that `self` is initialized and
// satisfies all invariants of `T`.
// Dropping the value in place is safe if that is the case.
core::ptr::drop_in_place(u.as_mut_ptr())
}
}
26 changes: 26 additions & 0 deletions tests/test_core.rs
Original file line number Diff line number Diff line change
Expand Up @@ -301,3 +301,29 @@ fn product1() {
assert_eq!(v[1..3].iter().cloned().product1::<i32>(), Some(2));
assert_eq!(v[1..5].iter().cloned().product1::<i32>(), Some(24));
}


#[test]
fn next_array() {
let v = [1, 2, 3, 4, 5];
let mut iter = v.iter();
assert_eq!(iter.next_array(), Some([]));
assert_eq!(iter.next_array().map(|[&x, &y]| [x, y]), Some([1, 2]));
assert_eq!(iter.next_array().map(|[&x, &y]| [x, y]), Some([3, 4]));
assert_eq!(iter.next_array::<_, 2>(), None);
}

#[test]
fn collect_array() {
let v = [1, 2];
let iter = v.iter().cloned();
assert_eq!(iter.collect_array(), Some([1, 2]));

let v = [1];
let iter = v.iter().cloned();
assert_eq!(iter.collect_array::<_, 2>(), None);

let v = [1, 2, 3];
let iter = v.iter().cloned();
assert_eq!(iter.collect_array::<_, 2>(), None);
}