Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add find_or_last method to Itertools trait #535

Merged
merged 7 commits into from May 4, 2021

Conversation

mankinskin
Copy link
Contributor

@mankinskin mankinskin commented Apr 10, 2021

Playground

fn find_or_last(mut self, predicate: impl FnMut(&Self::Item) -> bool) -> Option<Self::Item>
  • returns None if iterator is empty
  • returns Some(element) when element matches the predicate
  • returns Some(last_element) when no element matches the predicate

Related discussion on the rust user forum

@mankinskin mankinskin changed the title Add find_or_last method to Itertools trait Add find_or_last method to Itertools trait Apr 10, 2021
Copy link
Member

@phimuemue phimuemue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi there! I know of libraries that support an operation like this, so this might be useful for Itertools, too.

One question: Might it be useful for the user to distinguish between "successful search" and "last element because nothing else succeeded"? (We have a similar (or isn't it?) issue with exactly_one where a fail could occur because there are either no elements or more than one element.)

src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
@scottmcm
Copy link
Contributor

Pondering: why is last the right option? Should there also be find_or_first?

Would it be fine for people to just do .with_position().find(|x| x.is_last() || pred(x.as_inner()))? (Which admittedly doesn't compile right now since those methods don't exist on itertools::Position, but they'd be easy additions.)

Some thoughts on alternative implementation strategies:

My first thought was that this should do the fold1 trick to make the accumulator be a T instead of an Option<T> to reduce the size of the variable getting passed along:

pub fn find_or_last<T>(
    mut it: impl Iterator<Item = T>,
    mut pred: impl FnMut(&T) -> bool,
) -> Option<T> {
    let first = it.next()?;
    Some(
        it.try_fold(first, |_, x| if pred(&x) { Err(x) } else { Ok(x) })
            .map_or_else(|x| x, |x| x),
    )
}

But the previous accumulator being unused lead to my second thought, namely that if the code doesn't need move access to the accumulator, it probably shouldn't thread it at all. That would look something like this:

pub fn find_or_last<T>(
    mut it: impl Iterator<Item = T>,
    mut pred: impl FnMut(&T) -> bool,
) -> Option<T> {
    let mut prev = None;
    it.try_for_each(|x| if pred(&x) { Err(x) } else { Ok(prev = Some(x)) })
        .err()
        .or(prev)
}

Which I find pretty elegant, actually. (And avoids the extra .next() callsite of the previous one, which generally helps binary size.)

@mankinskin
Copy link
Contributor Author

mankinskin commented Apr 10, 2021

@scottmcm you might want to look at this reply by steffahn, where he suggests the same approach, with some more refactoring. The result would look like this:

fn find_or_last<P>(mut self, mut predicate: P) -> Option<Self::Item>
    where Self: Sized,
          P: FnMut(&Self::Item) -> bool,
{
    let mut prev = None;
    self.find_map(|x| if predicate(&x) { Some(x) } else { prev = Some(x); None })
        .or(prev)
}

if I am not mistaken.

@mankinskin
Copy link
Contributor Author

mankinskin commented Apr 10, 2021

About the option of using with_position, this would probably be less efficient, because it wraps each value in an enum and needs to peek for the next element.

But with_position could probably be used to satisfy more use-cases, if we got that code example you gave to compile.

About why I picked the last value: In my use-case it really doesn't matter which value is returned, I just intuitively thought taking the last value would be the most efficient. But, as it turns out, getting the last value requires an accumulator, so it is probably less efficient than taking the first element.

using accumulator variable and find_map
@mankinskin
Copy link
Contributor Author

I suppose a find_or_first would just look like this?

fn find_or_first<P>(mut self, mut predicate: P) -> Option<Self::Item>
    where Self: Sized,
          P: FnMut(&Self::Item) -> bool,
{
    let first = self.next()?;
    Some(if predicate(&first) {
        first
    } else {
        self.find(|x| predicate(&x)).unwrap_or(first)
    })
}

Copy link
Member

@phimuemue phimuemue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies if this review is too brief, I do not have much time at the moment.

As far as I can see, the implementations look ok, although I preferred the try_fold approach for some reason... Cannot really substantiate this feeling. Maybe it would lead to similar implementations for find_or_last and find_or_first? Anyways, my feeling should not be a blocker.

My remark regarding the return type stands: Maybe some users want to differentiate between "nothing found because iterator empty" and "nothing found, so the last (resp. first) element was returned instead". On the other hand, we could include your change and wait for actual complaints from users.

@SkiFire13
Copy link
Contributor

although I preferred the try_fold approach for some reason

Note that custom iterators can't implement try_fold because it requires the unstable Try trait, however they can implement find_map

@mankinskin
Copy link
Contributor Author

mankinskin commented Apr 14, 2021

@phimuemue yes, the idea is still good, and I don't see how it would hurt anything, except maybe requiring something like an .into_inner() call on the output to convert it to an Option, in case you don't care about how the value was selected:

enum FindOrDefaultResult<T> {
    Found(T),
    NotFound(T),
    Empty,
}
impl<T> FindOrDefaultResult<T> {
    pub fn into_inner(self) -> Option<T> {
        match self {
            Self::Found(x) |
            Self::NotFound(x) => Some(x),
            Self::Empty => None,
        }
    }
}

Theoretically this could even lead to more kinds of find_or_default implementations, where the default somehow depends on the iterator.. for example by using an enum to specify how to select the default from the iterator:

enum DefaultSelector {
    First,
    Last,
    AtPosition(usize),
    Random(Box<dyn rand::RngCore>),
    ...
}

But it is probably better to wait for someone who actually needs something like this to make a proposal, rather than blindly implementing something.

@mankinskin
Copy link
Contributor Author

I am unsure now, should I implement the FindOrDefaultResult too, or do we leave it as it is?

@phimuemue
Copy link
Member

I am unsure now, should I implement the FindOrDefaultResult too, or do we leave it as it is?

I'd leave it as it is and wait for @jswrenn 's opinion. Tbh, I second your idea of not blindly implementing abstractions that may not even be needed.

Copy link
Member

@jswrenn jswrenn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to merge this as-is. We can revisit improvements in the future, as needed.

bors r+

@jswrenn jswrenn added this to the next milestone May 4, 2021
@bors
Copy link
Contributor

bors bot commented May 4, 2021

Build succeeded:

@bors bors bot merged commit 271a341 into rust-itertools:master May 4, 2021
@mankinskin mankinskin deleted the find-or-last branch July 25, 2021 15:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants