Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: a running fold adaptor #798

Open
ronnodas opened this issue Nov 9, 2023 · 3 comments
Open

Feature request: a running fold adaptor #798

ronnodas opened this issue Nov 9, 2023 · 3 comments

Comments

@ronnodas
Copy link

ronnodas commented Nov 9, 2023

It's often useful to keep track of the intermediate results in a fold, as in Haskell's scanl, for example if you want to turn a iterator of numbers into an iterator of running totals. At first this looks like a job for scan, but it was easier to write using successors, as follows:

pub fn fold_map<I, B, F>(iter: I, init: B, mut f: F) -> impl Iterator<Item = B>
where
    I: IntoIterator,
    F: FnMut(&B, I::Item) -> B,
{
    let mut iter = iter.into_iter();
    std::iter::successors(Some(init), move |prev| iter.next().map(|x| f(prev, x)))
}

Then fold_map(1..4, 0, |x, y| x + y).collect_vec() produces [0, 1, 3, 6], see playground link.

Could a version of this be added to itertools?

@Philippe-Cholet
Copy link
Member

Running totals and 0, |x,y| x+y surely makes me think of python's itertools.accumulate which was a bit discussed in #705 and #147. Which is there seen with scan: (0..4).scan(0, |x, y| { *x += y; Some(*x) }).collect_vec() to have [0,1,3,6] as well. But I like your version.

That being said, I don't know if we would add a version of this:

  • There are alternatives but there is still some interest for a simpler intuitive way so it might be worth adding.
  • For me, the name fold_map is not intuitive enough yet, maybe another?
  • Probably just a method in Itertools trait and not a free function.

If we were to try to implement this, I would suggest to benchmark it to see if it's faster or slower than current possibilities.

@ronnodas
Copy link
Author

Running totals and 0, |x,y| x+y surely makes me think of python's itertools.accumulate which was a bit discussed in #705 and #147. Which is there seen with scan: (0..4).scan(0, |x, y| { *x += y; Some(*x) }).collect_vec() to have [0,1,3,6] as well. But I like your version.

Note that the scan version is more the analog of reduce, so will not "yield" the initial value: (1..4).scan(0, |x, y| { *x += y; Some(*x) }).collect_vec() == [1, 3, 6]. This may or may not be what you want in a given application but it is easier to skip(1) than once(init).chain(...). Especially if its type is not Copy but Some(*x) does not work in that case to begin with.

The function I wrote will produce an iterator of length n + 1 if iter is of length n, which is part of why it is hard to write using scan. But the ownership issue from above is harder: scan does not allow keeping a reference to the "yielded" value (assuming the relevant types are not Copy) so the only way I found needed to run the loop "one step ahead" and use mem::replace. Even then you can't get the final value (ie the result of a usual fold) out without cloning it (or replacing with a dummy/default value) because Scan.state is private.

This is part of why I think it's a good candidate for addition to itertools: one "obvious" way you might want to write it seems tricky at best to get right.

An example where the types are not Copy: 2D prefix sums of a 2D grid

For me, the name fold_map is not intuitive enough yet, maybe another?

Absolutely not attached to the name. accumulate or fold_all or fold_running seem like other reasonable choices, but not attached to any of them either. scanl is too close to scan so not good.

Probably just a method in Itertools trait and not a free function.

I would expect to call it as iter.fold_map(...) instead of fold_map(iter, ...) if it was in itertools. Of course then iter could just be Iterator, not IntoIterator.

@Philippe-Cholet
Copy link
Member

Philippe-Cholet commented Nov 10, 2023

The name "accumulate" seems a reasonable name to me.

I agree that we should be able to "accumulate" without Copy, making your "successors"-way more general than the "scan"-way but definitely not obvious and therefore legitimate for itertools.

If it produces n + 1 elements then we should not implement ExactSizeIterator for the resulting iterator because it's just a bit longer. I went to look Python version: it produces n + initial.is_some() elements.
...We could do something like:

  • fn accumulate_from<F: FnMut(&B, Self::Item) -> B>(self, init: B, func: F) -> AccumulateFrom<Self, B, F> which would have n+1 B elements. I first thought of accumulate_with but _with usually means with a function which B is not.
  • fn accumulate<F: FnMut(&Self::Item, Self::Item) -> Self::Item>(self, func: F) -> Accumulate<Self, F> which would have n Self::Item elements (and implement ExactSizeIterator). Maybe with an additonal condition on Self::Item I'm not sure.
  • (with Accumulate probably being some type alias of AccumulateFrom).

EDIT: Working on a draft.

EDIT 2: https://github.com/Philippe-Cholet/itertools/tree/accumulate-draft
I'll surely come back to it at some point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants