Feature request: a running fold adaptor #798

ronnodas · 2023-11-09T23:00:38Z

It's often useful to keep track of the intermediate results in a fold, as in Haskell's scanl, for example if you want to turn a iterator of numbers into an iterator of running totals. At first this looks like a job for scan, but it was easier to write using successors, as follows:

pub fn fold_map<I, B, F>(iter: I, init: B, mut f: F) -> impl Iterator<Item = B>
where
    I: IntoIterator,
    F: FnMut(&B, I::Item) -> B,
{
    let mut iter = iter.into_iter();
    std::iter::successors(Some(init), move |prev| iter.next().map(|x| f(prev, x)))
}

Then fold_map(1..4, 0, |x, y| x + y).collect_vec() produces [0, 1, 3, 6], see playground link.

Could a version of this be added to itertools?

The text was updated successfully, but these errors were encountered:

Philippe-Cholet · 2023-11-10T08:44:01Z

Running totals and 0, |x,y| x+y surely makes me think of python's itertools.accumulate which was a bit discussed in #705 and #147. Which is there seen with scan: (0..4).scan(0, |x, y| { *x += y; Some(*x) }).collect_vec() to have [0,1,3,6] as well. But I like your version.

That being said, I don't know if we would add a version of this:

There are alternatives but there is still some interest for a simpler intuitive way so it might be worth adding.
For me, the name fold_map is not intuitive enough yet, maybe another?
Probably just a method in Itertools trait and not a free function.

If we were to try to implement this, I would suggest to benchmark it to see if it's faster or slower than current possibilities.

ronnodas · 2023-11-10T10:45:08Z

Running totals and 0, |x,y| x+y surely makes me think of python's itertools.accumulate which was a bit discussed in #705 and #147. Which is there seen with scan: (0..4).scan(0, |x, y| { *x += y; Some(*x) }).collect_vec() to have [0,1,3,6] as well. But I like your version.

Note that the scan version is more the analog of reduce, so will not "yield" the initial value: (1..4).scan(0, |x, y| { *x += y; Some(*x) }).collect_vec() == [1, 3, 6]. This may or may not be what you want in a given application but it is easier to skip(1) than once(init).chain(...). Especially if its type is not Copy but Some(*x) does not work in that case to begin with.

The function I wrote will produce an iterator of length n + 1 if iter is of length n, which is part of why it is hard to write using scan. But the ownership issue from above is harder: scan does not allow keeping a reference to the "yielded" value (assuming the relevant types are not Copy) so the only way I found needed to run the loop "one step ahead" and use mem::replace. Even then you can't get the final value (ie the result of a usual fold) out without cloning it (or replacing with a dummy/default value) because Scan.state is private.

This is part of why I think it's a good candidate for addition to itertools: one "obvious" way you might want to write it seems tricky at best to get right.

An example where the types are not Copy: 2D prefix sums of a 2D grid

For me, the name fold_map is not intuitive enough yet, maybe another?

Absolutely not attached to the name. accumulate or fold_all or fold_running seem like other reasonable choices, but not attached to any of them either. scanl is too close to scan so not good.

Probably just a method in Itertools trait and not a free function.

I would expect to call it as iter.fold_map(...) instead of fold_map(iter, ...) if it was in itertools. Of course then iter could just be Iterator, not IntoIterator.

Philippe-Cholet · 2023-11-10T13:17:57Z

The name "accumulate" seems a reasonable name to me.

I agree that we should be able to "accumulate" without Copy, making your "successors"-way more general than the "scan"-way but definitely not obvious and therefore legitimate for itertools.

If it produces n + 1 elements then we should not implement ExactSizeIterator for the resulting iterator because it's just a bit longer. I went to look Python version: it produces n + initial.is_some() elements.
...We could do something like:

fn accumulate_from<F: FnMut(&B, Self::Item) -> B>(self, init: B, func: F) -> AccumulateFrom<Self, B, F> which would have n+1 B elements. I first thought of accumulate_with but _with usually means with a function which B is not.
fn accumulate<F: FnMut(&Self::Item, Self::Item) -> Self::Item>(self, func: F) -> Accumulate<Self, F> which would have n Self::Item elements (and implement ExactSizeIterator). Maybe with an additonal condition on Self::Item I'm not sure.
(with Accumulate probably being some type alias of AccumulateFrom).

EDIT: Working on a draft.

EDIT 2: https://github.com/Philippe-Cholet/itertools/tree/accumulate-draft
I'll surely come back to it at some point.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature request: a running fold adaptor #798

Feature request: a running fold adaptor #798

ronnodas commented Nov 9, 2023

Philippe-Cholet commented Nov 10, 2023

ronnodas commented Nov 10, 2023

Philippe-Cholet commented Nov 10, 2023 •

edited

Feature request: a running fold adaptor #798

Feature request: a running fold adaptor #798

Comments

ronnodas commented Nov 9, 2023

Philippe-Cholet commented Nov 10, 2023

ronnodas commented Nov 10, 2023

Philippe-Cholet commented Nov 10, 2023 • edited

Philippe-Cholet commented Nov 10, 2023 •

edited