Naive implementation of cartesian_power. #486

iago-lito · 2020-10-08T21:19:05Z

Hello, this is my first contribution to a rust library, can somebody guide me a little out here?

The idea is to complete the cartesian_product logic with a cartesian_power adaptor. It scrolls the cartesian product of an iterator with itself pow times. For instance with the set abc and pow=4, it yields:

aaaa (0)
aaab (1)
aaac (2)
aaba (3)
aabb (4)
aabc (5)
aaca (6)
aacb (7)
aacc (8)
abaa (9)
...
cccc (80 = 3^4 - 1)

Put it another way, it's supposedly equivalent to iproduct!("abc", "abc", "abc", "abc") except for the technical types quirks.

I have a few unresolved questions, which I need help with:

Dummy stuff first: the diff is big because I thought it was okay to hit the rustfmt button, was I wrong? Should I reverse formatting?
I cannot get the size_hint test to pass because it just.. times out, although I'm unsure why. How am I supposed to correctly test it?
What am I supposed to add in the bench files?
Deepest questions last:
How could the Self::Item type be more generic in this case? I have chosen Vec to start with, but the problem is that the expected returned length actually depends on the pow argument.
- Should I leave it as-is and wait for generic consts to be available?
- Can I improve it by making the returned collection type (here Vec) a type parameter? If yes, how?

This is it, thank you for your patience, I'd be happy to read your feedback :)

jswrenn · 2020-10-08T22:17:21Z

Some quick answers:

Dummy stuff first: the diff is big because I thought it was okay to hit the rustfmt button, was I wrong? Should I reverse formatting?

Yes, please! The smaller the PR, the better, since I need to review every changed line before merging. We're really not that particular about formatting. Just indent with four spaces, and I'm sure it'll be fine.

What am I supposed to add in the bench files?

You don't need to add anything!

I'll need a little more time before I can answer your other questions.

phimuemue · 2020-10-09T07:40:05Z

Hi there. Foremost, thank you for your upfront effort to clarify the modalities.

I hope I do not miss anything, but what is the difference to combinations_with_replacement?

If it is supposed to do the same, you still might look into the existing implementation, and e.g. see if it has a decent size_hint.

iago-lito · 2020-10-09T07:44:08Z

@jswrenn Okay, I'll reverse that so it'll be easier for you to read. Don't get tired scrolling meanwhile :)
BTW would you accept a separate PR with only a thorough rustfmt pass to standardize whitespace in the whole project?

@phimuemue The difference is order, combination_with_replacement is insensitive to order, so it only outputs aab, aba OR baa, correct? On the other hand cartesian_power will output aab, aba AND baa.

phimuemue · 2020-10-09T07:58:41Z

@phimuemue The difference is order, combination_with_replacement is insensitive to order, so it only outputs aab, aba OR baa, correct? On the other hand cartesian_power will output aab, aba AND baa.

Thanks - Too early to read code...

iago-lito · 2020-10-09T10:25:19Z

@jswrenn I have reversed the rustfmt pass so the diff with master will be easier for you to read :)

phimuemue

Hi there, thanks for the ideas in this PR.

When we do this, we should possibly ask ourselves inhowfar Power is similar to CombinationsWithReplacement. (To be honest, it seems quite similar to me, so I would expect similar inner workings, too. As always, other thought-out opinions are welcome.)

Should we go with Vec or with LazyBuffer?
Should we have a Vec<I>, iterating on demand to get the next element, or should we go with the index-based approach like in CombinationsWithReplacement?

phimuemue · 2020-10-09T12:17:49Z

src/adaptors/mod.rs

+                Some(res)
+            }
+            // Check trivial case last as it should be less frequent.
+            Power::Empty => None,


For pow==0/Power::Empty, shouldn't the iterator should yield a single element (namely the empty vector)? (Instead of yielding no elements at all.)

This would ensure that it.cartesian_power(n) yields it.count().pow(n) elements. (We had a similar case back then where (I think it was) combinations had a special case for n==0, which lead to inconsitencies.)

It would possibly also help in getting rid of some special casings (i.e. possible eliminate Power).

@phimuemue Well, I had this vague feeling I was neglicting something sneaky here. Thank you for pointing it out.
The sneaky stuff is that cartesian_product("abc", 0) should not yield the same as cartesian_product("", 3). And I bet that your criterion it.count().pow(n) is the right thing to consider. If we agree that:

2 1 0

2 2^2=4 2^1=2 2^0=1

1 1^2=1 1^1=1 1^1=1

0 0^2=0 0^1=0 0^0=1?

Then the adaptor should yield:

2 1 0

"ab" ["aa", "ab", "ba", "bb"] ["a", "b"] [""]

"a" ["aa"] ["a"] [""]

"" [] [] [""]?

So there is not 1 degenerated case, but 2 or 3, depending on the convention we pick for the 0^0 situation. I propose we stick to the "combinatorics" convention that 0^0=1, so there are only 2 corner cases. The only other solution I can think of is erroring or panicking instead when encountering cartesian_product("", 0).

I'll update the PR to better accomodate these degenerated situations :)

[from the future] see 9027ef0

src/adaptors/mod.rs

jswrenn · 2020-10-09T14:59:40Z

How could the Self::Item type be more generic in this case? I have chosen Vec to start with, but the problem is that the expected returned length actually depends on the pow argument.

Here's another way to look at it: the pow argument equals the returned length, and is thus redundant. You could nix the argument and determine the appropriate Item with a type parameter instead.

iago-lito · 2020-10-09T17:58:53Z

inhowfar Power is similar to CombinationsWithReplacement. (To be honest, it seems quite similar to me, so I would expect similar inner workings, too.

@phimuemue I think it is similar in its signature, because they both yield iterables whose length is eventually decided by the user. I've tried to inspire from the combination_with_replacement signature.

This said, I doubt it implies that the inner workings should resemble each other. I could think of a very inefficient implementation of cartesian_product chaining combination_with_replacement, permutations then sort, but I'm not sure this helps much, except maybe for explaining what it does?

iago-lito · 2020-10-09T18:05:47Z

@phimuemue

Should we go with Vec or with LazyBuffer?
Should we have a Vec<I>, iterating on demand to get the next element, or should we go with the index-based approach like in CombinationsWithReplacement?

I'm not exactly sure what you mean, but this is sure something I would like to improve. My most flexible idea so far would be to make the user provide any FromIterator<I::Item> type instead of Vec, used as a type argument <C> in next so the procedure would build the return value with let res = state.iter().cloned().collect::<C>(); instead of let res: Vec<_> = state.clone();. But I couldn't get this right.

The other option I like is what @jswrenn suggests:

You could nix the argument and determine the appropriate Item with a type parameter instead.

If you mean that the user would do something like it.cartesian_product<[char; 4]>(), then I would also find it awesome, but I'm afraid it requires generic consts to be featured first?

phimuemue · 2020-10-10T10:06:47Z

The other option I like is what @jswrenn suggests:

You could nix the argument and determine the appropriate Item with a type parameter instead.

If you mean that the user would do something like it.cartesian_product<[char; 4]>(), then I would also find it awesome, but I'm afraid it requires generic consts to be featured first?

If we want the type-based approach, I could imagine it is possible to do it right now - with an upper bound on the array length.

However, the type-based approach would prevent us from specifying pow at runtime. So, I suggest we first make clear if we want to have a dynamic/run-time pow, or if we want to go with a type-based approach. My guess is that if you know the array length at runtime, you could easily go with iproduct or a nested cartesian_product.

The following probably only applies if we go with the dynamic/runtime approach:

@phimuemue

Should we go with Vec or with LazyBuffer?
Should we have a Vec<I>, iterating on demand to get the next element, or should we go with the index-based approach like in CombinationsWithReplacement?

I'm not exactly sure what you mean, but this is sure something I would like to improve. My most flexible idea so far would be to make the user provide any FromIterator<I::Item> type instead of Vec, used as a type argument <C> in next so the procedure would build the return value with let res = state.iter().cloned().collect::<C>(); instead of let res: Vec<_> = state.clone();. But I couldn't get this right.

CombinationsWithReplacement/Combinations does not require that the iterator is Clone, as it only clones the elements - this is in contrast with your proposal, and I would assume that users expect requirements similar to those of CombinationsWithReplacement/Combinations.

Moreover, CombinationsWithReplacement/Combinations does not unconditionally exhaust the iterator (it uses LazyBuffer, only advancing the iterator on demant) - in contrast to your proposal. Again, I would expect cartesian_power to behave similarily.

Please note that the existing implementations may not be better than your suggestion. All I advocate for is uniform behavior. Still, the existing implementation at least avoids cloneing the iterator, and goes with only cloning the elements instead.

iago-lito · 2020-10-10T10:47:08Z

@phimuemue Okay, I'll have a deeper look into CombinationWithReplacement internals so I can answer you better.

I agree about the runtime pow vs. compile time pow valuation. These may actually require different implementations, maybe a (procedural?) macro like ipower!(my_it, 8) would do the trick for the compile time use case?

My guess is that if you know the array length at runtime, you could easily go with iproduct or a nested cartesian_product.

I disagree, because iproduct!(my_it, my_it, ..., my_it) yield tuples, so you cannot iterate on them. As a user, since I know that the tuple will be uniform in types, I would rather have an array or an iterable instead, which makes ipower!(my_it, 8) different in signature. This said, I suspect the same argument holds for combination_with_replacement, have you already have it?

Regarding runtime pow valuation, I'm okay to consider fixed-sized arrays up to 32 (or maybe more?). I also agree that uniform behaviour is a desirable thing.

Before I understand CombinationWithReplacement deeply, I would like to ask whether there exists clear-cut implementation recommendations valid throughout the library. Put it simply, what's the policy regarding adaptors implementation tradeoffs:

maximize flexibility with respect to the type system? (user is free to pick Vec, [I::Item; n], HashSet, String as yielded items)
maximize execution speed?
minimize RAM usage?
minimize heap allocations?
minimize calls to I::next()?
minimize calls to I::Item::clone()?
minimize calls to I::clone()?

For instance, I can think of an alternate implementation of cartesian_power that does not use I::clone(), but it requires buffering the whole iterator sequence, which may be not desirable if I turns out to be really long.

phimuemue · 2020-10-10T11:08:33Z

Before I understand CombinationWithReplacement deeply, I would like to ask whether there exists clear-cut implementation recommendations valid throughout the library. Put it simply, what's the policy regarding adaptors implementation tradeoffs:
* maximize flexibility with respect to the type system? (user is free to pick `Vec`, `[I::Item; n]`, `HashSet`, `String` as yielded items)
* maximize execution speed?
* minimize RAM usage?
* minimize heap allocations?
* minimize calls to `I::next()`?
* minimize calls to `I::Item::clone()`?
* minimize calls to `I::clone()`?

These are good questions indeed. I do not know if we have specific recommendations. I guess we are usually open to discuss the various tradeoffs - if we do not have any precedence.

If we, however, have precedence, I strongly tend to follow existing patterns. It enforces uniformity, and builds upon possibly good decisions already made by others. If an existing design turns out to be bad, I tend to first fix that, and then build the new thing. In various past situations this technique simplified the code base quite a bit.

In this case, and if we go with the runtime approach, this might even mean looking at CombinationsWithReplacement and Combinations as a first step (to see if they should actually be more similar (see e.g. #487) and if they can be reused for cartesian_power).

But maybe wait for @jswrenn's opinion on all this.

iago-lito · 2020-12-04T09:48:03Z

@jswrenn Friendly ping. I suspect you're maybe busy with safe transmute or other stuff. Feel free to ignore me if it's the case, there's no rush here :)

DusterTheFirst · 2022-03-04T13:12:30Z

How would this implementation differ/be better than something like this:

use std::fmt::Debug;

use itertools::Itertools;

fn cartesian_power<const N: usize, T, I>(iterator: I) -> impl Iterator<Item = [T; N]>
where
    T: Debug + Clone,
    I: Iterator<Item = T> + Clone,
{
    (0..N)
        .map(|_i| iterator.clone())
        .multi_cartesian_product()
        .map(|vec| TryInto::<[T; N]>::try_into(vec).unwrap())
}

macro_rules! cartesian_power {
    ($iterator:expr, $count:expr) => {
        $crate::cartesian_power::<$count, _, _>($iterator)
    };
}

fn main() {
    for [x, y, z] in cartesian_power::<3, _, _>(0..4) {
        println!("({x},{y},{z})");
    }

    for [a, b, c, d, e] in cartesian_power!(0..4, 5) {
        println!("({a},{b},{c},{d},{e})");
    }
}

iago-lito · 2022-03-13T10:27:34Z

@DusterTheFirst I guess it would differ in one or several of the tradeoffs listed here, but IIRC we're still waiting on @jswrenn input here, are we? There was questionning about the (existence of a) general implementation policy in itertools.

Naive implementation of cartesian_power.

cbc2c3f

Reverse rustfmt pass.

8e61fb9

Add missing use statements.

8c5f630

phimuemue reviewed Oct 9, 2020

View reviewed changes

Distinguish different degenerated cases.

9027ef0

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Naive implementation of cartesian_power. #486

Naive implementation of cartesian_power. #486

iago-lito commented Oct 8, 2020 •

edited

jswrenn commented Oct 8, 2020

phimuemue commented Oct 9, 2020 •

edited

iago-lito commented Oct 9, 2020

phimuemue commented Oct 9, 2020

iago-lito commented Oct 9, 2020 •

edited

phimuemue left a comment

phimuemue Oct 9, 2020

iago-lito Oct 9, 2020 •

edited

jswrenn commented Oct 9, 2020

iago-lito commented Oct 9, 2020

iago-lito commented Oct 9, 2020

phimuemue commented Oct 10, 2020 •

edited

iago-lito commented Oct 10, 2020

phimuemue commented Oct 10, 2020

iago-lito commented Dec 4, 2020

DusterTheFirst commented Mar 4, 2022

iago-lito commented Mar 13, 2022

	2	1	0
"ab"	`["aa", "ab", "ba", "bb"]`	`["a", "b"]`	`[""]`
"a"	`["aa"]`	`["a"]`	`[""]`
""	`[]`	`[]`	`[""]`?

Naive implementation of cartesian_power. #486

Are you sure you want to change the base?

Naive implementation of cartesian_power. #486

Conversation

iago-lito commented Oct 8, 2020 • edited

jswrenn commented Oct 8, 2020

phimuemue commented Oct 9, 2020 • edited

iago-lito commented Oct 9, 2020

phimuemue commented Oct 9, 2020

iago-lito commented Oct 9, 2020 • edited

phimuemue left a comment

Choose a reason for hiding this comment

phimuemue Oct 9, 2020

Choose a reason for hiding this comment

iago-lito Oct 9, 2020 • edited

Choose a reason for hiding this comment

jswrenn commented Oct 9, 2020

iago-lito commented Oct 9, 2020

iago-lito commented Oct 9, 2020

phimuemue commented Oct 10, 2020 • edited

iago-lito commented Oct 10, 2020

phimuemue commented Oct 10, 2020

iago-lito commented Dec 4, 2020

DusterTheFirst commented Mar 4, 2022

iago-lito commented Mar 13, 2022

iago-lito commented Oct 8, 2020 •

edited

phimuemue commented Oct 9, 2020 •

edited

iago-lito commented Oct 9, 2020 •

edited

iago-lito Oct 9, 2020 •

edited

phimuemue commented Oct 10, 2020 •

edited