Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Combinations::nth #1

Closed

Conversation

kinto-b
Copy link

@kinto-b kinto-b commented Apr 13, 2024

Hi there, this PR addresses rust-itertools#301 in the same way as rust-itertools#329. @Philippe-Cholet I was indeed wrong that this wouldn't give a substantial performance gain

cargo bench --bench specializations "combinations[1234]/nth"

combinations1/nth       time:   [4.8406 ms 4.8562 ms 4.8734 ms]
                        change: [-53.963% -53.706% -53.442%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 15 outliers among 100 measurements (15.00%)
  6 (6.00%) high mild
  9 (9.00%) high severe

combinations2/nth       time:   [4.6338 ms 4.6532 ms 4.6762 ms]
                        change: [-55.198% -54.769% -54.327%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  1 (1.00%) high mild
  10 (10.00%) high severe

combinations3/nth       time:   [4.7743 ms 4.8138 ms 4.8600 ms]
                        change: [-53.934% -53.322% -52.707%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
  8 (8.00%) high mild
  6 (6.00%) high severe

combinations4/nth       time:   [5.1255 ms 5.1524 ms 5.1841 ms]
                        change: [-53.459% -52.613% -51.852%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 18 outliers among 100 measurements (18.00%)
  6 (6.00%) high mild
  12 (12.00%) high severe

As discussed in the comments of rust-itertools#301, an even more efficient solution which would be performant even in the large n/k regime, would be to use the combinatorial number system. For record keeping sake, here's the function I sketched out before I realised that the itertools uses a 'reversed' lexicographic ordering.

    fn nth(&mut self, n: usize) -> Option<Self::Item> {
        self.state += n;

        // https://en.wikipedia.org/wiki/Combinatorial_number_system#Finding_the_k-combination_for_a_given_number
        let mut remainder = self.state;
        let mut delta = 0;
        for i in (0..self.k()).rev() {
            let mut m = i;
            while let Some(d) = checked_binomial(m, i + 1) {
                if d > remainder {
                    self.indices[i] = m - 1;
                    remainder -= delta;
                    break;
                }

                m += 1;
                delta = d;
            }
        }

        // We may need to pad out the pool
        if let Some(x) = self.indices.last() {
            self.pool.prefill(x + 1);
            if *x >= self.n() {
                return None;
            }
        }

        // We should only return an empty list once
        if self.indices.is_empty() && self.state > 0 {
            return None;
        }

        Some(self.indices.iter().map(|i| self.pool[*i].clone()).collect())
    }

It's possible to use this same idea with the lexicographic ordering used by itertools, but we'd need to know number of elements in the underlying iterator. Suppose we do, and we label this N. Then the function above would just need to be modified so that

  1. We initialise remainder = checked_binomial(N, self.k()) - n - 1
  2. We 'invert' the indices before returning the combination so that i = N-1-i and then reverse the order to ensure that the indices remain strictly increasing.

@kinto-b
Copy link
Author

kinto-b commented Apr 13, 2024

Ah sorry had the wrong tab open! Will put the PR in the right repo. How embarrassing

@kinto-b kinto-b closed this Apr 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant