Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggest using iterators instead of indexing #287

Open
rfortiz opened this issue Jan 7, 2023 · 0 comments
Open

Suggest using iterators instead of indexing #287

rfortiz opened this issue Jan 7, 2023 · 0 comments

Comments

@rfortiz
Copy link

rfortiz commented Jan 7, 2023

Using iterators instead of indexing is another possible optimization. At the end of the speedup refactoring: https://rustwasm.github.io/docs/book/game-of-life/time-profiling.html

At this point, the next lowest hanging fruit for speeding up Universe::tick is removing the allocation and free.

Rust does extra bound checks when indexing arrays. I think in simple cases the compiler can optimize it away (e.g. simple loop with a global .len() check) but that's unlikely with the convolution-like operator needed for GoL. Iterators might also lead to better memory access patterns although it's hard to verify.

I played a bit around and got universe.tick() to run ~2x faster than the version without the modulo. Caveat: measured with criterion, not as wasm, only measured for a 1024x768 grid.

I moved this logic to Cell:

impl Cell {
    fn evolution(self, live_neighbors: u8) -> Cell {
        match (self, live_neighbors) {
            // Rule 1: Any live cell with fewer than two live neighbours
            // dies, as if caused by underpopulation.
            (Cell::Alive, x) if x < 2 => Cell::Dead,
            // Rule 2: Any live cell with two or three live neighbours
            // lives on to the next generation.
            (Cell::Alive, 2) | (Cell::Alive, 3) => Cell::Alive,
            // Rule 3: Any live cell with more than three live
            // neighbours dies, as if by overpopulation.
            (Cell::Alive, x) if x > 3 => Cell::Dead,
            // Rule 4: Any dead cell with exactly three live neighbours
            // becomes a live cell, as if by reproduction.
            (Cell::Dead, 3) => Cell::Alive,
            // All other cells remain in the same state.
            (otherwise, _) => otherwise,
        }
    }
}

and replaced tick by:

pub fn tick(&mut self) {
        let mut next = self.cells.clone();

        let width = self.width as usize;
        let height = self.height as usize;

        let next_rows = next.chunks_exact_mut(width);
        let rows_before = self.cells.chunks_exact(width).cycle().skip(height - 1);
        let rows = self.cells.chunks_exact(width);
        let rows_after = self.cells.chunks_exact(width).cycle().skip(1);

        for (next_row, row_b, row, row_a) in izip!(next_rows, rows_before, rows, rows_after) {
            // manually count active nb for first cell in row
            let first_count = row_b[0] as u8
                + row_b[1] as u8
                + row_b[row_b.len() - 1] as u8
                + row[1] as u8
                + row[row.len() - 1] as u8
                + row_a[0] as u8
                + row_a[1] as u8
                + row_a[row_a.len() - 1] as u8;
            next_row[0] = row[0].evolution(first_count);

            // 3 sliding windows (size=3)
            let next_cells = next_row.iter_mut().skip(1);
            let window_before = row_b.windows(3);
            let window = row.windows(3);
            let window_after = row_a.windows(3);
            for (next_cell, w_b, w, w_a) in izip!(next_cells, window_before, window, window_after) {
                let count = w_b.iter().map(|&v| v as u8).sum::<u8>()
                    + w[0] as u8
                    + w[2] as u8
                    + w_a.iter().map(|&v| v as u8).sum::<u8>();
                *next_cell = w[1].evolution(count);
            }

            // manually count active nb for last cell in row
            let last_count = row_b[0] as u8
                + row_b[row_b.len() - 2] as u8
                + row_b[row_b.len() - 1] as u8
                + row[0] as u8
                + row[row.len() - 2] as u8
                + row_a[0] as u8
                + row_a[row_a.len() - 2] as u8
                + row_a[row_a.len() - 1] as u8;
            next_row[next_row.len() - 1] = row[row.len() - 1].evolution(last_count);
        }

        self.cells = next;
    }

Granted it's not the most readable and there is probably a nicer way but it might be good to remind that Rust array indexing is not zero-cost.

(it's using izip from the itertools crate but could nest calls to zip to avoid extra dependency)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant