Improve performance of .fold() #574

Merged
merged 2 commits into from Dec 15, 2018

Conversation

jturner314
Member

This PR does two things that improve the performance of .fold() for ArrayBase:

  1. Specialize the process of finding the inner axis for 2-D arrays.
  2. Implement .fold() for the Axes iterator. This is an improvement for arrays with more than 2 axes.
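To illustrate what "finding the inner axis" can mean, here is a minimal, self-contained sketch. It assumes the inner axis is the non-trivial axis with the smallest absolute stride, with ties going to the later axis. The names `inner_axis_2d` and `inner_axis_nd` are hypothetical and this is not the code in the PR; it only shows why a 2-D special case can skip the general loop.

```rust
/// Hypothetical 2-D fast path: a few comparisons, no loop.
/// Axes of length <= 1 are ignored because their stride is irrelevant.
fn inner_axis_2d(shape: [usize; 2], strides: [isize; 2]) -> usize {
    match (shape[0] > 1, shape[1] > 1) {
        // Both axes matter: pick the smaller absolute stride, ties go to axis 1.
        (true, true) => {
            if strides[0].abs() < strides[1].abs() { 0 } else { 1 }
        }
        // Only axis 0 has more than one element.
        (true, false) => 0,
        // Otherwise default to the last axis.
        _ => 1,
    }
}

/// Hypothetical general version: scan all axes from last to first and keep
/// the non-trivial axis with the smallest absolute stride.
fn inner_axis_nd(shape: &[usize], strides: &[isize]) -> usize {
    assert_eq!(shape.len(), strides.len());
    assert!(!shape.is_empty());
    let mut best = shape.len() - 1;
    let mut best_stride: Option<isize> = None;
    for axis in (0..shape.len()).rev() {
        if shape[axis] <= 1 {
            continue; // trivial axis, skip
        }
        let s = strides[axis].abs();
        // Strict comparison, so on ties the later axis (seen first) wins.
        if best_stride.map_or(true, |b| s < b) {
            best = axis;
            best_stride = Some(s);
        }
    }
    best
}
```

For a 3x3 window into a row-major 320x320 array the strides are `[320, 1]`, so both functions pick axis 1 in this sketch.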

For the function below with a 320x320 input array, this PR improves the performance by ~50%.

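use ndarray::{Array2, ArrayView2, Zip};

const RADIUS: usize = 1;
const WINDOW_SIZE: usize = 2 * RADIUS + 1;

// For every WINDOW_SIZE x WINDOW_SIZE window, sum the squared differences
// between each element and the window's center element.
fn sum_sq_diff_windows(data: ArrayView2<f64>) -> Array2<f64> {
    // One output element per window position.
    let mut out = Array2::zeros((data.rows() - 2 * RADIUS, data.cols() - 2 * RADIUS));
    Zip::from(&mut out)
        .and(data.windows((WINDOW_SIZE, WINDOW_SIZE)))
        .apply(|out, window| {
            let center = window[(RADIUS, RADIUS)];
            // .fold() is called once per window, so its fixed cost dominates.
            *out = window.fold(0., |acc, x| acc + (x - center).powi(2));
        });
    out
}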

@bluss
Member

bluss commented Dec 15, 2018

Nice. So it's a win even in a 3x3 array?

@bluss bluss merged commit 03552e2 into rust-ndarray:master Dec 15, 2018
@jturner314
Member Author

jturner314 commented Dec 15, 2018

So it's a win even in a 3x3 array?

Yes, each window is 3x3, and this PR significantly improves the performance because it reduces the cost of each .fold() call. (It doesn't improve the iteration performance within .fold(); it reduces the cost of determining the iteration order.)
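The distinction between per-call setup and per-element iteration can be sketched without ndarray. The `fold_2d` function below is a hypothetical stand-in, assuming a flat `f64` buffer with non-negative strides; it is not ndarray's implementation. The axis choice at the top runs once per call, and the nested loop runs once per element, so for a 3x3 window the setup is a large share of the total work.

```rust
/// Hypothetical strided 2-D fold over a flat buffer.
/// `offset` is the index of the view's first element in `data`.
fn fold_2d<A, F>(
    data: &[f64],
    offset: usize,
    shape: [usize; 2],
    strides: [usize; 2],
    init: A,
    mut f: F,
) -> A
where
    F: FnMut(A, f64) -> A,
{
    // Per-call setup: decide which axis is looped over innermost.
    let inner = if strides[0] < strides[1] { 0 } else { 1 };
    let outer = 1 - inner;

    // Per-element work: visit elements with the small stride innermost.
    let mut acc = init;
    for i in 0..shape[outer] {
        let base = offset + i * strides[outer];
        for j in 0..shape[inner] {
            acc = f(acc, data[base + j * strides[inner]]);
        }
    }
    acc
}

/// Sum of squared differences from the center of a 3x3 view,
/// mirroring the closure used in the PR description.
fn sum_sq_diff_3x3(data: &[f64], offset: usize, strides: [usize; 2]) -> f64 {
    let center = data[offset + strides[0] + strides[1]];
    fold_2d(data, offset, [3, 3], strides, 0.0, |acc, x| {
        acc + (x - center).powi(2)
    })
}
```

The result does not depend on which axis is chosen as inner here; only the memory access order changes.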

@jturner314 jturner314 deleted the optimize-fold branch December 15, 2018 20:23
@bluss
Member

bluss commented Dec 15, 2018

Ah, of course. Thanks
