Pairwise summation #577

Open: wants to merge 43 commits into base: master

43 commits
483b29a
Add function declaration for pairwise_sum
LukeMathWalker Jan 4, 2019
75860b6
Base case: array with 512 elements
LukeMathWalker Jan 4, 2019
b8304c0
Base case: use unrolled_fold
LukeMathWalker Jan 4, 2019
ec3722a
Implemented algorithm for not-base case branch
LukeMathWalker Jan 4, 2019
70d32b7
Implemented pairwise summation algorithm for an iterator parameter
LukeMathWalker Jan 4, 2019
c66dc98
Implemented pairwise summation algorithm for an iterator parameter wi…
LukeMathWalker Jan 4, 2019
175f9a2
Refactored: use fold to sum
LukeMathWalker Jan 4, 2019
8e403af
Refactored: use a constant to reuse size value of recursion base case…
LukeMathWalker Jan 4, 2019
465063f
Added documentation
LukeMathWalker Jan 4, 2019
6427d45
Minor edits to the docs
LukeMathWalker Jan 4, 2019
4414450
Don't forget to add the sum of the last elements (<512 ending block).
LukeMathWalker Jan 4, 2019
aeaad0e
Add a benchmark for summing a contiguous array
LukeMathWalker Jan 5, 2019
75109b1
Benchmarks for arrays of different length
LukeMathWalker Jan 5, 2019
3085194
Don't split midpoint, saving one operation
LukeMathWalker Jan 5, 2019
797e212
Revert "Don't split midpoint, saving one operation"
LukeMathWalker Jan 5, 2019
d2b636b
Benches for sum_axis
LukeMathWalker Jan 5, 2019
b3d2b42
Bench for contiguous sum with integer values
LukeMathWalker Jan 5, 2019
8f95705
Alternative implementation for sum_axis
LukeMathWalker Jan 9, 2019
74a74ae
Revert "Alternative implementation for sum_axis"
LukeMathWalker Jan 9, 2019
a592a7d
Ensure equal block size independently of underlying implementation
LukeMathWalker Jan 9, 2019
f73fb2d
Change threshold names
LukeMathWalker Jan 22, 2019
c7fa091
Change sum_axis implementation
LukeMathWalker Jan 22, 2019
f72164a
Reduce partial accumulators pairwise in unrolled_fold
LukeMathWalker Jan 22, 2019
9f1c4d2
Remove unused imports
LukeMathWalker Jan 22, 2019
bbc4a75
Get uniform behaviour across all pairwise_sum implementations
LukeMathWalker Jan 22, 2019
b98e30b
Add more benchmarks of sum/sum_axis
jturner314 Feb 3, 2019
ed88e2e
Improve performance of iterator_pairwise_sum
jturner314 Feb 3, 2019
e7835ee
Make sum pairwise over all dimensions
jturner314 Feb 3, 2019
8301c25
Implement contiguous sum_axis in terms of Zip
jturner314 Feb 3, 2019
82453df
Remove redundant len_of call
jturner314 Feb 3, 2019
1d51f70
Merge pull request #3 from jturner314/pairwise-summation
LukeMathWalker Feb 3, 2019
978f45a
Added test for axis independence
LukeMathWalker Feb 3, 2019
fa0ba30
Make sure we actually exercise the pairwise technique
LukeMathWalker Feb 3, 2019
b4136d7
Test discontinuous arrays
LukeMathWalker Feb 3, 2019
4a63cb3
Add more integer benchmark equivalents
LukeMathWalker Feb 3, 2019
f306b5f
Fix min_stride_axis to prefer axes with length > 1
jturner314 Feb 3, 2019
b7951df
Specialize min_stride_axis for Ix3
jturner314 Feb 3, 2019
3326de4
Enable min_stride_axis as pub(crate) method
jturner314 Feb 3, 2019
65b6046
Simplify fold to use min_stride_axis
jturner314 Feb 3, 2019
b0b391a
Improve performance of sum in certain cases
jturner314 Feb 3, 2019
7f04e6f
Update quickcheck and use quickcheck_macros
jturner314 Feb 3, 2019
1ed1a63
Clarify capacity calculation in iterator_pairwise_sum
jturner314 Feb 4, 2019
1e88385
Merge pull request #4 from jturner314/pairwise-summation
LukeMathWalker Feb 4, 2019
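
The commit messages above describe a recursive scheme: sum short blocks directly (the base case is given as 512 elements) and combine larger inputs by splitting them and adding the partial sums. As a rough illustration only, and not the code in this PR, the idea might be sketched as follows. The names `BASE_CASE_LEN` and `pairwise_sum_sketch` are hypothetical, the input is assumed to be a plain `f64` slice rather than an ndarray type, and the base case uses a simple fold instead of the PR's `unrolled_fold`.

```rust
/// Block length at or below which the sketch sums naively.
/// 512 is the base-case size named in the commit list; the constant
/// name is an assumption made for this example.
const BASE_CASE_LEN: usize = 512;

/// Hypothetical helper: pairwise (cascade) summation of a slice.
fn pairwise_sum_sketch(xs: &[f64]) -> f64 {
    if xs.len() <= BASE_CASE_LEN {
        // Base case: plain left-to-right accumulation over a short block.
        xs.iter().fold(0.0, |acc, &x| acc + x)
    } else {
        // Recursive case: split in half and add the two partial sums,
        // so rounding error grows with the recursion depth rather than
        // with the number of elements.
        let (left, right) = xs.split_at(xs.len() / 2);
        pairwise_sum_sketch(left) + pairwise_sum_sketch(right)
    }
}
```

Calling `pairwise_sum_sketch(&values)` on a long slice and comparing it with `values.iter().sum::<f64>()` is one way to see the accuracy difference that the float benchmarks in this PR are concerned with.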
4 changes: 3 additions & 1 deletion Cargo.toml
@@ -45,8 +45,10 @@ serde = { version = "1.0", optional = true }

 [dev-dependencies]
 defmac = "0.2"
-quickcheck = { version = "0.7.2", default-features = false }
+quickcheck = { version = "0.8.1", default-features = false }
+quickcheck_macros = "0.8"
 rawpointer = "0.1"
+rand = "0.5.5"

 [features]
 # Enable blas usage
200 changes: 198 additions & 2 deletions benches/numeric.rs
@@ -1,8 +1,7 @@
-
 #![feature(test)]

 extern crate test;
-use test::Bencher;
+use test::{black_box, Bencher};

 extern crate ndarray;
 use ndarray::prelude::*;
@@ -25,3 +24,200 @@ fn clip(bench: &mut Bencher)
         })
     });
 }
+
+
+#[bench]
+fn contiguous_sum_1e7(bench: &mut Bencher)
+{
+    let n = 1e7 as usize;
+    let a = Array::linspace(-1e6, 1e6, n);
+    bench.iter(|| {
+        a.sum()
+    });
+}
+
+#[bench]
+fn contiguous_sum_int_1e7(bench: &mut Bencher)
+{
+    let n = 1e7 as usize;
+    let a = Array::from_vec((0..n).collect());
+    bench.iter(|| {
+        a.sum()
+    });
+}
+
+#[bench]
+fn contiguous_sum_1e4(bench: &mut Bencher)
+{
+    let n = 1e4 as usize;
+    let a = Array::linspace(-1e6, 1e6, n);
+    bench.iter(|| {
+        a.sum()
+    });
+}
+
+#[bench]
+fn contiguous_sum_int_1e4(bench: &mut Bencher)
+{
+    let n = 1e4 as usize;
+    let a = Array::from_vec((0..n).collect());
+    bench.iter(|| {
+        a.sum()
+    });
+}
+
+#[bench]
+fn contiguous_sum_1e2(bench: &mut Bencher)
+{
+    let n = 1e2 as usize;
+    let a = Array::linspace(-1e6, 1e6, n);
+    bench.iter(|| {
+        a.sum()
+    });
+}
+
+#[bench]
+fn contiguous_sum_int_1e2(bench: &mut Bencher)
+{
+    let n = 1e2 as usize;
+    let a = Array::from_vec((0..n).collect());
+    bench.iter(|| {
+        a.sum()
+    });
+}
+
+#[bench]
+fn contiguous_sum_ix3_1e2(bench: &mut Bencher)
+{
+    let n = 1e2 as usize;
+    let a = Array::linspace(-1e6, 1e6, n * n * n)
+        .into_shape([n, n, n])
+        .unwrap();
+    bench.iter(|| black_box(&a).sum());
+}
+
+#[bench]
+fn contiguous_sum_int_ix3_1e2(bench: &mut Bencher)
+{
+    let n = 1e2 as usize;
+    let a = Array::from_vec((0..n.pow(3)).collect())
+        .into_shape([n, n, n])
+        .unwrap();
+    bench.iter(|| black_box(&a).sum());
+}
+
+#[bench]
+fn inner_discontiguous_sum_ix3_1e2(bench: &mut Bencher)
+{
+    let n = 1e2 as usize;
+    let a = Array::linspace(-1e6, 1e6, n * n * 2*n)
+        .into_shape([n, n, 2*n])
+        .unwrap();
+    let v = a.slice(s![.., .., ..;2]);
+    bench.iter(|| black_box(&v).sum());
+}
+
+#[bench]
+fn inner_discontiguous_sum_int_ix3_1e2(bench: &mut Bencher)
+{
+    let n = 1e2 as usize;
+    let a = Array::from_vec((0..(n.pow(3) * 2)).collect())
+        .into_shape([n, n, 2*n])
+        .unwrap();
+    let v = a.slice(s![.., .., ..;2]);
+    bench.iter(|| black_box(&v).sum());
+}
+
+#[bench]
+fn middle_discontiguous_sum_ix3_1e2(bench: &mut Bencher)
+{
+    let n = 1e2 as usize;
+    let a = Array::linspace(-1e6, 1e6, n * 2*n * n)
+        .into_shape([n, 2*n, n])
+        .unwrap();
+    let v = a.slice(s![.., ..;2, ..]);
+    bench.iter(|| black_box(&v).sum());
+}
+
+#[bench]
+fn middle_discontiguous_sum_int_ix3_1e2(bench: &mut Bencher)
+{
+    let n = 1e2 as usize;
+    let a = Array::from_vec((0..(n.pow(3) * 2)).collect())
+        .into_shape([n, 2*n, n])
+        .unwrap();
+    let v = a.slice(s![.., ..;2, ..]);
+    bench.iter(|| black_box(&v).sum());
+}
+
+#[bench]
+fn sum_by_row_1e4(bench: &mut Bencher)
+{
+    let n = 1e4 as usize;
+    let a = Array::linspace(-1e6, 1e6, n * n)
+        .into_shape([n, n])
+        .unwrap();
+    bench.iter(|| {
+        a.sum_axis(Axis(0))
+    });
+}
+
+#[bench]
+fn sum_by_row_int_1e4(bench: &mut Bencher)
+{
+    let n = 1e4 as usize;
+    let a = Array::from_vec((0..n.pow(2)).collect())
+        .into_shape([n, n])
+        .unwrap();
+    bench.iter(|| {
+        a.sum_axis(Axis(0))
+    });
+}
+
+#[bench]
+fn sum_by_col_1e4(bench: &mut Bencher)
+{
+    let n = 1e4 as usize;
+    let a = Array::linspace(-1e6, 1e6, n * n)
+        .into_shape([n, n])
+        .unwrap();
+    bench.iter(|| {
+        a.sum_axis(Axis(1))
+    });
+}
+
+#[bench]
+fn sum_by_col_int_1e4(bench: &mut Bencher)
+{
+    let n = 1e4 as usize;
+    let a = Array::from_vec((0..n.pow(2)).collect())
+        .into_shape([n, n])
+        .unwrap();
+    bench.iter(|| {
+        a.sum_axis(Axis(1))
+    });
+}
+
+#[bench]
+fn sum_by_middle_1e2(bench: &mut Bencher)
+{
+    let n = 1e2 as usize;
+    let a = Array::linspace(-1e6, 1e6, n * n * n)
+        .into_shape([n, n, n])
+        .unwrap();
+    bench.iter(|| {
+        a.sum_axis(Axis(1))
+    });
+}
+
+#[bench]
+fn sum_by_middle_int_1e2(bench: &mut Bencher)
+{
+    let n = 1e2 as usize;
+    let a = Array::from_vec((0..n.pow(3)).collect())
+        .into_shape([n, n, n])
+        .unwrap();
+    bench.iter(|| {
+        a.sum_axis(Axis(1))
+    });
+}
29 changes: 23 additions & 6 deletions src/dimension/dimension_trait.rs
@@ -291,8 +291,8 @@ pub trait Dimension : Clone + Eq + Debug + Send + Sync + Default +
         indices
     }

-    /// Compute the minimum stride axis (absolute value), under the constraint
-    /// that the length of the axis is > 1;
+    /// Compute the minimum stride axis (absolute value), preferring axes with
+    /// length > 1.
     #[doc(hidden)]
     fn min_stride_axis(&self, strides: &Self) -> Axis {
         let n = match self.ndim() {
@@ -301,7 +301,7 @@ pub trait Dimension : Clone + Eq + Debug + Send + Sync + Default +
             n => n,
         };
         axes_of(self, strides)
-            .rev()
+            .filter(|ax| ax.len() > 1)
             .min_by_key(|ax| ax.stride().abs())
             .map_or(Axis(n - 1), |ax| ax.axis())
     }
@@ -588,9 +588,9 @@ impl Dimension for Dim<[Ix; 2]> {

     #[inline]
     fn min_stride_axis(&self, strides: &Self) -> Axis {
-        let s = get!(strides, 0) as Ixs;
-        let t = get!(strides, 1) as Ixs;
-        if s.abs() < t.abs() {
+        let s = (get!(strides, 0) as isize).abs();
+        let t = (get!(strides, 1) as isize).abs();
+        if s < t && get!(self, 0) > 1 {
             Axis(0)
         } else {
             Axis(1)
@@ -697,6 +697,23 @@ impl Dimension for Dim<[Ix; 3]> {
         Some(Ix3(i, j, k))
     }

+    #[inline]
+    fn min_stride_axis(&self, strides: &Self) -> Axis {
+        let s = (get!(strides, 0) as isize).abs();
+        let t = (get!(strides, 1) as isize).abs();
+        let u = (get!(strides, 2) as isize).abs();
+        let (argmin, min) = if t < u && get!(self, 1) > 1 {
+            (Axis(1), t)
+        } else {
+            (Axis(2), u)
+        };
+        if s < min && get!(self, 0) > 1 {
+            Axis(0)
+        } else {
+            argmin
+        }
+    }
+
     /// Self is an index, return the stride offset
     #[inline]
     fn stride_offset(index: &Self, strides: &Self) -> isize {
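
The doc comment on `min_stride_axis` describes picking the axis with the smallest absolute stride while preferring axes of length > 1. A minimal standalone sketch of that rule, under stated assumptions, is given below. It works on plain slices instead of ndarray's `Dimension` types, the name `min_stride_axis_sketch` is hypothetical, and breaking ties toward the last axis is this sketch's choice (it matches what the `Ix2` and `Ix3` specializations above do), not something the generic implementation is confirmed to guarantee.

```rust
/// Hypothetical standalone version of the "minimum stride axis" rule.
/// `shape[i]` is the length of axis `i`, `strides[i]` its stride.
/// Returns the index of the chosen axis.
fn min_stride_axis_sketch(shape: &[usize], strides: &[isize]) -> usize {
    assert_eq!(shape.len(), strides.len());
    assert!(!shape.is_empty(), "need at least one axis");
    (0..shape.len())
        // Visit axes last-to-first so that, on equal strides, the last
        // axis wins (`min_by_key` keeps the first minimum it sees).
        .rev()
        // Only axes with more than one element are candidates.
        .filter(|&ax| shape[ax] > 1)
        .min_by_key(|&ax| strides[ax].abs())
        // No axis has length > 1: fall back to the last axis.
        .unwrap_or(shape.len() - 1)
}
```

For a row-major 3x4 layout (`shape = [3, 4]`, `strides = [4, 1]`) the sketch selects axis 1; for the column-major layout (`strides = [1, 3]`) it selects axis 0.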