Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue/655: Standard deviation and variance #753

Closed
wants to merge 3 commits into from

Conversation

allmywatts
Copy link

@allmywatts allmywatts commented Nov 12, 2019

This PR implements std and var methods (see: #655) using the same method as used for var_axis, std_axis.
The variance is computed by flattening the array of N dimensions into a 1D array of the length .len() of the original array.

The return type is scalar A instead of Array<A, D:Smaller>

An attempt was made to refactor var_axis to use var, but this resulted in a performance regression so original implementation is kept and we accept some code duplication. 馃槢

pub fn var_axis(&self, axis: Axis, ddof: A) -> Array<A, D::Smaller>
where
    A: Float + FromPrimitive,
    D: RemoveAxis,
{
    let mut sum_sq = Array::<A, D::Smaller>::zeros(self.dim.remove_axis(axis));

    for (lane, sum) in self.lanes(axis).into_iter().zip(sum_sq.iter_mut()) {
        *sum = lane.var(ddof);
    }
    sum_sq
}

Thanks @LukeMathWalker for the mentoring. 馃榾

This PR implements std and var methods using the same
method as used for var_axis, std_axis.
The variance is computed by flattened the array into
a 1D array of the .len() of original array.

The return type is scalar A instead of Array<A, D:Smaller>

An attempt was made to refactor var_axis to use var, but this
resulted in a performance regression so original implementations
are kept and we accept some code duplication.

```
    pub fn var_axis(&self, axis: Axis, ddof: A) -> Array<A, D::Smaller>
    where
        A: Float + FromPrimitive,
        D: RemoveAxis,
    {
        let mut sum_sq = Array::<A, D::Smaller>::zeros(self.dim.remove_axis(axis));

        for (lane, sum) in self.lanes(axis).into_iter().zip(sum_sq.iter_mut()) {
            *sum = lane.var(ddof);
        }
        sum_sq
    }
```
@allmywatts allmywatts marked this pull request as ready for review November 12, 2019 14:51
@LukeMathWalker
Copy link
Member

LukeMathWalker commented Nov 17, 2019

The main issue we had here was that Zip does not allow to iterate over two n-dimensional producers that yield items with different dimensionality (0 for the output array and 1 for the lanes that we want to reduce).

@bluss
Copy link
Member

bluss commented Nov 20, 2019

I don't quite see the issue with using Zip here

@LukeMathWalker
Copy link
Member

Sorry, it took me a while to get back to this - @bluss is indeed right, I was able to put it together using Zip. If you still have the attempts we made together @allmywatts at RustFest I'd be curious to understand what I was doing wrong there 馃

I opened a PR against your branch with the Zip version @allmywatts - allmywatts#1
We might want to have a var_unchecked, as we did at the beginning, that we use from both var and var_axis to avoid redundant input checks in var_axis.

@hameerabbasi
Copy link
Contributor

@allmywatts If you don't mind can I take over this PR and open a new one?

@allmywatts
Copy link
Author

@hameerabbasi hey, sorry for the late reply, yes please go ahead.

@bluss
Copy link
Member

bluss commented Apr 18, 2020

This looks ready to merge, right?

let dof = n - ddof;
let mut mean = A::from_usize(0).expect("Converting 0 to `A` must not fail.");
let mut sum_sq = A::from_usize(0).expect("Converting 0 to `A` must not fail.");
for (count, x) in self.iter().enumerate() {
Copy link
Member

@bluss bluss Apr 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to note, iter is not an efficient way to iterate a "flattened" array, so this should potentially use unary Zip instead, for example, or ArrayBase::fold. In the high level, this is documented in the module documentation, about array traversals.

can of course be fixed as a later change.

@bluss
Copy link
Member

bluss commented Apr 20, 2020

Super minor thing, but as long as there is a division and a checked type conversion inside the actual loop, I don't think var_unchecked with the purpose of saving input checks saves that much.

/// [3., 4.],
/// [5., 6.]]);
///
/// let a_flat = a.view().into_shape(6).expect("This must not fail.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"This must not fail" does not add information, just use .unwrap(). I'm just afraid that the user will conclude some kinds of into_shape() (to 1D) always succeed, and they don't.

@bluss
Copy link
Member

bluss commented Jan 11, 2021

Superseded by #790

@bluss bluss closed this Jan 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants