Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Multivariate normal distribution #583

Open
wants to merge 12 commits into
base: master
Choose a base branch
from
Open

Implement Multivariate normal distribution #583

wants to merge 12 commits into from

Conversation

ManifoldFR
Copy link

@ManifoldFR ManifoldFR commented Jan 14, 2019

Resolves #582

@bluss
Copy link
Member

bluss commented Jan 14, 2019

I'm not sure why this is in ndarray-rand's scope, it's a bit unfortunate to depend on a specific linalg crate. I'd rather decline features (put it into a separate crate?)

@ManifoldFR
Copy link
Author

I made the dependency on linalg can be made optional, in which case you only get a standard normal distribution. Are there any other linalg crates than ndarray-linalg ? From what I see in its issue board the author was thinking about including it to this repo

@jturner314
Copy link
Member

I'm not sure why this is in ndarray-rand's scope

What is the intended scope for ndarray-rand?

put it into a separate crate?

IMO, it makes the most sense to be placed in ndarray-rand behind a feature flag, because users will most likely look in a crate named ndarray-rand for this functionality.

However, it could be placed in ndarray-linalg, since ndarray-linalg already depends on rand and has some random generation functions. It doesn't really seem to fit, though.

Another option would be ndarray-stats if we also add functionality to fit the parameters to data.

Are there any other multivariate random distributions that would justify the creation of a new crate?

Are there any other linalg crates than ndarray-linalg ?

There's linxal, but it doesn't appear to be maintained any longer and is on a very old version of ndarray.

From what I see in its issue board the author was thinking about including it to this repo

Are you referring to rust-ndarray/ndarray-linalg#126? That's talking about moving the ndarray-linalg repository from termoshtt/ndarray-linalg to rust-ndarray/ndarray-linalg. It would still be a separate crate in a repository separate from rust-ndarray/ndarray.

@jturner314
Copy link
Member

@termoshtt What are your thoughts? Would you like to add this functionality to ndarray-linalg?

@ManifoldFR
Copy link
Author

ManifoldFR commented Jan 14, 2019

Another option would be ndarray-stats if we also add functionality to fit the parameters to data.

Are there any other multivariate random distributions that would justify the creation of a new crate?

Not really. From the experience I have, the multivariate normal distribution is really the only standard multivariate distribution that has some interdependence between its variables. Modeling dependence between random variables of other distributions can be done using other tools like copulas, which can still rely on the normal distribution.

Something like ndarray-stats could piggyback on the API from https://github.com/boxtown/statrs, I think, but that crate doesn't do parameter inference, which is the kind of problem that would involve a lot more work (and would be getting into ML territory).

Are there any other linalg crates than ndarray-linalg ?

There's linxal, but it doesn't appear to be maintained any longer and is on a very old version of ndarray.

From what I see in its issue board the author was thinking about including it to this repo

Are you referring to termoshtt/ndarray-linalg#126? That's talking about moving the ndarray-linalg repository from termoshtt/ndarray-linalg to rust-ndarray/ndarray-linalg. It would still be a separate crate in a repository separate from rust-ndarray/ndarray.

Yes, I was referring to that. Since there is no actively-maintained alternative, @termoshtt 's crate seemed like the natural option, to put behind a feature gate.

@LukeMathWalker
Copy link
Member

I was thinking about generating values according to a specified probability distribution this morning and I am curious as well to understand what is the scope of ndarray-rand, but that seems to be the most fitting "home" for this kind of usecase.

Functionalities for parameter estimation/ML would probably require a separate crate, but the Random variable bits in SciPy do seem fitting for ndarray-rand.

@ManifoldFR
Copy link
Author

So, would we be going ahead with this ? I still have a bit of work to do on this though, like adding tests when compiling with the optional feature.

@ManifoldFR
Copy link
Author

For some reason, I can't get the added normal distribution (with covariance matrix) to compile, I keep getting

error[E0599]: no method named `cholesky` found for type `ndarray::ArrayBase<ndarray::OwnedRepr<f64>, ndarray::Dim<[usize; 2]>>` in the current scope
  --> ndarray-rand/src/normal/advanced.rs:22:32
   |
22 |         let lower = covariance.cholesky(UPLO::Lower);
   |    

Can anyone test this on their end ?

@jturner314
Copy link
Member

jturner314 commented Jan 25, 2019

The issue is that extern crate ndarray_linalg; is behind the "normaldist" feature flag, but the advanced module is not (so as far as advanced is concerned, ndarray_linalg is missing). To fix this, you can add an attribute in normal.rs so that advanced is compiled only when the "normaldist" feature is enabled, like this:

#[cfg(feature = "normaldist")]
pub mod advanced;

Note that "normaldist" should also be added as a feature in Cargo.toml, and ndarray-linalg should be made optional.

Edit: Whoops, it looks like that wasn't the root cause of the problem. AFAICT, the root cause is a version mismatch. (ndarray-linalg provides traits for ndarray ^0.12 types, while the ndarray dependency in ndarray-rand's Cargo.toml is ndarray = { version = "0.12.1", path = ".." }. (Note the path = "..".)) I think at this point we can remove the path = "..".

@ManifoldFR
Copy link
Author

I thought using path = "..." was usually necessary when building crates in the same workspace though. I'll try that then

@ManifoldFR ManifoldFR changed the title WIP: Implement Multivariate normal distribution *Implement Multivariate normal distribution Jan 25, 2019
@ManifoldFR ManifoldFR changed the title *Implement Multivariate normal distribution Implement Multivariate normal distribution Jan 25, 2019
@termoshtt
Copy link
Member

What are your thoughts? Would you like to add this functionality to ndarray-linalg?

Yes, this feature is suitable to ndarray-linalg crate.

I have moved ndarray-linalg to rust-math organization to manage with other math crates, but I still think it should be in rust-ndarray org to gather maintenance power.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Multivariate normal distribution in ndarray-rand
5 participants