Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add distribution graphs to documentation #1434

Draft
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

MichaelOwenDyer
Copy link
Member

@MichaelOwenDyer MichaelOwenDyer commented Apr 9, 2024

  • Added a CHANGELOG.md entry

Summary

This PR adds some graphs to the documentation of different distributions in rand_distr to illustrate their behavior.

Motivation

Make the distributions in this crate easier to understand via visual aids.

Details

The diagrams will be hosted in https://github.com/rust-random/charts (see this PR) and links will be inserted into the documentation.

While this PR is still in draft, not all links have been added yet.

Closes #131.

Links Progress:

  • binomial::Binomial
  • cauchy::Cauchy
  • dirichlet::Dirichlet
  • exponential::Exp
  • exponential::Exp1
  • frechet::Frechet
  • gamma::Gamma
  • gamma::ChiSquared
  • gamma::FisherF
  • gamma::StudentT
  • gamma::Beta
  • geometric::Geometric
  • geometric::StandardGeometric
  • gumbel::Gumbel
  • hypergeometric::Hypergeometric
  • inverse_gaussian::InverseGaussian
  • normal::Normal
  • normal::StandardNormal
  • normal::LogNormal
  • normal_inverse_gaussian::NormalInverseGaussian
  • pareto::Pareto
  • pert::Pert
  • poisson::Poisson
  • skew_normal::SkewNormal
  • triangular::Triangular
  • unit_ball::UnitBall
  • unit_circle::UnitCircle
  • unit_disc::UnitDisc
  • unit_sphere::UnitSphere
  • weibull::Weibull
  • zipf::Zeta
  • zipf::Zipf

@MichaelOwenDyer
Copy link
Member Author

Need to look into it more, but maybe whatever solution we decide on here could also be used for #278

@dhardy
Copy link
Member

dhardy commented Apr 10, 2024

However, this also adds a transitive dependency to syn, which is not great.

Not an issue: you already feature-gated it.

The diagrams are generated with Python code, which is included in the PR for reference.

Good to have in the same repo, but it doesn't need to be in the generated crate package. Could you try investigating why this happens? We already have an explicit include rule in the Cargo.toml, which seems like it should be enough.

We should also talk about the best place to put the diagrams (currently there is one assets folder for rand and another for rand_distr, but this might not work).

We don't package any of these files except by embedding, so we don't need an assets dir and we don't need these separated by crate, but I prefer that the generated resources are next to the generating script. Having the assets in crate-specific dirs but not the script is especially weird. Maybe just move diagrams/src to assets/src, also splitting by crate?

Or, simpler: forget Bernoulli (doesn't really need it) and put all the rest under rand_distr/plots. (Also solves my nit below.)

Plots

Bernoulli

X-axis values should be false, true; not 0, 1.

Binomial

Can we add a second parameterisation, e.g. n=6, p=0.5?

Cauchy

👍 (straight copy from WP, I see)

@newpavlov
Copy link
Member

I think SVG would be a better fit (separate files, not embedded into HTML). You should be able to generate the same plots with Python without issues.

@dhardy
Copy link
Member

dhardy commented Apr 10, 2024

@newpavlov you missed the discussions in #131 (specifically, how to make this work on docs.rs).

SVG appears larger in some cases, smaller in others. It does scale better.

@newpavlov
Copy link
Member

newpavlov commented Apr 10, 2024

Ah, I misread the discussion and thought it was about embedding SVG into HTML. I don't think the terrible diff objection is important, plus we probably should use compression either way (i.e. use SVGZ instead of plain SVG). Though there may be some issues with browser support of SVGZ files.

In RustCrypto we use links to images which reside in the RustCrypto/media repository, e.g. see docs for the ctr crate. It means that we rely on raw.githubusercontent.com instead of packaging the image together with crate, but since the images are purely for illustrative purposes, I think it's a reasonable risk to take.

@MichaelOwenDyer
Copy link
Member Author

Good to have in the same repo, but it doesn't need to be in the generated crate package. Could you try investigating why this happens? We already have an explicit include rule in the Cargo.toml, which seems like it should be enough.

I think it may be because the folder was named src. I've moved it and renamed it to py, could you please check again?

forget Bernoulli (doesn't really need it) and put all the rest under rand_distr/plots

Got it. That also removes the need for the added dependency on rand, now it only remains in rand_distr.

Binomial
Can we add a second parameterisation, e.g. n=6, p=0.5?

I went ahead and added n=10, p=0.6 because I think it looks better next to n=10, p=0.2. If you think having a demonstration of a different n is important, then I could experiment some more. Maybe something like n=20, p=0.5 would look better.

I think SVG would be a better fit (separate files, not embedded into HTML). You should be able to generate the same plots with Python without issues.

I prefer SVG as well. However, I just did the switch and the diff count exploded to +11,000. Wowza.

@newpavlov
Copy link
Member

newpavlov commented Apr 10, 2024

The Python script folder should probably be explicitly excluded from packaged crate (check if it gets included by running cargo package).

It also may be worth to consider the separate repo approach described above. It will be a simpler solution than relying on embed-doc-image and will result in a smaller package size. I will leave the final decision to @dhardy.

@MichaelOwenDyer
Copy link
Member Author

The Python script folder should probably be explicitly excluded from packaged crate (check if it gets included by running cargo package).

Ah, thanks for the tip! According to this include and exclude should not be used at the same time. But I found the problem, had to change src/ to /src :)

It also may be worth to consider the separate repo approach described above. It will be a simpler solution than relying on embed-doc-image and will result in a smaller package size. I will leave the final decision to @dhardy.

Certainly. I'm not attached to embedding them if we find a better solution.

@dhardy
Copy link
Member

dhardy commented Apr 10, 2024

I prefer SVG as well. However, I just did the switch and the diff count exploded to +11,000. Wowza.

Diffs normally ignore binary files; this is just because SVG is text based? If so, switching to SVGZ may avoid this?

had to change src/ to /src

Aha!

It also may be worth to consider the separate repo approach described above.

The only real issue is versioning. I see usage of commit hashes in the URLs that ctr uses; this or tags works.

There is also the minor issue that repo clones don't include all docs, but we already have that with the book.

The other issue is that anyone trying to make themselves a copy of the docs for offline usage likely won't have the plots, but that's not critical.

Considering these plots aren't directly tied to the Rust code in any way, a separate repo probably is the way to go.

@MichaelOwenDyer
Copy link
Member Author

Diffs normally ignore binary files; this is just because SVG is text based? If so, switching to SVGZ may avoid this?

Yes, SVGZ is better in this regard. However embed-doc-image doesn't support it.

a separate repo probably is the way to go.

Okay then! Want to create it?

@dhardy
Copy link
Member

dhardy commented Apr 10, 2024

Okay then! Want to create it?

https://github.com/rust-random/charts

@MichaelOwenDyer
Copy link
Member Author

Can you authorize me to push in that repo, or should I make a fork?

@dhardy
Copy link
Member

dhardy commented Apr 10, 2024

@MichaelOwenDyer you should already have admin privileges on that repo.

@MichaelOwenDyer
Copy link
Member Author

MichaelOwenDyer commented Apr 10, 2024

Hm, strange, I am getting a 403 whenever I try to push.

Edit: Nevermind, I logged in using a GitHub token and then it worked.

@vks
Copy link
Collaborator

vks commented Apr 26, 2024

Diffs normally ignore binary files; this is just because SVG is text based?

We can tell Git via attributes to treat the files as binary if necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Generate or include graphs for distributions in docs
4 participants