Generate or include graphs for distributions in docs #131

Open
Tracked by #1432
abonander opened this issue Dec 15, 2016 · 12 comments · May be fixed by #1434
Labels
C-docs Documentation E-help-wanted Participation: help wanted

Comments

@abonander

Those of us who aren't as well versed in probability theory (and that includes me) may not be able to immediately intuit the main properties of the distributions from their density functions.

I'm talking about something like this graph that Wikipedia has for the normal/gamma distribution article, either linked or inlined in the docs for the various relevant types in the distributions module.

These graphs are probably available online under compatible licenses already, possibly in the whitepapers linked under the various types in this module. I haven't looked yet, but I doubt we can extract the graphs from those whitepapers anyway, as they are probably unlicensed or their copyright terms are not compatible with MIT/Apache-2.0.

@jacwah
Contributor

jacwah commented Jan 9, 2017

I think this is a great idea and am currently looking into possible solutions.

@jacwah
Contributor

jacwah commented Jan 12, 2017

First, I tried searching for graphs available online. I quickly ran into a few problems.

  1. Most graphs available online don't have a compatible license (Creative Commons etc).
  2. I couldn't find a complete set of probability density graphs in the same style.
  3. They aren't tailored to this project's needs.

I think the graphs should be

  • Reproducible
  • Editable
  • Extendable
  • Coherent

I therefore decided to try writing a script myself in Python. Matplotlib has great graphing support, and SciPy provides a wide range of probability density functions out of the box.

[Image: Chi-squared probability density]

This was generated from the following code:

import numpy as np
from scipy.stats import chi2
from graphbutler import recipe, save_all, Graph, Parameterized

@recipe
def chi_squared_pdf():
    g = Graph()
    g.title = "Chi-squared probability density"
    g.x = np.arange(0.0, 9.0, 0.01)

    def y(k):
        y = chi2.pdf(g.x, k)
        # Value threshold because k=1 is unbounded
        y[y > 0.5] = np.nan
        return y

    g.y = Parameterized("k", y, (1, 2, 3, 5, 9))
    return g

if __name__ == "__main__":
    save_all(format="png")

Graphbutler is a simple wrapper around matplotlib I wrote to reduce boilerplate code.
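
For readers without SciPy at hand, the density being plotted can be sketched with the standard library alone. This is only an illustration and not part of the script above: `chi2_pdf` and `clipped_curve` are hypothetical helper names, and `None` stands in for the `np.nan` gaps produced by the value threshold.

```python
import math


def chi2_pdf(x, k):
    """Chi-squared density with k degrees of freedom, evaluated at x."""
    if x < 0:
        return 0.0
    if x == 0:
        # The density diverges at 0 for k < 2, equals 1/2 for k = 2,
        # and is 0 for k > 2.
        if k < 2:
            return math.inf
        return 0.5 if k == 2 else 0.0
    half_k = k / 2
    return x ** (half_k - 1) * math.exp(-x / 2) / (2 ** half_k * math.gamma(half_k))


def clipped_curve(xs, k, threshold=0.5):
    """Return the density at each x, using None where it exceeds the threshold.

    This mirrors the masking in the recipe above, which hides the unbounded
    part of the k=1 curve so it does not dominate the plot.
    """
    values = (chi2_pdf(x, k) for x in xs)
    return [y if y <= threshold else None for y in values]


# Same grid and parameters as the recipe: x in [0, 9) with step 0.01.
xs = [i * 0.01 for i in range(900)]
curves = {k: clipped_curve(xs, k) for k in (1, 2, 3, 5, 9)}
```

Each entry of `curves` could then be handed to any plotting backend; the gaps keep the k=1 curve from stretching the y axis.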

I could write a recipe for each of the distributions in this crate. It should then be easy to tweak and extend with potential new distributions.

It's currently hard to include local images in rustdoc (issue). Therefore, I suggest that the rendered graphs be hosted online instead of in the crate itself.

What do you think?

@jacwah
Contributor

jacwah commented Jan 12, 2017

Then there are cumulative distribution functions. Are those interesting as well? Personally, I find them harder to grasp.
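
One way to read a cumulative distribution function is as the running area under the density curve. A minimal sketch for the standard normal distribution, using only the standard library (the function names here are made up for illustration and do not come from this crate or from SciPy):

```python
import math


def normal_pdf(x, mean=0.0, std_dev=1.0):
    """Density of the normal distribution at x."""
    z = (x - mean) / std_dev
    return math.exp(-0.5 * z * z) / (std_dev * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mean=0.0, std_dev=1.0):
    """Closed-form CDF of the normal distribution via the error function."""
    z = (x - mean) / (std_dev * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def integrate_pdf(lower, upper, steps=10_000):
    """Trapezoidal integral of the standard normal density on [lower, upper]."""
    width = (upper - lower) / steps
    total = 0.5 * (normal_pdf(lower) + normal_pdf(upper))
    for i in range(1, steps):
        total += normal_pdf(lower + i * width)
    return total * width


# The area under the density up to x approximates the CDF at x
# (the tail below -8 is negligible).
area = integrate_pdf(-8.0, 1.0)
exact = normal_cdf(1.0)
```

Plotting `normal_cdf` next to `normal_pdf` would show the CDF rising fastest where the density peaks, which may make the pair easier to grasp together than separately.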

@erickt
Contributor

erickt commented Jan 18, 2017

@steveklabnik: this sounds like a docs-related thing. Do you have a good approach to getting these kinds of generated graphs in our docs?

@alexcrichton alexcrichton added the C-docs Documentation label Jun 14, 2017
@dhardy dhardy added the E-help-wanted Participation: help wanted label Apr 5, 2024
@MichaelOwenDyer MichaelOwenDyer self-assigned this Apr 5, 2024
@MichaelOwenDyer
Member

MichaelOwenDyer commented Apr 5, 2024

@dhardy Do you have any opinions regarding the best place to put these graphs? From reading the conversation above, these options are apparent to me:

  1. Store them in a folder in the git repo (which would significantly increase clone size)
  2. Host them somewhere else and just put web links in the documentation. This wouldn't affect clone size, but it would couple the docs to an external host that would need to be maintained. An internet connection would also be required to view them.
  3. Embed them directly into the docs (either Base64 encoded or in SVG format, which I just learned is valid HTML 🤠). This might be a good solution depending on just how much bloat it would add to the docs (even if it rendered nicely, hundreds of lines of gobbledygook in nearly every file would be sad indeed).
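
To get a feel for the bloat that option 3 implies, here is a rough sketch of Base64 embedding using Python's standard `base64` module. The helper names `to_data_uri` and `base64_length` are hypothetical, and the 30,000-byte figure is an arbitrary example size, not a measurement of any real plot.

```python
import base64


def to_data_uri(image_bytes, mime_type="image/png"):
    """Encode raw image bytes as a data URI usable in an <img src="..."> tag."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def base64_length(byte_count):
    """Number of Base64 characters needed for byte_count input bytes.

    Base64 emits 4 characters per 3 input bytes, padding the final group.
    """
    return 4 * ((byte_count + 2) // 3)


# An image of 30,000 bytes grows to 40,000 characters of text,
# i.e. roughly a one-third increase over the binary size.
example_size = base64_length(30_000)
```

So every embedded raster image costs about a third more than its file size, all of it as unreadable text inside the docs.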

Do you know of any best practices here, or of any other crates that have done something like this which we could take inspiration from?

Also, any opinions about storing the (presumably Python) code which generates these diagrams, perhaps for future use or reproducibility?

@dhardy
Member

dhardy commented Apr 5, 2024

In my opinion, documentation should be stored within the repository, and these are documentation. The exception is things like tutorials and books which are more prose.

So, I can see a few possible options:

  1. We just link to external documentation such as Wikipedia. We already do this in some cases. I've nothing against the links, but I guess this issue is about having our own resource.
  2. We expand the book with more information on our distributions (like GSL or Python), also including graphs. This lets us group related distributions and include more prose than in-line docs while giving us a uniform look at distributions.
  3. We add plots to API documentation. More scrolling will be needed to see the API, but as long as it's short I guess that's fine.

If we go for API docs, we should add a sub-dir in the repo like rand_distr/res/plots. Either way, I personally think scalable (SVG) graphics are preferable for this type of thing; example.
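
As a rough idea of how small a hand-written SVG plot can be, the sketch below turns a list of (x, y) samples into a single `<polyline>`. It is not how any existing plot in this project was produced; `svg_polyline` and its default sizes are assumptions made for illustration.

```python
def svg_polyline(points, width=400, height=240, margin=10):
    """Render (x, y) samples as a minimal SVG document containing one polyline."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, y_min = min(xs), min(ys)
    # Fall back to 1.0 so constant data does not divide by zero.
    x_span = (max(xs) - x_min) or 1.0
    y_span = (max(ys) - y_min) or 1.0

    def to_pixels(x, y):
        px = margin + (x - x_min) / x_span * (width - 2 * margin)
        # SVG's y axis points downwards, so flip the vertical coordinate.
        py = height - margin - (y - y_min) / y_span * (height - 2 * margin)
        return f"{px:.1f},{py:.1f}"

    coords = " ".join(to_pixels(x, y) for x, y in points)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {width} {height}">'
        f'<polyline fill="none" stroke="black" points="{coords}"/>'
        "</svg>"
    )


triangle = svg_polyline([(0, 0), (1, 1), (2, 0)])
```

Because the output is plain text with one coordinate pair per sample, it scales without blurring, and reducing the number of samples directly reduces file size.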

@dhardy
Member

dhardy commented Apr 5, 2024

> Embed them directly into the docs (either Base64 encoded or in SVG format, which I just learned is valid HTML 🤠). This might be a good solution depending on just how much bloat it would add to the docs (even if it rendered nicely, hundreds of lines of gobbledygook in nearly every file would be sad indeed).

Also horrible for diffs any time a plot is edited. No thanks!

@dhardy
Member

dhardy commented Apr 5, 2024

> Also, any opinions about storing the (presumably Python) code which generates these diagrams, perhaps for future use or reproducibility?

If it's a tiny amount of code, then in the same repository (we probably also want to store the output rather than require the build job to regenerate it, although minimalism says otherwise).

If it's a lot of code, we can use a new repository under https://github.com/rust-random/

@MichaelOwenDyer
Member

I'm working on a PR right now. I wrote some Python code with numpy, scipy, and matplotlib to generate diagrams for different distributions (roughly 20 lines of code per distribution so far), and now I'm trying to use the embed-doc-image crate to inject these images as Base64 into the documentation during compilation. This last step is proving somewhat tricky: it seems the macro doesn't expand to include a div like it should. I'm still investigating and might have to open a PR in that crate first.

@dhardy
Member

dhardy commented Apr 8, 2024

I would suggest simply linking the images instead of embedding. The catch is that building docs will then require copying these into target/doc in local builds and in .github/workflows/gh-pages.yml.
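
The copy step could be as small as the following sketch. It is written in Python only to match the plotting script; the function name `copy_plots` is made up, and the exact destination under target/doc is an assumption that would need to match whatever relative links the docs use.

```python
import shutil
from pathlib import Path


def copy_plots(source_dir, doc_dir, pattern="*.svg"):
    """Copy every plot matching pattern from source_dir into doc_dir.

    Returns the copied file names in sorted order. The destination is
    created if it does not exist yet.
    """
    source = Path(source_dir)
    destination = Path(doc_dir)
    destination.mkdir(parents=True, exist_ok=True)
    copied = []
    for plot in sorted(source.glob(pattern)):
        shutil.copy2(plot, destination / plot.name)
        copied.append(plot.name)
    return copied


# Hypothetical invocation after `cargo doc`; both paths are assumptions:
#     copy_plots("rand_distr/res/plots", "target/doc/plots")
```

The same call would have to run both locally and in the gh-pages workflow, which is the maintenance cost mentioned above.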

@MichaelOwenDyer
Member

MichaelOwenDyer commented Apr 8, 2024

So, do you mean something like this rust-lang/rust#32104 (comment)? I think it would be really nice if the diagrams would be viewable on docs.rs as well, and if I understand correctly then that would not be possible without embedding. But to be honest I'm open to just putting the images into a directory, referencing them in the docs, and calling it a day. This certainly isn't the most streamlined thing to do with rustdoc.

@dhardy
Member

dhardy commented Apr 9, 2024

You're right; it's not quite so simple for docs.rs. We could possibly get around this by copying image resources into $OUT_DIR/doc using build.rs and using relative links (assuming docs.rs packages everything in doc; I don't know).

The issue you linked discusses some other options. embed_doc_image may be a good choice? Sorry, you already mentioned this...

@MichaelOwenDyer MichaelOwenDyer linked a pull request Apr 9, 2024 that will close this issue
33 tasks