DOC: stats: resampling and Monte Carlo methods tutorial #16699

Merged
merged 12 commits into from Sep 8, 2022

mdhaber
Contributor

@mdhaber mdhaber commented Jul 25, 2022

Reference issue

What does this implement/fix?

This is a draft of the first part of a scipy.stats resampling and Monte Carlo methods tutorial. I had a little fun with the introduction; the rest is pretty formal. @mckib2 is slated to review.

Additional information

Can't build docs locally. Fingers crossed.

@mdhaber mdhaber added the scipy.stats and Documentation labels Jul 25, 2022
@mdhaber
Contributor Author

mdhaber commented Jul 25, 2022

@tirthasheshpatel how did you write the scipy.stats.sampling tutorials? I drafted these (and several others not yet pushed) in Jupyter notebooks, and in this PR, I'm experimenting with automatic conversion to reStructuredText. jupyter nbconvert does most of it, but there are a lot of things I'd have to adjust manually after the export. This might be OK if it only had to be done once, but these should be living documents, improved over time. It doesn't seem useful to manually convert these from a notebook, which a user could interact with, to a static web page, which a user can't even (easily) copy code from. Any suggestions for the workflow? Or what do you think of offering notebooks for download or linking to an external notebook (e.g. on Colab)?
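
For reference, the automatic export step mentioned above could be scripted roughly as follows. This is only a sketch, not the workflow used in this PR: the helper names and the example paths are hypothetical, and it assumes the jupyter nbconvert command-line tool is installed and on the PATH. Any manual adjustments to the generated reST would still have to be made afterwards.

```python
import subprocess
from pathlib import Path


def nbconvert_rst_command(notebook, output_dir):
    """Build the `jupyter nbconvert` command that exports one notebook to reST."""
    return [
        "jupyter", "nbconvert",
        "--to", "rst",
        "--output-dir", str(output_dir),
        str(notebook),
    ]


def export_notebooks(notebook_dir, output_dir):
    """Re-export every notebook in notebook_dir and return the commands run."""
    commands = []
    for notebook in sorted(Path(notebook_dir).glob("*.ipynb")):
        cmd = nbconvert_rst_command(notebook, output_dir)
        subprocess.run(cmd, check=True)  # needs Jupyter available on PATH
        commands.append(cmd)
    return commands


# Hypothetical paths, for illustration only; nothing is executed here.
example = nbconvert_rst_command(
    "notebooks/resampling.ipynb", "doc/source/tutorial/stats"
)
```

Scripting the export makes it repeatable, but it does not remove the need to hand-edit the output each time the notebook changes, which is the concern raised above.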

@tupui
Member

tupui commented Jul 28, 2022

Nice 👍 just a fly by comment for now (as you said you wanted Nicholas to review first). Don't forget to remove the global seeding and use a generator. Also I am thinking that we could reorganise the existing unuran, QMC and this. There are common things at least in the introduction (e.g. in QMC there is a description of MC vs QMC) and we should probably have QMC as a standalone page as you are doing here.
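
To illustrate the seeding point, here is a minimal sketch of replacing global seeding with an explicit generator. The function name, the quantity being estimated, and the seed value are made up for illustration and are not taken from the tutorial under review.

```python
import numpy as np


def monte_carlo_second_moment(n_samples, rng):
    """Estimate E[X**2] for X ~ N(0, 1) from n_samples draws (true value: 1)."""
    x = rng.standard_normal(n_samples)
    return float(np.mean(x**2))


# Discouraged: np.random.seed(...) mutates hidden global state shared by
# every caller of the legacy np.random functions.
# Preferred: create a Generator once and pass it explicitly.
rng = np.random.default_rng(12345)  # arbitrary seed, illustration only
estimate = monte_carlo_second_moment(100_000, rng)

# A fresh Generator with the same seed reproduces the same stream.
repeat = monte_carlo_second_moment(100_000, np.random.default_rng(12345))
```

Passing the generator as an argument keeps each example reproducible without affecting random state elsewhere in the session.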

@tirthasheshpatel
Member

tirthasheshpatel commented Jul 28, 2022

how did you write the scipy.stats.sampling tutorials?

I wrote them directly in ReST. You can also try using pandoc to convert Jupyter Notebooks to (github-flavored) markdown (or HTML) first and jupyter-nbconvert for markdown (or HTML) to ReST. Pandoc usually does a great job at converting jupyter notebooks one-to-one and conversion from markdown (or HTML) to ReST is easy.

In the case of Jupyter Notebooks though, I think it would be better to add all the notebooks in a new directory called notebooks and link them in tutorials or add a link in tutorials to open them directly in nbviewer. If we don't want notebooks in the main repo itself, we could even put them in a different repo under the SciPy org. Does that work? Or do we want the notebooks themselves as tutorials?

simple Monte Carlo approach was more accurate than the normal
approximation. This is not uncommon: when an exact answer is unknown,
often a computational approximation is more accurate than an analytical
approximation. Also, it’s easy for demons to invent questions for which
Contributor

Another nitpick: I've had trouble with these special characters when using simple text editors on Linux

Contributor Author

Must have happened during automatic conversion from the notebook.

@mckib2
Contributor

mckib2 commented Jul 28, 2022

@mdhaber This is looking really good to me. I just added some nitpicks and pointed out another usage of np.random. Ignore the nits you don't agree with, but I think a few of them are good revisions.

@mdhaber
Contributor Author

mdhaber commented Jul 28, 2022

Thanks @mckib2. All of your comments are good.

Any opinions on what format these should be in? Converting these to restructured text has been a bit of a pain, and I don't think they're as useful as if they were notebooks. On the other hand, it's much nicer to work with human-readable text with git. The way matplotlib and scikit-learn do their examples might be a good compromise? Anyway, do you have ideas?

@mdhaber
Contributor Author

mdhaber commented Jul 28, 2022

@tirthasheshpatel

Does that work? Or do we want the notebooks themselves as tutorials?

That might be a great compromise. I was thinking that maybe resampling.rst would be the only rst file, and it could link to the rest of the notebooks. Opening in nbviewer is a great idea. I'll try that.

As for whether they are part of SciPy or are in a separate repo, I don't really have an opinion.

[skip azp] [skip actions]
[skip actions] [skip azp]
[skip actions] [skip azp]
@mdhaber
Contributor Author

mdhaber commented Jul 28, 2022

OK, I think I implemented @tirthasheshpatel's suggestion to add the notebooks to a new folder and link to them from the original starting point, the resampling.rst file. Let's see how it looks. (Note I've used my branch in the link URLs for now so that they will link properly while the PR is being reviewed. We'd need a follow-up PR to change the links to the SciPy repo.)
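
As a rough sketch of how such branch-specific links can be built, the helper below assembles an nbviewer URL for a notebook hosted on GitHub. The function name and every argument value in the example are placeholders, not the actual branch or notebook paths used in this PR; switching the links to the SciPy repo later would only mean changing the owner and ref arguments.

```python
from urllib.parse import quote


def nbviewer_url(owner, repo, ref, path):
    """Return the nbviewer URL for a notebook stored in a GitHub repository.

    owner/repo identify the repository, ref is a branch, tag, or commit,
    and path is the notebook's location relative to the repository root.
    """
    return "https://nbviewer.org/github/{}/{}/blob/{}/{}".format(
        owner, repo, quote(ref), quote(path)
    )


# Placeholder values, for illustration only.
link = nbviewer_url(
    "example-user", "scipy", "example-branch",
    "doc/source/notebooks/example.ipynb",
)
```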

If we don't want notebooks in the main repo itself, we could even put them in a different repo under the SciPy org

This might be better because currently when you try to execute the notebooks in Binder (from nbviewer), the build succeeds but from scipy import ... raises an import error. If we created a separate repo, it could have very lean requirements and maybe open in Binder faster as a result?

BTW I know I need to check that citations are numbered sequentially throughout and add references at the bottom of each page. There were also some comments I tried to add as footnotes that look wonky right now. I'm waiting until we figure out exactly how these are being published before I spend time on this.

@tirthasheshpatel
Member

If we created a separate repo, it could have very lean requirements and maybe open in Binder faster as a result?

Yes, a separate repo sounds better.

Contributor Author

@mdhaber mdhaber left a comment

Trying to improve rendering.

doc/source/tutorial/stats/resampling.rst
[skip azp] [skip actions]
@mdhaber
Contributor Author

mdhaber commented Jul 29, 2022

@tirthasheshpatel et al. here's the plan:

@melissawm @rossbar Can you help us set up those sort of tutorials for SciPy? This is exactly what we were looking for, I think.

@tupui
Member

tupui commented Jul 29, 2022

@tirthasheshpatel et al. here's the plan:

@melissawm @rossbar Can you help us set up those sort of tutorials for SciPy? This is exactly what we were looking for, I think.

+1 on doing the same as NumPy. This looks very nice and with all the features we want I think.

Fine for now until we have the new repo/setup. After which we should think about moving over what we want from the cookbook and archiving the repo.

[skip azp] [skip actions]
@ev-br
Member

ev-br commented Jul 31, 2022

As notebooks are discussed, linking a stale issue #5233

@rgommers
Member

scipy-cookbook indeed seems like a reasonable intermediate solution. For the rest, let's use gh-5233 to discuss in order to keep this PR focused on the added content?

@mckib2
Contributor

mckib2 commented Sep 8, 2022

Looks like all comments are resolved and the plan to link to cookbooks is generally accepted. In it goes -- thanks @mdhaber, apologies for the delay in review

@mckib2 mckib2 merged commit 33ffa8c into scipy:main Sep 8, 2022
@mdhaber mdhaber added this to the 1.10.0 milestone Sep 16, 2022