Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC: added Raises section to a few functions in _stats_py.py #20496

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

Schefflera-Arboricola
Copy link

I was going through the docs of the pearsonr function which led me to add the Raises sections to some of the functions in the file. Let me know if this LGT you.

Thank you :)

@github-actions github-actions bot added scipy.stats Documentation Issues related to the SciPy documentation. Also check https://github.com/scipy/scipy.org labels Apr 16, 2024
Copy link
Member

@tupui tupui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Schefflera-Arboricola for the PR.

We are not very consistent with this and don't really have a policy. Would you mind posting a message on our forum to get others opinion? https://discuss.scientific-python.org/c/contributor/scipy that could help 😃 I am personally +1 as it's then more complete.

@fancidev
Copy link
Contributor

Nice additions!

I find the ones that specify the minimum number of elements required particularly useful because they are not obvious. The ones about argument domains are also nice to add.

Some of the additions about types or broadcast-ability may not be essential, because the dynamic typing nature of Python ensures you can never exhaust the valid inputs or possible exceptions. Therefore you can only do so much to remind a conscious programmer and wish the unconscious good luck!

Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Schefflera-Arboricola. In general I think adding information under Raises that is specific to a particular algorithm can be helpful. What should be avoided is adding information that is generally expected from Python libraries (e.g., TypeError if one passes in the wrong type) or from standard design patterns (e.g., axis should be integer).

There is a bit of judgement needed when it's not clear cut. I'd suggest to only document things when there is a clear need. And not adding lots of docs for every last bit of argument validation in every function, because that is more distraction and churn than helpful.

Raises
------
TypeError
If `a` is not numeric.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is not desirable. It's too generic; a is very clearly documented as a numerical array already.

Raises
------
TypeError
If `center` is not callable.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not desirable. If it's an argument document as callable, and you pass in the wrong type then TypeError is the standard expectation in Python code.

Raises
------
ValueError
If `axis` is not an integer, `x` and `y` do not have the same length
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not desirable to add this. axis keywords are everywhere.

Raises
------
ValueError
If shape of `a` along the given `axis` is less than 8.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an example of where it is helpful information I think, because this is specific to skewtest and the error type isn't completely obvious based on the "less than 8" info in Notes.

Copy link
Contributor

@mdhaber mdhaber Apr 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we document this, can we remove the error in the future? I can see the statement instead becoming "... is less than ___, a Monte Carlo version of the test is performed." when s method option is added and the default becomes 'auto'". Also, when there is nan_policy='omit', it seems undesirable to raise an error when just one slice has fewer than the required number of elements. Many other stats functions just document these requirements and return NaN for the affected element when there is a problem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a problem with removing the error in the future.

And agreed to 'omit': returning nan seems reasonable. Going from error to non-error (nan or whatever) should always be fine, the reverse is typically not fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Documentation Issues related to the SciPy documentation. Also check https://github.com/scipy/scipy.org scipy.stats
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants