DOC: stats.linregress: split stats/mstats documentation #20547

Merged
merged 1 commit into from May 3, 2024


mdhaber
Contributor

@mdhaber mdhaber commented Apr 21, 2024

Reference issue

NA

What does this implement/fix?

The documentation of scipy.stats.linregress incorrectly suggests that masked arrays are supported:

Missing values are considered pair-wise: if a value is missing in x, the corresponding value in y is masked.

Masks are silently ignored by this function, so this PR removes the statement. This PR also corrects a typo about the first argument being a 2x2 array; it meant 2xN array, where N is the number of observations.
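To illustrate the difference described above, here is a minimal sketch with made-up data. It assumes the behavior stated in this PR: that `scipy.stats.linregress` does not honor masks, while `scipy.stats.mstats.linregress` drops masked pairs before fitting. The variable names and data values are illustrative only.

```python
import numpy as np
from scipy import stats
from scipy.stats import mstats

# Illustrative data: y = 2x + 1, except for one corrupted observation.
x = np.arange(6, dtype=float)
y = 2.0 * x + 1.0
y[3] = 100.0

# Mask the corrupted observation.
y_masked = np.ma.masked_array(y, mask=[0, 0, 0, 1, 0, 0])

# Per the PR description, the mask is not honored here, so the
# corrupted value is expected to influence the fit.
res_plain = stats.linregress(x, y_masked)

# The mstats version removes the masked (x, y) pair before fitting.
res_masked = mstats.linregress(x, y_masked)

# Equivalent manual approach: drop the masked pair yourself.
keep = ~np.ma.getmaskarray(y_masked)
res_manual = stats.linregress(x[keep], y[keep])

print(res_plain.slope, res_masked.slope, res_manual.slope)
```

If masked data has to go through `scipy.stats.linregress`, compressing the arrays by hand as in the last call is one way to make the intent explicit.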

Also, the documentation of scipy.stats.mstats.linregress currently looks like:

[image: screenshot of the rendered scipy.stats.mstats.linregress documentation]

suggesting that the docstring was supposed to be replaced automatically. I'm not aware of any machinery to do that, so I copied the scipy.stats.linregress documentation over and changed the example to use stats.mstats.linregress instead of stats.linregress.

Additional information

I decided against adding a mask to the example in the mstats version. For one thing, the mstats version of the function offers little value since it simply calls the stats version after compressing the arrays, so it is not worth much time to maintain it. Also, if nobody noticed that the documentation was missing entirely, it is unlikely that anyone would appreciate an improved example.

If we want scipy.stats.linregress to be translated to array-API, I think we should deprecate the y=None behavior and make y a required argument. It's not strictly required, but I think it would be confusing to preserve this behavior when there can be arrays of arbitrary dimensionality. Please indicate whether you agree, and I'll make the change in a separate PR.
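As a sketch of what calling code could look like if y became required, the rows of a 2xN array can be unpacked explicitly instead of relying on `y=None`. The array contents below are made up for illustration.

```python
import numpy as np
from scipy import stats

# A 2xN array: row 0 holds x, row 1 holds y (N = 5 observations here).
xy = np.array([[0.0, 1.0, 2.0, 3.0, 4.0],
               [1.0, 3.0, 5.0, 7.0, 9.0]])

# Unpack the rows and pass them as separate arguments,
# rather than passing the 2xN array alone with y=None.
x, y = xy
res = stats.linregress(x, y)

print(res.slope, res.intercept)
```

Passing `x` and `y` separately also leaves no ambiguity about which axis separates the two variables, which is the concern raised above for arrays of arbitrary dimensionality.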

@mdhaber mdhaber added scipy.stats Documentation Issues related to the SciPy documentation. Also check https://github.com/scipy/scipy.org maintenance Items related to regular maintenance tasks labels Apr 21, 2024
@mdhaber mdhaber requested a review from melissawm April 23, 2024 19:31
Contributor

@melissawm melissawm left a comment


Changes look good to me, thanks @mdhaber

For reference, this is what we do in NumPy to reuse the docstrings: https://github.com/numpy/numpy/blob/f8392ce66e411c70bb97858412da02ab6e369a30/numpy/ma/core.py#L5139

@mdhaber
Contributor Author

mdhaber commented Apr 24, 2024

Thanks @melissawm. Right, I've modified docs like that before, but there are some small differences, so we don't want to have exact copies. In this case, I don't think the complexity of modifying one version of the docs (e.g. with the help of docscrape) is justified.
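For readers unfamiliar with the idea, a minimal sketch of docstring reuse by plain string replacement follows. The two functions are hypothetical stand-ins, and this is not the machinery NumPy or SciPy actually uses; it only shows the general pattern and why small differences between the two versions make it awkward.

```python
def linregress(x, y=None):
    """Calculate a linear least-squares regression.

    Examples
    --------
    >>> res = stats.linregress(x, y)
    """


def masked_linregress(x, y=None):
    # Hypothetical masked-array counterpart with no docstring of its own.
    pass


# Copy the docstring and patch the one reference that differs.
masked_linregress.__doc__ = linregress.__doc__.replace(
    "stats.linregress", "stats.mstats.linregress"
)

print(masked_linregress.__doc__)
```

Each additional difference between the two docstrings needs its own replacement rule, which is where the complexity mentioned above comes from.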

@melissawm
Contributor

That sounds good, thanks!

@mdhaber mdhaber requested a review from melissawm April 29, 2024 05:51
@melissawm melissawm added this to the 1.13.1 milestone May 3, 2024
@melissawm melissawm merged commit 9c9122a into scipy:main May 3, 2024
31 checks passed
@melissawm
Contributor

Thank you, @mdhaber !


2 participants