Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Fix issues with numeric_only deprecation #47481

Merged
merged 5 commits into from
Jun 23, 2022

Conversation

rhshadrach
Copy link
Member

@rhshadrach rhshadrach commented Jun 23, 2022

Part of #46560

ref: #47265 (comment)

Issues were introduced as part of 1.5, no need for a whatsnew note.

@rhshadrach rhshadrach added Bug Groupby Nuisance Columns Identifying/Dropping nuisance columns in reductions, groupby.add, DataFrame.apply labels Jun 23, 2022
@rhshadrach rhshadrach added this to the 1.5 milestone Jun 23, 2022
@rhshadrach rhshadrach marked this pull request as draft June 23, 2022 02:20
Comment on lines 449 to 454
warnings.warn(
f"Passing `numeric_only=True` to {type(self).__name__}.{how} is "
f"deprecated and will raise in a future version of pandas.",
category=FutureWarning,
stacklevel=find_stack_level(),
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the exception of rank, all Series ops fail when passing numeric_only=True (even when the Series is numeric) with a NotImplementedError. However most SeriesGroupBy ops allow it. SerisGroupBy ops will be successful when they can handle object dtype or the dtype is numeric. They will raise an error attempting to perform the op when dtype cannot be operated on.

I'm thinking we should make SeriesGroupBy consistent with Series by raising whenever numreic_only is truthy.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I first added the deprecation, I thought other SeriesGroupBy ops would always raise when passing numeric_only=True. Seeing now that's not the case, going to siphon this off into a followup.

The ops std, sem, and quantile all gained the numeric_only argument as part of 1.5.

@rhshadrach rhshadrach marked this pull request as ready for review June 23, 2022 02:57
method = getattr(gb, groupby_func, None)

try:
_ = method(*args)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to confirm, all groupby_func not expected to fail so that's why try/except is used here and makes it harder to use pytest.raises?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i.e. can this be tested like

if groupby_func in methods_ok:
    method(*args)
elif groupby_func in methods_raise_typeerror:
    with pytest.raises(TypeError, ...) as err:
        ...
elif groupby_func in methods_raise_valueerror:
    with pytest.raises(ValueError, ...) as err:
        ...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, none are successful. Some would be if the dtype was not object, but I plan to deprecate that in a follow up assuming no objection. Changed over to using pytest.raises.

@mroeschke mroeschke merged commit 5d880ea into pandas-dev:main Jun 23, 2022
@mroeschke
Copy link
Member

Thanks @rhshadrach

@rhshadrach rhshadrach deleted the numeric_only_fixups branch June 24, 2022 18:23
yehoshuadimarsky pushed a commit to yehoshuadimarsky/pandas that referenced this pull request Jul 13, 2022
* BUG: Fix issues with numeric_only deprecation

* improve test

* Change raising to FutureWarning

* revert deprecation

* Test improvement
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Groupby Nuisance Columns Identifying/Dropping nuisance columns in reductions, groupby.add, DataFrame.apply
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants