New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Filter out numeric_only
warnings from pandas
#9496
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, so I think this is ready for a review now. The goal of this PR is to filter out all of the numeric_only
deprecation warnings that will start being emitted in pandas=1.5
. Note that this doesn't resolve #9471 -- it's just a bandaid. However, given pandas=1.5
is scheduled for release on Monday, I think this is an okay bandaid to apply. We will need to go back and more fully support the numeric_only
changes in follow-up though.
numeric_only
warnings from pandas
numeric_only
warnings from pandas
axis=axis, | ||
skipna=skipna, | ||
split_every=split_every, | ||
out=out, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the min
and max
cases we are not passing numeric_only, because for these cases are part of " that users get a deprecation messages if (a) the defaults are used and (b) the result would change based on the new default (False) in 2.0." ? citing this comment pandas-dev/pandas#46560 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Locally I saw strange test failures when passing numeric_only
here. Additionally, we shouldn't need to pass numeric_only
through as the @_numeric_only
decorator should handle selecting only numeric data for us automatically
@@ -2348,6 +2402,7 @@ def std( | |||
skipna=skipna, | |||
ddof=ddof, | |||
enforce_metadata=False, | |||
numeric_only=numeric_only, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a comment for line 2680.
kurtosis_shape = num._meta_nonempty.values.var(axis=0).shape
since we are checking the meta_nonempty, don't we need the check_numeric_only_deprecation here too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're using .values
to get a NumPy array we shouldn't trigger the warning from pandas
@@ -2671,7 +2731,7 @@ def sem(self, axis=None, skipna=True, ddof=1, split_every=False, numeric_only=No | |||
result.divisions = (self.columns.min(), self.columns.max()) | |||
return result | |||
|
|||
def quantile(self, q=0.5, axis=0, method="default"): | |||
def quantile(self, q=0.5, axis=0, numeric_only=True, method="default"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at pandas to check for consistency. this default will be a no_default in 1.5, and it says in the future it'll be False. see https://pandas.pydata.org/docs/dev/reference/api/pandas.DataFrame.quantile.html
Do we need to warn here too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not warn here now as numeric_only=False
isn't supported in Dask today. Let's handle this in follow-up work though
@@ -1216,7 +1248,7 @@ def test_reductions_frame_dtypes_numeric_only(): | |||
} | |||
) | |||
|
|||
ddf = dd.from_pandas(df, 3) | |||
ddf = dd.from_pandas(df, 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this cosmetic, or there is a reason to only have one partition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, good catch. I changed npartitions
to 1
while debugging, but forgot to change it back. Will fix that
Leaving this here for reference pandas-dev/pandas#46560 |
Thanks for reviewing @ncclementi! As a side note, @ian-r-rose mentioned offline that he was fine with the changes here but thought it would be nice to be more explicit about passing the name of the method called in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jrbourbeau!
Looks good to me, everything is passing except for other known upstream issues that are in flight in other PRs |
This PR is a temporary workaround for #9471.