Fully support partial functions as aggregations #9615
Comments
Thanks for the report, @ChrisJar. I can replicate this on the latest release of dask and distributed. To be crystal clear, I modified your reproducer slightly to put the discrepancies front-and-center:
import pandas as pd
import numpy as np
import dask.dataframe as dd
from functools import partial
df = pd.DataFrame({
"a": [5, 4, 3, 5, 4, 2, 3, 2],
"b": [1, 2, 5, 6, 9, 2, 6, 8],
})
ddf = dd.from_pandas(df, npartitions=1)
pd.concat([
df.groupby("a").agg(partial(np.std, ddof=1))["b"],
df.groupby("a").agg(partial(np.std, ddof=-2))["b"],
ddf.groupby("a").agg(partial(np.std, ddof=1))["b"].compute(),
ddf.groupby("a").agg(partial(np.std, ddof=-2))["b"].compute(),
], axis=1, keys=[ ("pandas", "ddof=1"), ("pandas", "ddof=-2"), ("dask", "ddof=1"), ("dask", "ddof=-2")])
I double checked this wasn't a problem with choosing ddof=-2, because I'm not sure in which case you will have
That being said, I agree that this is a bug in passing the args to the partial. I replicated this with ddof=0 and ddof=1.
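As a quick sanity check on what the different ddof values should produce, here is a minimal pure-Python sketch. It assumes the divisor convention that np.std documents (N - ddof); the helper name std_with_ddof is hypothetical and only stands in for np.std so the arithmetic is visible. It uses the two "b" values from the group a == 5 in the reproducer above.

```python
import math


def std_with_ddof(values, ddof=0):
    """Standard deviation using the divisor N - ddof (hypothetical stand-in for np.std)."""
    n = len(values)
    mean = sum(values) / n
    squared_dev = sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_dev / (n - ddof))


# Values of "b" for the rows where a == 5 in the reproducer.
group_b = [1, 6]

# Each ddof changes the divisor, so each should give a different result.
results = {ddof: std_with_ddof(group_b, ddof) for ddof in (1, 0, -2)}
for ddof, value in results.items():
    print(f"ddof={ddof}: {value:.4f}")
```

If the aggregation honors the partial's keyword, the three results differ (roughly 3.5355, 2.5000 and 1.7678 for this group). Identical columns across ddof values would suggest the argument is being dropped.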
May be able to close this via #9724.
Thanks @j-bennet. For some reason, the issue didn't close itself when the PR was merged. Closing now!!
Currently in dask.dataframe, Dask ignores any arguments given to a function in a partial when performing an aggregation. For example:
returns
whereas in pandas:
returns:
The discrepancy is because Dask ignores the ddof argument and defaults to ddof=1. It'd be great if there were some way for Dask to recognize and take those arguments into account like pandas does.
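To illustrate what "recognizing" those arguments could involve, here is a small sketch using only the standard library. A functools.partial object exposes the wrapped callable and its bound arguments through its func, args and keywords attributes, so a library can in principle recover them. The names spread and unwrap_partial are hypothetical, and this is not a description of how Dask actually handles partials or of what #9724 changed.

```python
from functools import partial


def spread(values, ddof=1):
    """Hypothetical stand-in for np.std: square root of variance with divisor N - ddof."""
    n = len(values)
    mean = sum(values) / n
    return (sum((v - mean) ** 2 for v in values) / (n - ddof)) ** 0.5


def unwrap_partial(func):
    """Return (underlying callable, bound positional args, bound keyword args).

    Plain callables are returned unchanged with empty argument containers.
    """
    if isinstance(func, partial):
        return func.func, func.args, dict(func.keywords)
    return func, (), {}


agg = partial(spread, ddof=0)
base, args, kwargs = unwrap_partial(agg)

# The bound keyword survives unwrapping, so it can be forwarded explicitly.
print(base.__name__, args, kwargs)
print(base([1, 6], *args, **kwargs) == agg([1, 6]))
```

With the underlying function and its keywords in hand, a caller could map the function to an optimized built-in aggregation while still forwarding ddof, rather than falling back to the default.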