Migrating from current dask/dataframe implementation to dask-expr #361
Comments
Thanks @phofl, a couple quick questions: First, is there a list of known incompatibilities (I guess https://github.com/dask-contrib/dask-expr?tab=readme-ov-file#api-coverage, if it's up to date?). It'd be good to include that in the migration guide. Second, on the deprecation warning
Do you have specific wording in mind? Is the intent to deprecate And what's the plan for packaging this up? Will you make dask-expr a required dependency of dask[dataframe] before the switch? That would make it easier for users to adapt, since the dependency would already exist. (I guess this would interact with the plan for future development of dask-expr). It's great to see all the progress here!
Yes, that's up to date.
dask.dataframe will stay; we just want to swap out the implementation. dask-expr will probably end up in dask/dask eventually, but for now fast CI times and less baggage are more helpful than merging the repositories.
That's something that we haven't discussed in detail, but it is not trivial to do: dask-expr requires pandas >= 2, so the dependencies would conflict. We currently raise an error telling users to install dask-expr if they enable query planning and it isn't installed.
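For illustration, the "raise an error if query planning is enabled but the package is missing" behaviour could look roughly like the sketch below. This is not dask's actual code: `resolve_backend`, its parameters, and the return values `"legacy"` and `"expr"` are hypothetical names made up for this example. The only assumption about dask-expr is that it is importable as `dask_expr`.

```python
import importlib.util


def resolve_backend(query_planning: bool, package: str = "dask_expr") -> str:
    """Pick an implementation name; hypothetical helper, not part of dask."""
    if not query_planning:
        # Query planning is off, so the optional package is never needed.
        return "legacy"
    # find_spec returns None when a top-level module cannot be found,
    # without actually importing it.
    if importlib.util.find_spec(package) is None:
        raise ImportError(
            f"Query planning is enabled but {package!r} is not installed. "
            "Install it, or disable query planning."
        )
    return "expr"
```

The check only runs when the flag is on, so users who stay on the legacy implementation never need the extra dependency.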
Makes sense, thanks. I think the main thing to keep an eye on is how long we leave Maybe even before it's merged into dask/dask, we could have
We've been working on adding a query optimization layer to Dask DataFrames for a while now. The project lives at https://github.com/dask-contrib/dask-expr
The status quo can be summarised as follows:
The next step is to think about how we can move users from the legacy implementation to the expression-based implementation.
My suggestion is something along these lines:
This issue is mostly meant to gather feedback on how we should approach this.