Friendly import error message for dask-expr #11024

Closed
benrutter wants to merge 2 commits

benrutter
Contributor

@benrutter benrutter commented Mar 26, 2024

Dask dataframe now uses dask-expr by default, which needs to be installed separately.

I think this could trip up new users who install the latest dask: they'll hit a message asking them to run python -m pip install "dask[dataframe]", but doing so won't resolve the error (assuming they are using dask expressions, which is the default).

This PR expands the error message to the following:

Dask dataframe requirements are not installed.

Please either conda or pip install as follows:

    conda install dask                     # either conda install
    python -m pip install "dask[dataframe]" --upgrade  # or python -m pip install

Dask also now uses dask-expr by default, which requires installing separately:

    conda install dask-expr
    python -m pip install "dask-expr" --upgrade
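
For illustration only, here is a minimal sketch of how a message like the one above could be assembled and raised. It is not the code in this PR or in dask itself: the helper names (`missing_modules`, `dataframe_import_message`, `check_dataframe_requirements`) and the default list of required modules are hypothetical, and mentioning dask-expr only when it is actually missing is a variation on the proposal rather than part of it.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def dataframe_import_message(missing):
    """Build the proposed message; mention dask-expr only if it is missing."""
    lines = [
        "Dask dataframe requirements are not installed.",
        "",
        "Please either conda or pip install as follows:",
        "",
        "    conda install dask                     # either conda install",
        '    python -m pip install "dask[dataframe]" --upgrade  # or python -m pip install',
    ]
    if "dask_expr" in missing:
        lines += [
            "",
            "Dask also now uses dask-expr by default, which requires installing separately:",
            "",
            "    conda install dask-expr",
            '    python -m pip install "dask-expr" --upgrade',
        ]
    return "\n".join(lines)


def check_dataframe_requirements(required=("pandas", "dask_expr")):
    """Raise ImportError with the friendly message if anything is missing.

    The default module list is an assumption made for this sketch.
    """
    missing = missing_modules(required)
    if missing:
        raise ImportError(dataframe_import_message(missing))
```

Calling `check_dataframe_requirements()` before importing the dataframe module would surface the hint at import time; in an environment where everything is present it returns without doing anything.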

I haven't created a GitHub issue for this (but can do).

  • Closes #xxxx
  • Tests added / passed
  • Passes pre-commit run --all-files

@phofl
Collaborator

phofl commented Mar 26, 2024

dask-expr is a dependency of dask[dataframe]; is it not getting pulled in automatically?

@benrutter
Contributor Author

benrutter commented Mar 26, 2024

No, I think this is intentional behaviour? (dask.dataframe doesn't pull in dask-expr, rather than the other way round)

Looking at this it's possible that it's only if you're updating dask rather than installing fresh? I'll check!

@phofl
Collaborator

phofl commented Mar 26, 2024

It's pulled for me if I do

pip uninstall dask-expr
pip install "dask[dataframe]" --upgrade
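
One way to confirm what the commands above left in the environment is to ask the standard library for the installed distribution versions. This is only a sketch: `installed_version` is a hypothetical helper name, and it reports what is installed without saying anything about how dask declares its dependencies.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Report both distributions discussed in this thread.
for name in ("dask", "dask-expr"):
    print(name, installed_version(name) or "not installed")
```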

@benrutter
Contributor Author

benrutter commented Mar 26, 2024

Yes, @phofl you're right (my bad)! Just pip installing dask completely fresh into an environment installs dask-expr with it.

I still think a mention in the error message is helpful, since, depending on workflow tools, commands like pipenv update, pip install --upgrade dask or rye sync --update-all can upgrade dask without installing dask-expr.

The downside is that the message would need updating again in some future version, once this is no longer a likely risk. I think a bunch of people are likely to update dask to the latest version, but at some point nobody's bumping versions by more than a year or so without expecting to hit some kind of issue, especially since there's no semver in dask.

Edit: looking at your comment again, I'm wondering if it's more obscure than I assumed? I hit this issue running "update all" with Rye, but that's probably not the most common workflow? (I'd guess similar things would happen with workflow tools like poetry or hatch, but I haven't put that to the test.)

@phofl
Collaborator

phofl commented Mar 26, 2024

I don't want to add a reference to dask-expr to the message for now if we can avoid it. dask will absorb dask-expr at some point, after which this won't be necessary anymore, and the extra instruction would probably mess with users' environments.

@benrutter benrutter closed this Mar 26, 2024