Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raise ImportError instead of ValueError when dask-expr cannot be imported #11007

Merged
merged 1 commit into from
Mar 25, 2024

Conversation

jameslamb
Copy link
Member

#10634 introduced an import-time check that ensures that a statement like this:

import dask.dataframe

will raise an exception under the following conditions:

  • a sufficiently-new version of pandas exists in the environment
  • Dask config "dataframe.query-planning" is True
  • dask_expr cannot be imported

That exception is currently a ValueError. This proposes changing it to an ImportError.

Benefits of this change

Allows the use of this pattern to conditionally import dask.dataframe:

try:
    import dask.dataframe
except ImportError:
    # ... do something if loading dask.dataframe failed ...

That's the same pattern dask itself uses to conditionally import dependencies, for example:

try:
from dask.dataframe.io import read_parquet, to_parquet
except ImportError:
pass

And I think it's common enough across other Python projects that Dask should support it unless there's a compelliing reason to use ValueError instead. Some examples:

Notes for Reviewers

I think (but don't know with certainty) that this change would help other projects that rely on dask.dataframe but aren't yet supporting dask-expr. It might help with dask/dask-expr#904, for example.

I'm certain it would help lightgbm. dask is an optional dependency of that library, but as of dask==2024.3.1, this use of ValueError would break import lightgbm even if you do not intend to use any of LightGBM's Dask features.

docker run \
    --rm \
    -v $(pwd):/opt/LightGBM \
    -w /opt/LightGBM \
    -it python:3.11 \
    bash

pip install \
    'dask==2024.3.1' \
    'lightgbm>=4.3.0' \
    'pandas==2.2.1' \
    'pyyaml'

DASK_DATAFRAME__QUERY_PLANNING=true \
python -c "import lightgbm"
traceback (click me)
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/dask/dataframe/__init__.py", line 22, in _dask_expr_enabled
    import dask_expr  # noqa: F401
    ^^^^^^^^^^^^^^^^
ModuleNotFoundError: No module named 'dask_expr'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.11/site-packages/lightgbm/__init__.py", line 8, in <module>
    from .basic import Booster, Dataset, Sequence, register_logger
  File "/usr/local/lib/python3.11/site-packages/lightgbm/basic.py", line 21, in <module>
    from .compat import (PANDAS_INSTALLED, PYARROW_INSTALLED, arrow_cffi, arrow_is_floating, arrow_is_integer, concat,
  File "/usr/local/lib/python3.11/site-packages/lightgbm/compat.py", line 155, in <module>
    from dask.dataframe import DataFrame as dask_DataFrame
  File "/usr/local/lib/python3.11/site-packages/dask/dataframe/__init__.py", line 96, in <module>
    if _dask_expr_enabled():
       ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/dask/dataframe/__init__.py", line 33, in _dask_expr_enabled
    raise ValueError(msg)
ValueError: 
Dask dataframe query planning is disabled because dask-expr is not installed.

You can install it with `pip install dask[dataframe]` or `conda install dask`.
This will raise in a future version.

Mentioning #10995 here as well, to link this to that issue that is collecting feedback on the dask-expr rollout.

Thanks very much for your time and consideration.

Copy link
Contributor

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

     15 files  ±0       15 suites  ±0   3h 16m 55s ⏱️ - 12m 34s
 13 105 tests ±0   12 159 ✅ ±0     930 💤 ±0  16 ❌ ±0 
162 239 runs  ±0  142 082 ✅ ±0  20 141 💤 ±0  16 ❌ ±0 

For more details on these failures, see this check.

Results for commit 1b7d29d. ± Comparison against base commit f9310c4.

Copy link
Collaborator

@phofl phofl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good

@phofl phofl merged commit 83c204c into dask:main Mar 25, 2024
27 of 29 checks passed
@jameslamb jameslamb deleted the dask-expr-import-error branch March 25, 2024 18:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants