RecursionError when using PerformanceReport context manager #8578

Open
jinmannwong opened this issue Mar 13, 2024 · 4 comments

@jinmannwong

Describe the issue:

When executing certain custom task graphs with the PerformanceReport context manager I get log warnings like the following:

2024-03-13 14:43:10,263 - distributed.sizeof - WARNING - Sizeof calculation failed. Defaulting to -1 B
Traceback (most recent call last):
  File ".../site-packages/distributed/sizeof.py", line 17, in safe_sizeof
    return sizeof(obj)
  File ".../site-packages/dask/utils.py", line 773, in __call__
    return meth(arg, *args, **kwargs)
  File ".../site-packages/dask/sizeof.py", line 96, in sizeof_python_dict
    + sizeof(list(d.values()))
  File ".../site-packages/dask/utils.py", line 773, in __call__
    return meth(arg, *args, **kwargs)
  File ".../site-packages/dask/sizeof.py", line 59, in sizeof_python_collection
    return sys.getsizeof(seq) + sum(map(sizeof, seq))
  File ".../site-packages/dask/utils.py", line 773, in __call__
    return meth(arg, *args, **kwargs)

which repeats until it finally ends with

  File ".../site-packages/dask/sizeof.py", line 59, in sizeof_python_collection
    return sys.getsizeof(seq) + sum(map(sizeof, seq))
RecursionError: maximum recursion depth exceeded

The computation still completes correctly and this problem doesn't arise when executing without the performance report.
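The shape of the traceback (a dict handler calling a collection handler calling the dict handler again) can be illustrated with a simplified stand-in. This is only a sketch: `naive_sizeof` and `safe_naive_sizeof` are hypothetical names, not the dask or distributed implementations, and it assumes a self-referential dict as the trigger, which has not been confirmed as the cause in this issue.

```python
import sys


def naive_sizeof(obj):
    """Recursive size estimate with no cycle detection (hypothetical stand-in)."""
    if isinstance(obj, dict):
        # Mirrors the dict -> list(values) -> element recursion in the traceback.
        return (
            sys.getsizeof(obj)
            + naive_sizeof(list(obj.keys()))
            + naive_sizeof(list(obj.values()))
        )
    if isinstance(obj, (list, tuple, set, frozenset)):
        return sys.getsizeof(obj) + sum(map(naive_sizeof, obj))
    return sys.getsizeof(obj)


def safe_naive_sizeof(obj, default_size=-1):
    """Fall back to a default size when the recursive walk fails."""
    try:
        return naive_sizeof(obj)
    except RecursionError:
        return default_size


# Assumption for illustration: a dict that contains itself never bottoms out.
cyclic = {"name": "example"}
cyclic["self"] = cyclic
```

Calling `safe_naive_sizeof(cyclic)` returns the default instead of raising, which matches the observed behaviour: a warning is logged and the computation still completes.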

Minimal Complete Verifiable Example:

This is a small example that reproduces the problem, using the xarray data from https://github.com/pydata/xarray-data/blob/master/rasm.nc.

from dask.distributed import Client, performance_report
import xarray as xr

dask_graph = {"source": (xr.load_dataset, "rasm.nc")}
with Client() as client:
    with performance_report(filename="dask-report.html"):
        client.get(dask_graph, "source")

Environment:

  • Dask version: 2024.2.0
  • Python version: 3.10
  • Operating System: Linux
  • Install method (conda, pip, source): pip
@jrbourbeau
Member

Thanks for the report @jinmannwong. Unfortunately I'm not able to reproduce with the following steps:

# Create a fresh software environment with the specified version of `dask`
$ mamba create -n test python=3.11 dask=2024.2.0 xarray netcdf4
$ mamba activate test
$ python test.py

where test.py is:

from dask.distributed import Client, performance_report
import xarray as xr

dask_graph = {"source": (xr.load_dataset, "rasm.nc")}
if __name__ == "__main__":
    with Client() as client:
        with performance_report(filename="dask-report.html"):
            client.get(dask_graph, "source")

I also tried with the latest dask + distributed release and things work as expected.

Are you doing something different than what I described above?
What's the output of running $ dask info versions in your terminal? Also, what version of xarray are you using?

@jinmannwong
Author

Thanks for looking into this. I was running in a virtual environment that had a lot of other dependencies installed, and indeed when I ran with just the required dependencies the problem didn't arise. I combed through the other dependencies and realised that the problem arises when cupy-cuda11x=13.0.0 and jax=0.4.25 are installed together. When I run with the dependencies you listed plus only one of cupy or jax, there is no problem.

The output of $ dask info versions is:

{
  "Python": "3.10.10",
  "Platform": "Linux",
  "dask": "2024.2.0",
  "distributed": "2024.2.0",
  "numpy": "1.26.4",
  "pandas": "2.2.0",
  "cloudpickle": "3.0.0",
  "fsspec": "2024.2.0",
  "bokeh": "3.3.4",
  "pyarrow": null,
  "zarr": null
}

and I am using xarray version 2024.2.0.
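Since `dask info versions` does not list cupy or jax, a small helper along these lines could report them alongside the other packages. It is a sketch using only the standard library; `installed_versions` is a hypothetical helper name, and the distribution names passed to it are taken from the versions mentioned above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    result = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    # Distribution names as they appear in this thread.
    print(installed_versions(["cupy-cuda11x", "jax", "xarray", "dask", "distributed"]))
```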

@jrbourbeau
Member

Hmm, that's interesting. I am able to reproduce when I install cupy-cuda11x=13.0.0 and jax=0.4.25. I'm not sure which dictionary is the problematic one here that sizeof can't handle.

cc @crusaderky @charlesbluca @quasiben in case someone has bandwidth to dig in a bit
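One way to start looking for the problematic dictionary is to walk a suspect object and report the first container that contains one of its own ancestors. The sketch below is an assumption-laden diagnostic, not part of dask: `find_cycle` is a hypothetical helper, it only follows dicts, lists, and tuples, and a self-reference is just one possible explanation for the unbounded recursion. It uses an explicit stack so that it does not itself hit the recursion limit.

```python
def find_cycle(root):
    """Return the key/index path to the first container that contains one of
    its own ancestors, or None if no such cycle is found."""
    containers = (dict, list, tuple)

    def children(obj):
        # (key, child) pairs for dicts, (index, child) pairs for sequences.
        if isinstance(obj, dict):
            return list(obj.items())
        return list(enumerate(obj))

    if not isinstance(root, containers):
        return None

    on_path = {id(root)}  # ids of containers on the current root-to-node path
    stack = [(root, iter(children(root)))]
    path = []  # keys/indices taken from the root to the current container

    while stack:
        obj, it = stack[-1]
        for key, child in it:
            if not isinstance(child, containers):
                continue
            if id(child) in on_path:
                return path + [key]
            # Descend into the child and resume this iterator later.
            on_path.add(id(child))
            path.append(key)
            stack.append((child, iter(children(child))))
            break
        else:
            # All children visited: leave this container.
            stack.pop()
            on_path.discard(id(obj))
            if path:
                path.pop()
    return None
```

Objects that are merely shared between two branches are not reported, because a container is only flagged while it is still on the current path. The trade-off is that shared substructures are re-traversed, so this can be slow on large graphs.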

@tuckerbuchy

For what it's worth, I'm experiencing the same issue when dealing with a large number of GeoJSON-formatted dictionaries. I'm not sure whether that is the specific cause here, but I have started seeing the same error as in the original post.
