distributed upgrade from 2022.03.0 to 2024.2.0 has performance issues. #8646

Closed
yiershanxll opened this issue May 9, 2024 · 2 comments

yiershanxll commented May 9, 2024

Problem:
We ran the test 5 times, and each time the problem occurred at the 25-minute mark.
The following error message is displayed: "distributed.comm.core.CommClosedError: in <TLS (closed) Scheduler Broadcast local=tls://182.10.4.6:58090 remote=tls://182.10.2.6:18715>: Stream is closed".
This problem does not occur with the earlier version, dask==2022.03.0.
Although the task reports an error, the background worker keeps executing it properly until the calculation is complete.

Environment information:

  • Dask version: 2024.2.0
  • distributed version: 2024.2.0
  • pandas: 2.0.3
  • pyarrow: 14.0.1
  • Python version: 3.9.11
  • Operating System: suse12.5

Number of nodes: Two containers with 8 vCPUs and 16 GB of memory are deployed.
Number of workers: Two workers are started using the dask command. Each worker has five processes with one thread per process, and memory usage is limited to 90%. A total of 10 worker processes run in the background.
Distributed computing: We use the client.run method to submit tasks to each worker for processing. The input processed by each worker is a file. The processing uses pandas, not dask.dataframe. The output is also a file.

yiershanxll changed the title from "dask upgrade from 2022.03.0 to 2024.2.0 has performance issues." to "distributed upgrade from 2022.03.0 to 2024.2.0 has performance issues." on May 9, 2024
yiershanxll (Author) commented

The worker-ttl parameter in distributed.yaml needs to be set to null.
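
For reference, here is a minimal sketch of applying the same override from Python instead of editing distributed.yaml. It assumes that the full configuration key is distributed.scheduler.worker-ttl and that the code runs in the process that starts the scheduler. A scheduler launched from the command line reads its configuration at startup, so in that setup the YAML file is still the place to make the change. The helper name disable_worker_ttl is hypothetical.

```python
# Assumed full config key; it mirrors the nesting used in distributed.yaml:
#   distributed:
#     scheduler:
#       worker-ttl: null
WORKER_TTL_KEY = "distributed.scheduler.worker-ttl"


def disable_worker_ttl():
    """Set worker-ttl to None (YAML null) in this process's dask config."""
    # Imported here so that defining this helper does not itself require dask.
    import dask

    # Called as a plain function (not as a context manager), so the
    # override stays in effect for the rest of the process.
    dask.config.set({WORKER_TTL_KEY: None})
    return dask.config.get(WORKER_TTL_KEY)
```

Calling disable_worker_ttl() before the scheduler is created should leave the scheduler without a time-to-live check on worker heartbeats, which is the same effect as the null entry in the YAML file.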

fjetter (Member) commented May 21, 2024

Just driving by: Client.run is not necessarily meant for users to run their computations. It is mostly used for diagnostics, debugging, and occasionally for more exotic things. As the docs for Client.run already suggest, this function runs outside of the task scheduling system.

Users should instead use Client.submit to schedule individual functions.

You will also notice that with Client.run the dashboard does not actually work, and neither do many other features.
