Client() control scope of resources #6805

Open
pseudotensor opened this issue Nov 4, 2020 · 4 comments
Comments

pseudotensor commented Nov 4, 2020

Currently I cannot see a way to manage both GPUs and CPUs using dask "resources".

Suppose I have 2 machines, each with 1 GPU and 8 cores. I can set resource labels for GPU and CPU counts, but the dask_cudf/RAPIDS side requires 1 worker per GPU. For any CPU tasks I am then forced into 1 worker process for the entire node. I can set nthreads to the number of cores divided by the number of GPUs. For numpy/pandas/scipy operations that release the GIL that might be roughly OK, but as the docs say, this is not optimal when something does not release the GIL, since the extra threads then only help with I/O.

If I instead add extra workers just for CPU resources and keep other workers just for GPU resources, that does not work either. Packages like xgboost automatically consume all workers, because the only way to restrict resources currently is through .compute() or client.submit(), and the client itself apparently cannot be restricted. One hits dmlc/xgboost#6344 because of this. If I could call Client() with a resources= argument so that any use of that client would be limited to those resources, as .compute() and .submit() already allow per call, that would work. But AFAIK no such option exists, even though it would be more general than the current scheme of only supporting resources on .compute(), .submit(), etc.
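For concreteness, the per-call mechanism described above can be sketched as follows; a minimal example on an in-process LocalCluster, where the train function and the resource labels are illustrative rather than taken from the issue:

```python
# Sketch of the current per-call resource mechanism: resources can be
# attached to individual .submit()/.compute() calls, but not to the
# Client itself.
from dask.distributed import Client, LocalCluster

# One in-process worker advertising abstract "GPU" and "CPU" resources.
cluster = LocalCluster(n_workers=1, processes=False,
                       resources={"GPU": 1, "CPU": 8})
client = Client(cluster)

def train(x):
    return x * 2

# Resource restriction works here, one call at a time...
fut = client.submit(train, 21, resources={"GPU": 1})
result = fut.result()
print(result)  # 42

# ...but a library that calls client.submit internally (e.g. xgboost's
# dask integration) offers no hook to pass resources, which is the gap
# this issue describes.
client.close()
cluster.close()
```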

I could run 2 schedulers, one per resource type, but that defeats the purpose of scheduling and resource management. E.g. I know that xgboost on GPU uses the GPU efficiently and little CPU, so I could run CPU and GPU work roughly at the same time. But other packages like lightgbm use a lot of CPU even when running on GPU, so such a split of scheduling would drag the system to a crawl alongside other dask tasks.

Basically, the request is for Client() itself to accept a resources request that limits the scope of all dask tasks, not just the ones that take explicit resources like .compute() and client.submit(), since we often do not use dask in such a fine-grained way; with the xgboost package, for example, we have no such control.
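The requested behaviour can be approximated today with a thin wrapper. ResourceScopedClient below is a hypothetical name, not a real dask class, and it only helps for code paths that go through the wrapper; a library holding the underlying client (xgboost, say) still bypasses it, which is exactly why the issue asks for support in Client() itself.

```python
from dask.distributed import Client, LocalCluster

class ResourceScopedClient:
    """Hypothetical sketch: delegate to a dask Client, defaulting
    `resources=` on every submit()/compute() call."""
    def __init__(self, client, resources):
        self._client = client
        self._resources = resources

    def submit(self, func, *args, **kwargs):
        kwargs.setdefault("resources", self._resources)
        return self._client.submit(func, *args, **kwargs)

    def compute(self, *args, **kwargs):
        kwargs.setdefault("resources", self._resources)
        return self._client.compute(*args, **kwargs)

    def __getattr__(self, name):
        # Everything else falls through to the real client,
        # which is precisely where the restriction is lost.
        return getattr(self._client, name)

cluster = LocalCluster(n_workers=1, processes=False, resources={"CPU": 8})
scoped = ResourceScopedClient(Client(cluster), {"CPU": 4})
result = scoped.submit(lambda x: x + 1, 41).result()
print(result)  # 42
scoped.close()
cluster.close()
```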

pseudotensor (Author) commented

I'm curious whether one work-around is to use client.submit() with resource control and then do the dask operations inside that call. It seems a bit unwieldy, but it might work.
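This work-around can be sketched with distributed's worker_client(), which lets a task secede from its worker's thread pool and submit further tasks from inside the cluster. A minimal sketch on an in-process cluster; note that the inner tasks do not inherit the outer task's resource restriction, which is the unwieldy part.

```python
from dask.distributed import Client, LocalCluster, worker_client

def outer():
    # Runs on the worker holding the GPU resource; worker_client()
    # secedes from the thread pool while inner tasks are scheduled,
    # then rejoins.
    with worker_client() as inner:
        futs = inner.map(lambda x: x * x, range(4))
        return sum(inner.gather(futs))

cluster = LocalCluster(n_workers=1, processes=False, resources={"GPU": 1})
client = Client(cluster)
# Only the outer task is resource-restricted; the inner tasks it
# launches are not.
total = client.submit(outer, resources={"GPU": 1}).result()
print(total)  # 0 + 1 + 4 + 9 = 14
client.close()
cluster.close()
```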

sjperkins (Member) commented Nov 5, 2020

@pseudotensor This is probably a good use case for task annotations. We've made some progress towards adding them already. The remaining work is to transmit and unpack them on the distributed scheduler.


pseudotensor (Author) commented Nov 5, 2020

@sjperkins Thanks. It seems (to me, naively) that just providing resources to Client() is more straightforward for the user than extra contextual annotations. That is, one can already use the client as a context manager, and if one could pass priority/resources/etc. to Client(), isn't that enough? Why does there need to be yet another context manager?

I'll check out the docs you pointed to and see if I can understand. It's not clear to me what is available now vs. still in development.
