Dask support #278
Thanks for the contribution, that's great! I have not used dask myself yet, so I cannot provide precise feedback. Just a question: in dask, does each element take equal time to process, or ...
@lrq3000 well, in dask "tasks" are chunks, the count of which you usually specify when creating a dask dataframe or similar. Each chunk of a dask dataframe is a pandas dataframe, and we can have a callback only once per chunk. I don't think it's possible to get the count of individual rows or items in the general case, because at the low level dask is just a scheduler for various multiprocessing stuff. Of course, in general tasks can take very different times to process, but for the most common things, like applying functions to dataframes (I only use dask as a parallel pandas DataFrame.apply), they usually complete in about the same time.
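To illustrate the point about chunks being the unit of progress, here is a minimal sketch, assuming dask is installed. `TaskCounter` and `chunk_sum` are hypothetical names made up for this example; only `dask.delayed`, `dask.compute` and `dask.callbacks.Callback` come from dask itself.

```python
import dask
from dask.callbacks import Callback


class TaskCounter(Callback):
    """Hypothetical helper: counts how many tasks the scheduler finishes."""

    def __init__(self):
        self.finished = 0

    def _posttask(self, key, result, dsk, state, worker_id):
        # Called once per completed task (chunk), never per row or item.
        self.finished += 1


@dask.delayed
def chunk_sum(start, stop):
    # One task that processes a whole "chunk" of items.
    return sum(range(start, stop))


# Four chunks of 1000 items each, so four chunk-level tasks.
tasks = [chunk_sum(i, i + 1000) for i in range(0, 4000, 1000)]

counter = TaskCounter()
with counter:
    results = dask.compute(*tasks, scheduler="synchronous")

# The counter reflects tasks, not the 4000 individual items.
print(counter.finished)
```

So a progress bar driven by such a callback can only advance in chunk-sized steps, which is fine as long as the chunks take roughly equal time.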
Ok Alex, then I don't think it's possible to do better for now, you already ...
@lrq3000 I don't think monkeypatching makes sense here - the very basic progressbar built into dask is actually used the same way as I wrote above:
So, something like
Ok, thank you Alexander, indeed it's better to follow dask's API, we ...
So I guess I'll make a PR soon.
If you want to be credited as author in the commits then yes, feel free
This is very interesting. Has this been merged into a current release?
Not yet merged but it will be.
To be more precise: this will get merged after #198, and before that I'm waiting for someone to review some bugfixes and enhancements. So don't hold your breath: it will be merged eventually, before Christmas I think, but I'm not sure when.
politely bumps thread :)
Hey, wanted to follow up on this and show a dask distributed version for tqdm. This is based off the progress bars built here: https://github.com/dask/distributed/blob/master/distributed/diagnostics/progressbar.py
If you run in a notebook, you'll get the widget version.
@thomasaarholt Is there a way to support a custom description for this ProgressBar callback?
Sure! Just modify the
@thomasaarholt I meant like how you would pass a description as an argument to
I would like every context manager to have its own description; how can I modify the ProgressBar class to allow such functionality? For example
Ah, sure! I'm not able to check right now, but I reckon that if you add an __init__(self, desc) that sets self.desc = desc, and inside tqdm() set
try out #1079 (
I have been working with #1079 and have found it is not compatible with the most recent dask release, 2021.2.0.
File "c:\users\deschman\spyder-env\lib\site-packages\dask\base.py", line 281, in compute
I came back to this after improving a StackOverflow post asking the same. The following is a working solution for a dask progress bar based on tqdm:

    from dask.callbacks import Callback
    from tqdm.auto import tqdm

    class ProgressBar(Callback):
        def __init__(self, desc=""):
            self.desc = desc

        def _start_state(self, dsk, state):
            # Total = every task known to the scheduler at start-up.
            self._tqdm = tqdm(
                total=sum(len(state[k]) for k in ['ready', 'waiting', 'running', 'finished']),
                desc=self.desc,
            )

        def _posttask(self, key, result, dsk, state, worker_id):
            # Advance the bar by one for each completed task.
            self._tqdm.update(1)

        def _finish(self, dsk, state, errored):
            # Close the bar so it releases the terminal line.
            self._tqdm.close()

Use it as:

    with ProgressBar("your description"):
        arr.compute()  # your Dask computation here
Er, #1079 (viz. https://github.com/tqdm/tqdm#dask-integration & https://tqdm.github.io/docs/dask/) is probably better
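For reference, the linked docs describe the integration as `tqdm.dask.TqdmCallback`. A minimal sketch of using it, assuming dask and a tqdm release that ships `tqdm.dask` are installed; the `square` helper and the task list are made up for illustration:

```python
import dask
from tqdm.dask import TqdmCallback


@dask.delayed
def square(x):
    # Hypothetical stand-in for a real unit of work.
    return x * x


tasks = [square(i) for i in range(8)]

# Extra keyword arguments such as desc are forwarded to tqdm.
with TqdmCallback(desc="squares"):
    results = dask.compute(*tasks, scheduler="threads")

print(results)
```

The same callback can also be registered globally with `TqdmCallback(desc="global").register()` instead of being used as a context manager.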
Ah! Brilliant, I will use that instead!
how apply
Dask itself has a basic progressbar, but tqdm is certainly better - so I made a basic wrapper:
Usage (the same as dask progressbar):
Does it belong in tqdm, what do you think? Also, any further suggestions/improvements?