This is an unverified bug report, so it is really more startling behavior that I would like to understand.
Unfortunately, the reproducer for this is high energy physics analysis code, and I have not yet been able to create a concise reproducer for this behavior. High energy physics task graphs tend to be quite large (thousands of layers); I'm not sure if this is somehow related.
Steps for a "maximal" reproducer are here: https://gist.github.com/lgray/8a28c5fcd707a2a6778f92cd598f0ca6
I'll continue to try to find something minimal, but it takes time to distill from non-expert code.
You'll have to set up the calls to `dask.compute` yourself; I'm happy to help you.
In both cases I am using a local distributed client to perform the computation.
There are two major issues: first, the task graph in the `client.compute` case doesn't appear to be optimized, or is only partially optimized (the task counts differ by about 3k); second, the later calls that agglomerate histograms seem to stall for reasons I cannot deduce in the `client.compute` case. Along these lines, the `client.compute` case is nearly 2x slower than the `dask.compute` case, and its memory usage is 3x-4x that of the `dask.compute` case.
Of note: it seems that the Client thread itself is stalling when using `client.compute`, since the dashboard stalls when the histogram tree-reduce starts. This doesn't happen in the `dask.compute` case, which is truly odd.
If I optimize the task graph ahead of time and pass that to `client.compute`, the number of tasks run by `client.compute` makes sense, but the stalling issue and the corresponding compute slowdown are still there.
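For reference, the two submission paths being compared, and the ahead-of-time optimization step, look roughly like this. This is a minimal sketch with a trivial `dask.delayed` graph standing in for the real thousands-of-layers analysis graph; the `inc`/`add` functions and the threaded local client are illustrative assumptions, not the actual reproducer.

```python
# Sketch of the two submission paths compared in this issue.
# inc/add form a tiny stand-in graph for the real HEP analysis workload.
import dask
from distributed import Client

@dask.delayed
def inc(x):
    return x + 1

@dask.delayed
def add(x, y):
    return x + y

client = Client(processes=False)  # local distributed client, as in the report

total = add(inc(1), inc(2))

# Path 1: dask.compute runs graph optimization before submitting.
(r1,) = dask.compute(total)

# Path 2: client.compute submits the graph directly; calling
# dask.optimize first mimics the optimization dask.compute applies.
(opt,) = dask.optimize(total)
future = client.compute(opt)
r2 = future.result()

assert r1 == r2 == 5
client.close()
```

If the observed task counts only agree when `dask.optimize` is applied before `client.compute`, that points at the optimization step being skipped on the `client.compute` path, separate from whatever causes the stall.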
Here is the performance report with `client.compute`: wwz-dask-report-client-compute.html.zip

Here is the performance report on the same code and task graph for `dask.compute`: wwz-dask-report-dask-compute.html.zip

cc: @martindurant