dask.array.stack seems to grow quadratically and is very slow, but dask.delayed(np.stack) grows linearly? #7116

Answered by SteffenBauer
SteffenBauer asked this question in Q&A
Update: This pull request seems to have found the problem. It looks like it really was an inefficiency in the dask code:

#7402

I reran the code above, and now I get linear time when I use dask.array.stack:

Data size   5000: Data preparation time 0.14s, stack time 1.61s, compute time 1.20s
Data size  10000: Data preparation time 0.41s, stack time 3.03s, compute time 2.28s
Data size  20000: Data preparation time 0.52s, stack time 6.16s, compute time 5.73s
Data size  40000: Data preparation time 1.11s, stack time 13.32s, compute time 12.92s

It is still slower than the delayed code, but a significant improvement.
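As a rough sanity check on the "linear" reading, the timings above can be turned into doubling ratios and a log-log scaling exponent. This is only a sketch using the four measurements quoted in this post; the helper names `doubling_ratios` and `scaling_exponent` are made up for illustration and are not part of dask. A ratio near 2 per doubling of the data size (exponent near 1) suggests linear growth, while a ratio near 4 (exponent near 2) would suggest quadratic growth.

```python
import math

# Timings copied from the output above:
# (data size, stack time in seconds, compute time in seconds).
timings = [
    (5000, 1.61, 1.20),
    (10000, 3.03, 2.28),
    (20000, 6.16, 5.73),
    (40000, 13.32, 12.92),
]

def doubling_ratios(values):
    """Ratio of each measurement to the previous one (the size doubles each step)."""
    return [later / earlier for earlier, later in zip(values, values[1:])]

def scaling_exponent(sizes, values):
    """Slope between the first and last points on a log-log plot.

    Roughly 1 means linear growth, roughly 2 means quadratic growth.
    """
    return math.log(values[-1] / values[0]) / math.log(sizes[-1] / sizes[0])

sizes = [size for size, _, _ in timings]
stack_times = [stack for _, stack, _ in timings]
compute_times = [compute for _, _, compute in timings]

print("stack doubling ratios:  ", [round(r, 2) for r in doubling_ratios(stack_times)])
print("compute doubling ratios:", [round(r, 2) for r in doubling_ratios(compute_times)])
print("stack exponent:   %.2f" % scaling_exponent(sizes, stack_times))
print("compute exponent: %.2f" % scaling_exponent(sizes, compute_times))
```

For these numbers both exponents come out close to 1 and well below 2, which is consistent with linear scaling. With only four single-run measurements, this is an indication rather than a proof.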

Answer selected by SteffenBauer