-
-
Notifications
You must be signed in to change notification settings - Fork 5.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Speed up sparse.csgraph.dijkstra 2.0 #20717
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to get this moving again :) Thanks for the work @tsery-ns!
I've had a quick scan of it and it looks good. I won't approve it yet as I need to do a more thorough review first. I'm pretty busy with work at the moment, but I'll try to fit it in either over this weekend or the following one.
Hi @j-bowhay, I would like to, but I just need some time. |
@@ -419,9 +449,11 @@ def test_yen_undirected(): | |||
source=0, | |||
sink=3, | |||
K=4, | |||
directed=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's going on here? Is a bug being fixed/behavior being changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bug fixed. The test is meant to be "undirected" but the parameter was not passed properly.
72f0a32
to
4e4ca7f
Compare
Is there a build expert in the audiance who can help me mitigate the build warning:
@rgommers perhaps? |
scipy/sparse/csgraph/meson.build
Outdated
cython_gen_csgraph_for_cpp.process(pyx_file[1]), | ||
cpp_args: cython_cpp_args, | ||
include_directories: inc_np, | ||
dependencies: py3_dep, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
py3_dep
isn't needed. np_dep
is needed I assume - and that should take care of the build warning about deprecated API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for super-seeding this work, @tsery-ns!
I have limited time, but here is another pass. I think a few allocations can be removed; I believe the implementations can be changed so that a scalar placeholder value (like None
) can be passed instead of placeholders arrays whose unneeded allocations are costly.
If I were to have more time, I would look into the algorithmic and Cython details of it.
Side-note: scikit-learn sets the following Cython directives:
language_level=3
boundscheck=False
wraparound=False
initializedcheck=False
nonecheck=False
cdivision=True
profile=False
Would it be relevant for SciPy?
dist_matrix[0, source] = 0 | ||
int[:] predecessor_matrix = np.full((N), NULL_IDX, dtype=ITYPE) | ||
double[:] dist_matrix = np.full((N), np.inf, dtype=DTYPE) | ||
int[:] dummy_source_matrix = np.empty((0), dtype=ITYPE) # unused |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment: is it possible to avoid this allocation?
@@ -582,48 +584,56 @@ def dijkstra(csgraph, directed=True, indices=None, | |||
else: | |||
predecessor_matrix = np.empty((len(indices), N), dtype=ITYPE) | |||
predecessor_matrix.fill(NULL_IDX) | |||
source_matrix = np.empty((len(indices), 0), dtype=ITYPE) # unused |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would recommend against allocating this array if it is not required.
else: | ||
predecessor_matrix = np.empty((0, N), dtype=ITYPE) | ||
predecessor_matrix = np.empty((len(indices), 0), dtype=ITYPE) | ||
source_matrix = np.empty((len(indices), 0), dtype=ITYPE) # unused |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, I would recommend against allocating this array if it is not required.
dummy_double_array = np.empty(0, dtype=DTYPE) | ||
dummy_int_array = np.empty(0, dtype=ITYPE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to omit those unneeded allocations?
@@ -68,7 +70,7 @@ def shortest_path(csgraph, method='auto', | |||
Computational cost is approximately ``O[N^3]``. | |||
The input csgraph will be converted to a dense representation. | |||
|
|||
'D' -- Dijkstra's algorithm with Fibonacci heaps. | |||
'D' -- Dijkstra's algorithm with priority queue. | |||
Computational cost is approximately ``O[N(N*k + N*log(N))]``, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the complexity change in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It appears the complexity currently stated is wrong even before the change. Dijkstra's worst complexity without a priority queue is O(N^2)
. So the current complexity is definitely not O(N^2 * log(N))
The correct complexity should be O[(N*k + N)*log(N)]
(assuming N*k
is a logical way to compute the number of edges in the graph).
Should we state the known complexity of O[(E + N)*log(N)]
where E
is the number of edges?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would recommend compiling this file with code annotation to understand if those Cython implementations can be technically improved further with a few changes to the Cython directives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've run the annotation for you, here is the output file. Unfortunately, GitHub doesn't let you upload HTML files. I changed the file extension to .txt — you'll just have to change it back to .html again.
@jjerphan thank you for taking the time to review my PR.
I have implemented a change so these unused allocations are removed. However, that comes with a price. With the change, the inner methods are not completely 'Cythonic' (i.e., there are yellow lines in the annotation). This is not critical for performance, as the most inner method
The |
4e4ca7f
to
0a09ae4
Compare
Friendly reminder for reviewers :) |
I can't guarantee having some time soon for a second comprehensive review. 😕 |
There is one test failure left: https://github.com/scipy/scipy/actions/runs/9316113382/job/25643666763?pr=20717 |
I'll take a look over the weekend @tsery-ns. |
I require assistance with this issue: |
0a09ae4
to
60b0048
Compare
This comment was marked as off-topic.
This comment was marked as off-topic.
60b0048
to
e34bfe2
Compare
I've run the benchmarks locally and here are the results. On the current main branch:
On this branch:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like this!
The main issue I have at the moment are the unneeded allocations mentioned by @jjerphan.
There's also a lot of yellow in the Cython annotations. Although I don't think that needs to be fixed in this PR.
@da-woods : I think we need your expertise to understand the remaining Cython compilation error with CPython's free threaded build. |
This comment has been minimized.
This comment has been minimized.
Just to confirm, that's:
???? I'd suspect that's a bug in the current Cython master vs whatever Cython release you're using for the rest of the builds. I doubt it's related to free-threading particularly. I'll try to have a more detailed look later. |
Right... got to the bottom of your free-threading build issue... It is indeed a slight behaviour change from 3.0 -> master. Cython has some slightly different behaviour depending on if it believes it's compiling in C++ mode or in C mode. Specifically to do with having multiple definitions of the same function. This is controlled by the flag The upshot is that Cython believes its operating in C mode so doesn't apply the extra C++ rules. On the Cython side I'm going to make a small modification so that it always treats From your side I think you should be passing the |
but add some warnings for people who are using them outside C++ mode (since some features, especially free-functions likely won't work as they expect). Related to scipy issue scipy/scipy#20717 and change in cython#3235.
but add some warnings for people who are using them outside C++ mode (since some features, especially free-functions likely won't work as they expect). Related to scipy issue scipy/scipy#20717 and change in cython#3235.
Thank you everyone for helping out with the build issue. I will sort it out, together with the conflicts, in the next couple of days.
As I described in an earlier comment, it is possible to avoid the allocation with a commit I currently left out of the PR. Including this commit introduces more yellow lines in the annotation, but not on the heavy loop of the algorithm.
Once this PR is merged, I intend to work on adding benchmarks and optimizations to this module. I have a few ideas in mind, working on yellow annotations is one of them. |
…is slow due to a bug with std::priority_queue
…s.test_shortest_path
Remove trailing spaces, unused variables and imports, change reference of Fibonacci heap to priority queue
Add missing spaces between tests.
e34bfe2
to
9406182
Compare
Introduction
Rebase, reorganization, and code review of PR 17019.
Original description
Another pull request to speed up sparse.csgraph.dijkstra, succeeding #16696
Thank you @jjerphan for reviewing the previous pull request and giving me advice
( See benchmarks in PR 17019).
Changes from original PR