get_coordinates(point) is 4x faster than (point.x, point.y) #1995
Comments
Yeah, this has been on my mind for a while. We use shapely a lot for work, and when we do our optimization passes, we always go through and just do x = point.x and then use x whenever we need the value. I wasn't aware of get_coordinates(). I do struggle to understand why a calculation needs to be done at all for x and y and why they aren't just stored. Perhaps I'm missing something basic, but as far as I can think of, operations aren't done in place, so I don't fully get why you would need a calculation. That said, I haven't dug into the GEOS library to see. At the very least this should be noted in the docs, as it can cause a significant slowdown in codebases.
Given what shapely is doing here is calling respective GEOS functions here, the question boils down to why Also note that the comparison above is not precise as doing
FWIW, it is mostly the ufunc machinery that adds a (relatively speaking) significant overhead when called for a single geometry object (and See #1021 for a general issue about the ufunc overhead and performance of scalar methods/attributes I quickly tried what it would take to specifically optimize the Point coord access attributes, see #2035, which gets the |
Wow, that's quite the speedup. I don't fully understand how to use the ufunc functionality yet (I plan to learn it), but to me that seems like a very good improvement. Does that change make it so that it is no longer possible to use ufuncs for .x?
One thing I did notice while trying to understand the codebase (I'm still a long way from that) is that getX() in Point.cpp in GEOS checks isEmpty(). However, getCoordinate() also checks isEmpty(). I tried to set up a custom version of GEOS to test it, but I'm doing something wrong and my Python instance ends up not being able to find shapely.lib, so I need to try some more. But I would think that removing one of those isEmpty() checks should also give a nice speedup.
Edit: wow, I just experimented with using ufuncs and it's a game changer. I learned about STRtree a few months ago and that made a huge time difference for me; this is going to be another one. Thank you and all of the other maintainers for your great work on this package.
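For readers who have not used the vectorized functions mentioned here, a minimal sketch of the difference between per-point attribute access and a single array-level call follows. It assumes Shapely 2.x and NumPy; the random data and the variable names are illustrative only and do not come from the thread.

```python
import numpy as np
import shapely

# Hypothetical data: 1,000 points built from random coordinates.
rng = np.random.default_rng(0)
xy = rng.random((1000, 2))
points = shapely.points(xy)  # NumPy array of Point geometries

# Scalar pattern: one attribute access per point, inside a Python loop.
loop_coords = np.array([(p.x, p.y) for p in points])

# Vectorized pattern: a single call over the whole array of geometries.
vec_coords = shapely.get_coordinates(points)

# Both patterns produce the same (N, 2) array of coordinates.
assert vec_coords.shape == (1000, 2)
assert np.array_equal(loop_coords, vec_coords)

# Array-level accessors for a single axis also exist.
xs = shapely.get_x(points)
assert np.array_equal(xs, vec_coords[:, 0])
```

The vectorized call pays the per-call overhead once for the whole array instead of once per point, which is presumably where most of the gain comes from when many points are involved.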
No, the change is separate from the ufuncs, it just adds a specialized implementation for just the
My guess is that this |
At this moment, this is not a bug report nor a feature request. I primarily want to highlight the fact that `get_coordinates(point)` is approximately 4x faster than using `point.x, point.y`.
I believe this matters substantially as the `point.x, point.y` pattern is very popular for retrieving Python float numbers for a given point. It is also quite often present in hot loops. A search on GitHub for 'shapely ("point.x, point.y" OR "p.x, p.y")' returns 1.5k code results, many of which are inside of for loops.
Benchmark
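A comparison along these lines could be set up as in the sketch below. It is only an illustration, assuming Shapely 2.x: the helper names, the sample point and the iteration count are hypothetical, and the timings it prints will vary with the machine and the Shapely and GEOS versions.

```python
import timeit

import shapely
from shapely.geometry import Point

point = Point(1.5, -2.5)  # arbitrary sample point


def via_attributes():
    # Two separate attribute lookups, each returning a Python float.
    return point.x, point.y


def via_get_coordinates():
    # One call returning a NumPy array of shape (1, 2); take the single row.
    x, y = shapely.get_coordinates(point)[0].tolist()
    return x, y


# Both approaches should return the same values.
assert via_attributes() == via_get_coordinates()

n = 20_000  # illustrative iteration count
t_attr = timeit.timeit(via_attributes, number=n)
t_coords = timeit.timeit(via_get_coordinates, number=n)

print(f"point.x, point.y:       {t_attr / n * 1e6:.2f} us per call")
print(f"get_coordinates(point): {t_coords / n * 1e6:.2f} us per call")
```

Converting the row with `.tolist()` keeps the comparison fair, since both functions then return plain Python floats.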