PERF: optimize == and != for GeoSeries (GeometryArray.__eq__) #3257

Open
jorisvandenbossche opened this issue Apr 18, 2024 · 0 comments
jorisvandenbossche commented Apr 18, 2024

The (in)equality for GeometryArray is defined here:

geopandas/geopandas/array.py

Lines 1656 to 1667 in 167c061

        # If the operator is not defined for the underlying objects,
        # a TypeError should be raised
        res = [op(a, b) for (a, b) in zip(lvalues, rvalues)]

        res = np.asarray(res, dtype=bool)
        return res

    def __eq__(self, other):
        return self._binop(other, operator.eq)

    def __ne__(self, other):
        return self._binop(other, operator.ne)

So this essentially still uses a Python for loop over scalar shapely geometries, instead of one of shapely's vectorized ufuncs.

Now, if we want to keep this consistent with shapely's scalar geometry __eq__, there is currently no direct ufunc equivalent (shapely switched to equals_exact in 2.0.0, but that was reverted because it ignores the z dimension, shapely/shapely#1732).
But a future shapely 2.1.0 will have a ufunc that is equivalent to the scalar __eq__, namely equals_identical (shapely/shapely#1760). Once that is available, we should update our code here to use that ufunc instead of the slow Python loop.
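
To illustrate the idea, here is a rough sketch of what the dispatch could look like. The helper name geometry_eq is hypothetical and not part of geopandas; it assumes shapely >= 2.0 and equal-length inputs, and it does not try to reproduce how _binop handles missing values or index alignment.

```python
import shapely
from shapely import Point


def geometry_eq(left, right):
    """Element-wise geometry equality (hypothetical helper, not geopandas API)."""
    # Use the vectorized ufunc when the installed shapely provides it
    # (expected from shapely 2.1.0 on).
    if hasattr(shapely, "equals_identical"):
        return shapely.equals_identical(left, right)
    # Otherwise fall back to the scalar __eq__ in a Python loop,
    # mirroring what _binop does today.
    import numpy as np

    return np.asarray([a == b for a, b in zip(left, right)], dtype=bool)


a = [Point(1, 2), Point(1, 2)]
b = [Point(1, 2), Point(1, 3)]
result = geometry_eq(a, b)
```

__ne__ could then be the negation of that result (~result), assuming missing values are handled separately.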
