Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: add copy=None to CoordinateSequence.__array__ #2053

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

mwtoews
Copy link
Member

@mwtoews mwtoews commented May 14, 2024

Recent CI failures have picked up a recent NumPy 2.0 change, noted in the migration guide under "Adapting to changes in the copy keyword".

A simple fix appears to be to simply add copy=None to the __array__ signature. This value is not used by the method.


Here is an example failure:

________________________ TestCoords.test_data_promotion ________________________

self = <shapely.tests.geometry.test_coords.TestCoords object at 0x7f6bbc72fb90>

    def test_data_promotion(self):
        coords = np.array([[12, 34], [56, 78]], dtype=np.float32)
>       processed_coords = np.array(LineString(coords).coords)
E       DeprecationWarning: __array__ implementation doesn't accept a copy keyword, so passing copy=False failed. __array__ must implement 'dtype' and 'copy' keyword arguments.

shapely/tests/geometry/test_coords.py:16: DeprecationWarning

Note that test failures with macos in this PR are unrelated.

@jorisvandenbossche
Copy link
Member

Thanks! I had noted this as a TODO item in #1972, but didn't yet get to it (and didn't expect our CI to fail for it, but so this is a good reminder ;))

Ideally we would honor the copy keyword if passed. I think in our case, this just means raising an error if copy is False, because the coordinates we return are always a copy?
But this is less critical, and could also be done later in a follow-up.

@mwtoews
Copy link
Member Author

mwtoews commented May 14, 2024

I had a play with various copy=True/False/None versions, and it all seems minor impact, mostly because CoordinateSequence objects aren't really used as separate objects. Each .coords property access creates a new sequence object by default.

A simple implementation would be:

    def __array__(self, dtype=None, copy=None):
        if copy:
            return self._coords.copy()
        return self._coords

then tested (somewhere):

def test_coords_array():
    coord_seq = point.coords
    assert np.array(coord_seq) is not np.array(coord_seq)
    assert np.array(coord_seq, copy=False) is np.array(coord_seq, copy=False)
    assert np.array(coord_seq, copy=True) is not np.array(coord_seq, copy=True)

@coveralls
Copy link

coveralls commented May 14, 2024

Pull Request Test Coverage Report for Build 9087673717

Details

  • 6 of 6 (100.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall first build on numpy-copy at 88.138%

Totals Coverage Status
Change from base Build 9076561797: 88.1%
Covered Lines: 2623
Relevant Lines: 2976

💛 - Coveralls

@jorisvandenbossche
Copy link
Member

Hmm, yes, it depends on how you interpret the CoordinateSequence object. Is that just a wrapper around an already materialized numpy array self._coords (how it is implemented now in practice) or is it a logical abstraction around the geometry's coord seq (how it was actually also implemented in shapely 1.x)?

Because in the former case, you can indeed say that np.array(geom.coords, copy=False) is fine because getting the .coords object in geom.coords itself has already made a copy, and so the subsequent conversion to numpy does not require a copy.
But in the latter case you could argue that np.array(geom.coords, copy=False) should raise an error because it can never the the geom.coords without involving a copy of the actual values (like, np.array(geom.coords, copy=False) is np.array(geom.coords, copy=False) will never be true).

Now, since this CoordinateSequence class is a separate object from the Geometry objects, it might be that the former interpretation is the only viable one.

@mwtoews mwtoews changed the title BUG: add copy=None to CoordinateSequence.__array__ for NumPy 2 ENH: add copy=None to CoordinateSequence.__array__ May 14, 2024
@mwtoews
Copy link
Member Author

mwtoews commented May 14, 2024

From The __array__() method docs "where copy=False should raise an exception if a copy is needed". And the basic dispatch example raises ValueError for this case. This PR is now updated, renamed as an enhancement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants