Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC:Add proper type annotations and consolidate documentation into top level functions #1687

Draft
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

idanmiara
Copy link
Contributor

@idanmiara idanmiara commented Dec 24, 2022

Discussion: #721

This PR will be broken down into multiple pieces, first piece is #1741

Closes #1683
Closes #1688
Related to #1673
Related to #1316

This PR is adding proper type annotations for almost all of shapely functions (including the vectorized functions).
And consolidating documentation into top level functions for all relevant functions and replacing duplicated documentation in reference to the relevant functions.

I think this is a very big step in direction of better documentation of Shapely 2.

This is a huge PR, and I could break it down to pieces if required 😵‍💫 upon initial review of the general ideas and as per your suggestions and comments.

@jorisvandenbossche @caspervdw
I'd really appreciate your review.

@idanmiara idanmiara changed the title Add proper type annotations for array like + usage examples Add proper type annotations for EVERYTHING Dec 26, 2022
@idan-miara idan-miara force-pushed the typing branch 7 times, most recently from 17e9cb3 to 3451bf6 Compare January 1, 2023 11:38
@idanmiara idanmiara changed the title Add proper type annotations for EVERYTHING Add proper type annotations and consolidate documentation into top level functions Jan 1, 2023
@idanmiara idanmiara changed the title Add proper type annotations and consolidate documentation into top level functions DOC:Add proper type annotations and consolidate documentation into top level functions Jan 2, 2023
@idanmiara
Copy link
Contributor Author

@jorisvandenbossche any comments on this PR please?

@brendan-ward
Copy link
Collaborator

Thank you for your work on this @idanmiara and apologies for the slow review of PRs. Your effort toward improving Shapely is appreciated!

Adding type annotations is a big addition that needs to be weighed carefully and needs buy-in from several of the Shapely maintainers, because this adds to the overall maintenance burden of the project as well as potentially making it harder for new contributors, especially for those of us that don't use them on a regular basis. This is big enough that it probably warrants taking it up first as a discussion in order to get buy-in, rather than via PR review here.

Even without a close review of the changes here, I think this needs to be split up a bit differently into smaller, more focused PRs. Specifically, documentation changes that are unrelated to type annotations should be addressed in a separate PR; those should theoretically be easier to review / merge. I'd recommend focusing on the documentation changes first; I'm hoping those are easy to copy / paste out of here. Apologies that this involves extra work for you.

In general, small & focused PRs are more manageable to review given limited resources; big & complex PRs are very challenging to review and often take more time to review than folks have available - which means that they don't get reviewed at all for long periods of time. I can't speak for other Shapely contributors here, but my experience has been that if I can't review a PR within a narrow window of time that I have available, I often can't get to it at all in a reasonable amount of time - which is surely frustrating to folks that are actively trying to make the project better like yourself.

@idanmiara
Copy link
Contributor Author

idanmiara commented Jan 18, 2023

Hi @brendan-ward,
Thanks for your comments!
I think that type annotations make using the library MUCH easier for both new and existing users - that's why I was willing to spend so much time adding them.
Type annotations make it easier to understand what are the appropriate inputs and outputs, both programmers and for IDEs and automated tools.
https://realpython.com/lessons/pros-and-cons-type-hints/

In this PR I've added type annotation to practically all of shapely functions.
Although I don't think it would be hard for contributors to follow the convention and add type annotations to new functions - we don't need to make it mandatory.

@coveralls
Copy link

coveralls commented Jan 27, 2023

Pull Request Test Coverage Report for Build 4026491040

  • 431 of 450 (95.78%) changed or added relevant lines in 32 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.8%) to 87.525%

Changes Missing Coverage Covered Lines Changed/Added Lines %
shapely/_geometry.py 25 26 96.15%
shapely/affinity.py 10 11 90.91%
shapely/algorithms/cga.py 4 5 80.0%
shapely/creation.py 16 17 94.12%
shapely/geometry/base.py 62 63 98.41%
shapely/geometry/geo.py 12 13 92.31%
shapely/io.py 7 8 87.5%
shapely/linear.py 15 16 93.75%
shapely/measurement.py 12 13 92.31%
shapely/set_operations.py 12 13 92.31%
Totals Coverage Status
Change from base Build 4026134479: 0.8%
Covered Lines: 2617
Relevant Lines: 2990

💛 - Coveralls

@jorisvandenbossche
Copy link
Member

As @brendan-ward mentioned, I think it would be good to first have some discussion about type annotations in an issue, to see how the other maintainers think about it (I personally also have reservations, but won't object).
Previous discussion about this: #721

And also if we are adding type annotation, I would highly recommend starting with annotating a small part so that it is easier to review (I expect that the first PRs will generate quite some back and forth about the exact form of the type annotations that we want, and once that is somewhat agreed upon, subsequent PRs expanding the coverage should go much smoother).

@jorisvandenbossche
Copy link
Member

FWIW I personally also prefer that we keep annotation syntax just for annotations, and not start using it in docstrings (IMO a docstring with semi-fluent text like "Geometry or array-like of geometries" reads much easier and is more understandable than "MaybeGeometryArrayNLike")

@idanmiara
Copy link
Contributor Author

idanmiara commented Jan 27, 2023

As @brendan-ward mentioned, I think it would be good to first have some discussion about type annotations in an issue, to see how the other maintainers think about it (I personally also have reservations, but won't object). Previous discussion about this: #721

And also if we are adding type annotation, I would highly recommend starting with annotating a small part so that it is easier to review (I expect that the first PRs will generate quite some back and forth about the exact form of the type annotations that we want, and once that is somewhat agreed upon, subsequent PRs expanding the coverage should go much smoother).

@jorisvandenbossche thanks for this pointer!
Would you want to reopen #721?
I don't mind making disecting this PR - would you like to propose which section do you think would be the best to discuss first?
This PR is a combination of two efforts :annotations + documentation consolidation.
I'm not sure what would be a good way to split.

@jorisvandenbossche
Copy link
Member

This PR is a combination of two efforts :annotations + documentation consolidation.
I'm not sure what would be a good way to split.

I think annotations and documentation changes are completely orthogonal? (assuming we don't use annotation-like syntax in the docs)
So that is an obvious split I would say.

would you like to propose which section do you think would be the best to discuss first?

I would start with a single file, and base.py might be the easiest to start with (and also the one with potentially most value)

@idanmiara
Copy link
Contributor Author

idanmiara commented Jan 27, 2023

would you like to propose which section do you think would be the best to discuss first?

I would start with a single file, and base.py might be the easiest to start with (and also the one with potentially most value)

Sure, no problem.

I think annotations and documentation changes are completely orthogonal? (assuming we don't use annotation-like syntax in the docs) So that is an obvious split I would say.

Agree, but there would be many conflicts to solve if we handle both issues concurrently (editing sections close to each other).
Thus I'm thinking maybe to do the doc consolidation effort first while we discuss the annotations on a small scale.

How would you advise to split the consolidation PR?
I'm thinking, once #1673 and #1231 are merged I'd apply the same logic for doc consolidation all the functions and split it into a few PRs - I'm was thinking to split it maybe like this:

  • all the small fixes in a single PR
  • some bigger changes like snap each in a separate PR
  • then maybe a PR per module

@jorisvandenbossche - What do you think?

Although this PR is not directly related to #1682, the edits are also close to each other and also the doctests consolidation work is effected by #1711, so it would make my work much easier if we could solve that issue first, so if you are thinking we could close that one soon then I'll wait for it. Thanks!

@sgillies
Copy link
Contributor

@idanmiara I don't understand why you take on very big new efforts in this project instead of working on the many smaller issues? We have 100+ open issues in this project.

@idanmiara
Copy link
Contributor Author

@idanmiara I don't understand why you take on very big new efforts in this project instead of working on the many smaller issues? We have 100+ open issues in this project.

Hi @sgillies,
Thanks for asking 😄 here is a my non-exhaustive list of reasons that came to my mind:

... instead of working on the many smaller issues? We have 100+ open issues in this project.

Many times I take smaller existing issues and contribute for the following reasons:

  1. Reviewing to solve existing PRs to help other contributors: ENH: add normalize keyword to equals_exact() to normalize input geometries #1231 Add perspective (homography) transform #1448 ENH: Add constrained_delaunay_triangles function #1685. [I believe that we need more contributors]
    (Sustainable Shapely support #1702) and I believe that reviewing PRs help bring in new contributors.
  2. Issues that I think that solving them would make it easier for others to contribute because things would be more
    organized, like TST: Move all remaining /tests to /shapely/tests #1681
  3. Issues that fail tests and make it seems like there is a bug when there isn't, like TST: skip failing test on GEOS master #1715
  4. For the sake on learning, helping others or giving back to the community, like BUG: geometry constructors now support Decimal inputs #1720 - I couldn't care less about supporting inputdecimals 😆 , but I understood it does matter to for other people so to mark it as fix in 2.0.1.
  5. Smaller Issues that effect others even if might not effect me personally, like DOC: 2.x.rst - Add API REFERENCE link #1721 BUG: fix directed keyword in shapely.ops.linemerge #1695, BUG: solve minimum_rotated_rectangle is incorrect in 2.0.0 #1708
  6. Issues that effect my workflow significantly, bug me personally or slow me down when working on other issues that do matter, like DEV: .gitignore - add *.dylib, *.dll to ignore binaries from MacOS and Windows #1703, BUG: enum.py renamed to _enum.py to avoid name clash with the built in enum module #1694, CLN: use absolute imports throughout the codebase #1682
  7. Issues that effect performance of critical parts of my code, like ENH:consolidate shapely.ops.transform (+speedup) with shapely.transform #1676 ENH:base.py - add coords_array and xy_array properties #1722

I don't understand why you take on very big new efforts...

  1. As for this specific issue - when I want to learn a new library I often read the documentation or go over the code. If the documentation is fragmented or hard to understand I am trying to fix parts that I was able to figure out so it would be easier for others and also for myself for future reference, this helps me understand better how things work. That was also the case with DOC: snap: merge documentation and examples from ops and constructive #1673. Then I saw that the fragmented documentation is all over because of the merge of pygeos into shapely so I wanted to consolidate the documentation so it would be easier for me to find the right documentation when I need it. Sometimes it worth for me to do it in a large scale all at once if I believe I would need to read and use large parts of the library - so I went on a quest to consolidate the documentation as I also believe this would help others to use the shapely better and more easily.
  2. But I'm guessing that you were referring to the type annotations major effort: for me they play a MAJOR part in making the documentation readable: Type annotations make it easier to understand what are the appropriate inputs and outputs, both programmers and for IDEs and automated tools. For instance, I am working on performance boost for some internal code, thus I need to understand better what are all the options for proper inputs and outputs for the different functions (Geometry or array_like proper type annotation ? #1683). This also help to find bugs in internal code as the IDE and automate tools points point out if the inputs/outputs for a function are wrong.

So to summarize - for me - this issue is one of the most important one to solve (right after solving regressions), and worths my time spending on solving it.
I'm positive that this effort would make it MUCH easier other users (and especially new users and contributors) to make the best out of shapely.
I'm certainly onboard helping with your concerns about them.

@idanmiara idanmiara marked this pull request as draft January 29, 2023 20:02
@HaraldTR
Copy link

This is fantastic work imo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
7 participants