Adding Type Hints to Scrapy and its Modules #4041
Related to #3618 |
@elacuesta yes, you are right: variable annotations were introduced in PEP 526; only function annotations were introduced in PEP 484. I was in a bit of a hurry when looking up the PEP number. |
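To make the distinction concrete, here is a minimal sketch. The names `DEFAULT_DEPTH`, `normalize` and `Stats` are made up for illustration and are not Scrapy APIs.

```python
from typing import Dict, Optional

# PEP 526 variable annotation: needs Python 3.6+.
DEFAULT_DEPTH: int = 3


# PEP 484 type hints on a function signature: usable on Python 3.5+.
def normalize(url: str, encoding: Optional[str] = None) -> str:
    # "encoding" is unused here; it only illustrates an optional parameter.
    return url.strip().lower()


class Stats:
    # PEP 526 also covers class-level attribute annotations.
    counters: Dict[str, int] = {}
```

On Python 3.5 the two annotated assignments would be syntax errors, while the annotated function would still be accepted.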
Based on mypy docs it seems that it's possible to provide the type hints as separate stub files, which shouldn't interfere with older versions of python that don't support them. Depending on the timeline for scrapy 2.0, this might be a way to fix this issue earlier... |
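As a rough sketch of the stub approach, assume an untyped module `fetcher.py` with a `Fetcher` class (both names are hypothetical, not part of Scrapy). A stub file `fetcher.pyi` placed next to it could look like this:

```python
# Hypothetical stub file "fetcher.pyi", sitting next to an untyped "fetcher.py".
# Type checkers read the stub; the interpreter does not import it, so the
# real module can stay free of annotation syntax.
from typing import Dict, Optional


class Fetcher:
    retries: int

    # Bodies and default values are elided with "..." in stubs.
    def __init__(self, retries: int = ...) -> None: ...
    def fetch(self, url: str, headers: Optional[Dict[str, str]] = ...) -> bytes: ...
```

The cost of this design is that signatures live in two places, so the stub can drift from the implementation if nobody checks it.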
There is also a comment-based syntax which works in 2.7. Anyway, I think it'd be great to test the waters on Scrapy dependencies first: w3lib, parsel. See e.g. scrapy/w3lib#123. Python 3 syntax is nicer though, so I'd prefer to go with Python 3-only syntax first (3.5+, not 3.6+; for variable annotations one can use comment-based annotations). |
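For reference, the comment-based form looks roughly like this. `extract_links` is a toy helper invented for this sketch, not a w3lib or parsel function; on Python 2.7 the `typing` import would need the backport package.

```python
from typing import List, Optional


def extract_links(html, base_url=None):
    # type: (str, Optional[str]) -> List[str]
    """Naively collect href values (illustration only, not a real parser)."""
    links = []  # type: List[str]  # comment-based variable annotation
    for chunk in html.split('href="')[1:]:
        link = chunk.split('"', 1)[0]
        links.append((base_url or "") + link)
    return links
```

The signature comment has to sit directly under the `def` line, before the docstring, for type checkers to pick it up.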
Considering there are already many thousands of lines of code in Scrapy, MonkeyType may be a good choice for automatically adding type hints to existing code:
Before each type is added manually, I think this can be of some help. |
I have tested MonkeyType and reached the following conclusions:
...
from unittest.mock import MagicMock
...
class RobotsTxtMiddleware(object):
    DOWNLOAD_PRIORITY = 1000
    def __init__(self, crawler: Union[Crawler, MagicMock]) -> None:
        ...
...
from scrapy.http import Request # original imports
...
from scrapy.http.request import Request # added by MonkeyType
...
It seems MonkeyType can help with type hints to some extent, but manually checking the result is still necessary. |
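A sketch of the kind of manual cleanup this implies, using a stand-in `Crawler` class so the snippet runs without Scrapy installed (the real class lives in Scrapy and has a different constructor; the middleware body here is also simplified):

```python
from unittest.mock import MagicMock


class Crawler:
    """Stand-in for Scrapy's Crawler, only for this illustration."""

    def __init__(self, settings: dict) -> None:
        self.settings = settings


class RobotsTxtMiddleware:
    DOWNLOAD_PRIORITY = 1000

    # MonkeyType recorded Union[Crawler, MagicMock] because the unit tests
    # pass a mock. The hand-edited signature keeps only the real type.
    def __init__(self, crawler: Crawler) -> None:
        self.crawler = crawler


# Annotations are not enforced at runtime, so tests can keep passing a mock.
mw = RobotsTxtMiddleware(MagicMock(spec=Crawler))
```

The same review pass is where a duplicated import such as the second `Request` line would be dropped.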
@grammy-jiang it sounds similar to mypy's |
I just came across pyannotate which does essentially the same job as |
My 2c: I'd prefer high-quality type coverage of a small user-facing part of the Scrapy, over extensive automatically generated type hints, some of them incorrect. |
@laurentS Yes, MonkeyType just creates stub files first; then you can choose to add type hints to the code manually ( @kmike if you want high-quality type hints, I am afraid adding them manually is the only way at the moment. Considering the large amount of code we already have in this project, it may take a very long time to add type hints to all modules. Maybe splitting the work into several stages and implementing them one by one is a good idea. |
This commit adds typing hint to httpcache downloadermiddlewares as a test for scrapy#4041
The pytest plugin pytest-mypy-plugins might help write tests to check that new type annotations are correct. This blog post explains how to use it. It was written to help annotate Django, so it should be relatively solid. |
Well, I think pydantic would be useful if type hints are added to Scrapy in future versions. It could help validate types and also make it easier to read configuration files and environment variables. With these in place, the code will be easier to understand. For example, |
@grammy-jiang Hey, is there any plan to add type hints to the rest of the modules of Scrapy? I am trying to understand some parts of the project and I think type hints would be very helpful. If possible, I would like to add some type hints and create a PR for them. |
Yes, we plan to do it, especially for those parts of the API that users interact with. Feel free to create a pull request to add type hints to some parts of the code. |
Hi @ChihweiLHBird, you can go through the previous posts in this discussion, and also this PR: I haven't had time to work on this. Your work on adding type hints would be much appreciated. Some things you may need to note:
|
Having used pydantic before, I'd be keen on seeing it in scrapy. |
I ran MonkeyType through the unit tests across the scrapy codebase and saved the changes of all slix@2f22b84 <-- The result is a decent starting point for typing Scrapy. Contributors wouldn't have to run monkeytype and collect types from unit tests themselves. MonkeyType's output is too messy to commit directly. So each My output doesn't include An alternative to MonkeyType is pytype, which does static analysis. I now wonder whether that would produce better results.
Should any typing efforts be focused on the APIs shown in the Scrapy documentation instead of on internal code? Should methods starting in underscores be left untyped? |
Yes.
No, but they are indeed on the low end of type-hinting priorities. |
This is work in progress with its own |
Summary
We should add variable annotations / type hints, as supported by PEP 526 (Python 3.6), to Scrapy to help existing and new contributors and developers understand the Scrapy code.
Motivation
IntelliSense-enabled IDEs like PyCharm need type hints to provide a better experience.
For new contributors to understand Scrapy comprehensively, type hints are vital.
Consider someone not that familiar with Scrapy stumbling upon the scheduler's constructor.