New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Investigate simplifying static checks for initialized Crawler #6136
Comments
Hi! @yueric985 and I are students are we are working on a project for a class that requires us to make an open-source contribution to a library. We will be working on this issue as an example. |
Hi! Can I work on this? |
Sure.
This means doing fewer asserts. |
Question about the issue. I understand that several components of Crawler are initialized later. However, is it ensured that these components such as self.crawler.stats is always initialized before calling a method such as self.crawler.stats.inc_value()? If so, would doing something like self.stats = cast(StatsCollector, self.stats) be a valid solution? This would tell mypy to treat self.stats as an ainstance of StatsCollector for the rest of the method so we wouldn't have to use an assert statement before accessing self.stats. |
I don't think this code does anything? Also, where would you put it?
Oh, you mean in each method? That's the same amount of code as an assert. |
Yes, I see the issue. I have another proposed solution to use a proxy object for crawler.stats that initializes the actual stats object the first time it is accessed. This way, we would not need to use assrts or if statements to check if stats is initialized. `class StatsProxy:
class Crawler:
StatsProxy intercepts all attribute accesses to crawler.stats and initializes stats the first time it's accessed. This way we can be sure that stats is always initialized before use without using asserts. I have a sample initialize_stats() method that I haven't filled out. Please let me know if this make sense and I'd appreciate feedback and your thoughts on this solution. |
That's the problem, it shouldn't be initialized on access, it's initialized by a separate method and the question of this ticket is how can we tell the checkers that. |
After we changed the Crawler logic to initialize components such as
Crawler.stats
later, static checkers (specifically mypy) don't know that when code likeself.crawler.stats.inc_value()
is called,self.crawler.stats
is already initialized, so we need to add asserts in all methods that do this. We should try to find a way to do the same with fewer asserts. This also applies to 3rd-party code (middlewares etc.).The text was updated successfully, but these errors were encountered: