Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set up Mypy and gradual static typing #858

Closed
wants to merge 10 commits into from

Conversation

Zac-HD
Copy link
Member

@Zac-HD Zac-HD commented Sep 14, 2017

This is a starting point for #200, focusing on static typing as another check in our own tests. This has already caught a bug (#943), and found a module that was no longer used at all (hypothesis.internal.classpath).

In this pull, I aim to:

  • Set up Mypy checking as a make task and add it to our Travis config
  • Make the build pass with default settings
  • Enable check_untyped_defs for all modules and make the build pass again
  • Split the workarounds for Mypy bugs into separate commits for easy reverting later

Further work after that could go in several directions based on what we feel is worth doing - from going as static as possible and locking that down, to the minimal route of leaving it as a glorified linter. This is a discussion I'd love to have in the comments!

Finally, a few words on tooling choices. I've settled on using type comments - stub files are really an all-or-nothing option, while supporting Python 2 means we can't use inline annotations. While I would love to ship type hints to users - and use them from the other side - the packaging story is being worked on but remains terrible. I suggest we simply use type hints internally and wait for the ecosystem to mature, at which point we'll hopefully have something worth distributing.

@Zac-HD
Copy link
Member Author

Zac-HD commented Oct 14, 2017

Our code is super-dynamic, Mypy can't always follow what's happening, and there's a huge volume of noisy output to wade through when enabling a new check on a directory full of code. I keep bumping into typeshed or mypy bugs too, and the novelty wears off pretty quickly.

But I think I'm also finding some actual and potential bugs in Hypothesis, which makes everything worth it. More on this later.

@Daenyth
Copy link

Daenyth commented Oct 14, 2017

You can also do module-specific configs in the ini file;

[mypy-pkg.sub.foo]
# pkg/sub/foo.py
settings = blah

@Zac-HD
Copy link
Member Author

Zac-HD commented Oct 15, 2017

I'm currently using that to ignore errors in hypothesis.vendor, since it's not our problem. I'm also planning to globally enable check_untyped_defs, and then turn it off for selected modules to get back to a manageable number of errors.

This has already bagged us #943, simply by checking that things are called with the right number of arguments.

@Daenyth
Copy link

Daenyth commented Oct 15, 2017

Really exciting that it's already catching bugs! I've found it very rewarding to use despite the limitations

@Zac-HD
Copy link
Member Author

Zac-HD commented Oct 15, 2017

Yep! I used Mypy 0.3 on my forestutils project, and managed to smoothly change the entire data model and make it lazy at the same time without tests. This is not a process I'd recommend, but it would have been utterly impossible without Mypy.

I was ambivalent about using Mypy in Hypothesis, because I expected the main value to be in improving feedback on things still in development. Catching a real bug has swung me to strong support, even with the syntactic noise of the Python 2 comment hints. Even with a relatively small set of bugs our tests don't catch, it's obviously worth the effort!

My goal in this pull is to have all the infrastructure set up, so we can have an open issue for newbies or conference sprints to annotate some modules, and work slowly towards full coverage.

@Zac-HD Zac-HD force-pushed the typecheck branch 4 times, most recently from 5cc0acd to 3170ec2 Compare October 18, 2017 11:35
@Zac-HD Zac-HD changed the title [experimental] Set up mypy and gradual static typing Set up Mypy and gradual static typing Oct 18, 2017
@Daenyth
Copy link

Daenyth commented Oct 26, 2017

@Zac-HD if you invoke mypy with warn_unused_ignores = True in the config (or as a cli argument), I find that very useful to find fixed bugs. Do note that it gives false positive warnings with incremental use; for CI I rm -rf .mypy_cache/ before running to avoid that problem. Locally, I find running it twice (or deleting the cache) removes any false positives

@Zac-HD
Copy link
Member Author

Zac-HD commented Oct 26, 2017

Yep, got that enabled! However it doesn't trigger on the hacky enum/type workaround I put in... Which indicates that we should prefer explicit ignores for upstream bugs.

@@ -108,7 +111,10 @@ def example(*args, **kwargs):
def accept(test):
if not hasattr(test, 'hypothesis_explicit_examples'):
test.hypothesis_explicit_examples = []
test.hypothesis_explicit_examples.append(Example(tuple(args), kwargs))
test.hypothesis_explicit_examples.append(
# Mypy seems to think 'Example' is one of it's internal classes,
Copy link
Member Author

@Zac-HD Zac-HD Nov 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: extract this to the 'init bug' commit

if False:
from typing import Any, Set, Text # noqa

# TODO: contribute stubs with sre_parse attributes to Typeshed
Copy link
Member Author

@Zac-HD Zac-HD Nov 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: fix this (import sre_constants as sre) following python/typeshed#1709 the release of Mypy 0.550, mid-November.

@DRMacIver
Copy link
Member

I'm sorry, I know you've sunk a lot of work on this, but just looking at this I really don't want to go down this route. 😞

I've got serious doubts about the mypy type system's suitability, but even setting those aside, static typing without at least local type inference is really annoying to use, and this looks especially so - the comment syntax is pretty unwieldy. Having to actually use this while developing Hypothesis would, I think, make me very unhappy.

I'm still happy to provide API level type annotations for consumers of the API, but I don't think I want our internal code annotated.

@Zac-HD
Copy link
Member Author

Zac-HD commented Nov 12, 2017

I'm sorry, I know you've sunk a lot of work on this, but just looking at this I really don't want to go down this route. 😞

I always knew this might be the case - don't worry! I've got some more experience with Mypy, explored Hypothesis internals, and fixed a number of problems upstream. My regrets do not include the sunk work 😄

I then wrote a lot more stuff (below the fold), but the gist is that I was personally surprised by how pleasant working with Mypy was - both less irritating and more useful than I'd expected. If your distaste is mostly aesthetic like mine was, maybe try writing your next PR against this branch? (it's easy enough to rebase later) If not, skip it!


I've got serious doubts about the mypy type system's suitability, but even setting those aside, static typing without at least local type inference is really annoying to use, and this looks especially so - the comment syntax is pretty unwieldy.

I'd be interested in hearing about your doubts - is there more to them than "this can only loosely express the true type constraints"? My view tends to "something is better than nothing" but I haven't really used a strongly-statically-typed language.

Mypy does have function-local type inference, it just sucks. The comment syntax also sucks.

Having to actually use this while developing Hypothesis would, I think, make me very unhappy.

I actually expected the same thing! From experience, there is some friction but not to a frustrating or even particularly irritating degree - in particular much less than my first few contributions running into isort and pyformat.

The payoff is in two main areas. First, finding bugs with a typecheck is faster than finding them while writing covering tests, and in some cases finds issues that our tests just hadn't. It's reasonable to question whether this is a net benefit, but I at least found it useful (and not too difficult, as above).

Second, it exerts a pressure towards more consistent creation and use of APIs. For particular examples, the unicode intervals code mostly returns sequences of tuples [of tuples] of elements, and nailing down the relationships between function inputs and outputs was really helpful for my understanding. Discovering and documenting the nested structure of some caches was also an exercise best done only once! This is also really useful for people who use IDEs I guess, but even without that I found reading the collection annotations could be quite useful.

I'm still happy to provide API level type annotations for consumers of the API, but I don't think I want our internal code annotated.

This will probably mean distributing stubs, according to PEP561. It looks like the ecosystem is aiming to be ready around the release of Python 3.7 in June 2018.

Annotating only the public API means that downstream users using type checking can't access anything we haven't put in the stubs. I'd rather allow them to do so - at risk of their code being broken by a patch update of course - on the pythonic basis that it's their problem anyway 😄

Note that PE561 suggests that type comments actually can't be distributed, so I'd need to port this pull to use stub files anyway.


Moving forward, I suggest that I extract the few unrelated typo fixes and clarifications into a separate commit and fold them into the doctest PR, then we close this one and I write up a summary in #200 for future reference.

@DRMacIver
Copy link
Member

This will probably mean distributing stubs, according to PEP561. It looks like the ecosystem is aiming to be ready around the release of Python 3.7 in June 2018.

Yeah, sorry, I meant more that was a thing we could do once it becomes a thing we can do (and I'm very against using typeshed for this).

@Zac-HD
Copy link
Member Author

Zac-HD commented Nov 12, 2017

Yeah, sorry, I meant more that was a thing we could do once it becomes a thing we can do (and I'm very against using typeshed for this).

Fully agreed on both counts!

@Zac-HD Zac-HD closed this Nov 21, 2017
@Zac-HD Zac-HD deleted the typecheck branch January 4, 2020 07:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants