Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Setup mypy in tox -e typing and get it to pass #892

Merged
merged 9 commits into from Jan 6, 2022

Conversation

sirosen
Copy link
Member

@sirosen sirosen commented Dec 10, 2021

This is the smallest possible change to get mypy passing on the jsonschema codebase. The goal of this configuration is to enforce type annotations anywhere that they appear.
That is, if a method is added to the codebase,

def foo(x: int) -> str:
    return str(x)

then usages of foo will by type checked. If no annotations are added, mypy will not type check functions.

For the most part, this keeps the impact low. The one exceptional case is the use of pyrsistent.pmap as an argument to attr.ib(converter=...). Unfortunately, it causes mypy to incorrectly deduce the type of the init parameter created by attrs. We need to "explain the type of init" to mypy by creating a callable with a concrete type to act as the converter. The callable in question simply wraps pmap with a cast and presents the desired type
information to mypy.


A few asides:

I see some whitespace edits crept in when I blacked files. I'll back those out when I can work on this more next week.

I personally like import typing as t and then using t.Optional, t.Dict, etc. throughout a codebase. But there are various opinions on this, so I've gone with import typing and typing.Optional, etc. as what I believe is the least controversial choice.

tox -e typing seemed consistent with tox -e style: name the env by the thing being checked, not the tool used to check it. tox -e mypy would also work.

Type checking enforcement is somewhat necessary if we are to consider publishing type information from jsonschema. I don't know if this is really desirable, but the PR shows what it would take to get type checking to pass.

@codecov
Copy link

codecov bot commented Dec 10, 2021

Codecov Report

Merging #892 (bc4f2d5) into main (fc0990a) will decrease coverage by 1.37%.
The diff coverage is 87.09%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #892      +/-   ##
==========================================
- Coverage   98.27%   96.90%   -1.38%     
==========================================
  Files          20       20              
  Lines        3188     3259      +71     
  Branches      430      446      +16     
==========================================
+ Hits         3133     3158      +25     
- Misses         44       79      +35     
- Partials       11       22      +11     
Impacted Files Coverage Δ
jsonschema/_utils.py 97.05% <0.00%> (-1.15%) ⬇️
jsonschema/cli.py 97.01% <0.00%> (ø)
jsonschema/tests/test_cli.py 98.03% <0.00%> (-1.97%) ⬇️
jsonschema/protocols.py 86.95% <83.33%> (-0.55%) ⬇️
jsonschema/_format.py 93.98% <100.00%> (+0.05%) ⬆️
jsonschema/_types.py 100.00% <100.00%> (ø)
jsonschema/exceptions.py 97.91% <100.00%> (+0.01%) ⬆️
jsonschema/tests/test_validators.py 98.09% <100.00%> (-0.84%) ⬇️
jsonschema/validators.py 97.54% <100.00%> (+0.01%) ⬆️
jsonschema/tests/test_jsonschema_test_suite.py 82.35% <0.00%> (-17.65%) ⬇️
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fc0990a...bc4f2d5. Read the comment docs.

@sirosen sirosen force-pushed the add-mypy-config branch 3 times, most recently from 1516dde to eea2f63 Compare December 16, 2021 19:30

import attr

from jsonschema import _utils

WEAK_MATCHES = frozenset(["anyOf", "oneOf"])
STRONG_MATCHES = frozenset()
WEAK_MATCHES: typing.FrozenSet[str] = frozenset(["anyOf", "oneOf"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Ignorance question, I'll have to read myself a bit more before merging) but is there a more generic type here than this? I'd like to not say this is a frozenset, in the future it may (likely will) become a pyrsistent.pset. So just something like collections.abc.Set or whatever the typing equivalent is would be a bit preferable.

Copy link
Member Author

@sirosen sirosen Dec 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the best we can do is typing.Collection[str], which is the type for collections.abc.Collection.

typing.Set doesn't work, as it is the type for builtins.set, not collections.abc.Set.
Most of collections.abc is mirrored in typing, and even collections.abc.MutableSet maps to typing.MutableSet. I guess Set was skipped because of the likelihood of confusion with the builtin set type.

In py3.9+, all of the collections.abc types are usable as generics, so we'd be able to write

WEAK_MATCHES: collections.abc.Set[str] = ...

and I would expect it to work.

The big alternative, which I could apply here and elsewhere, is to use from __future__ import annotations, now that py3.6 has been dropped. The future import enables several features and behaviors, but the one that is relevant here is that it allows the use of builtin types as generics in annotations. So we can do

from __future__ import annotations

WEAK_MATCHES: frozenset[str] = ...

If you prefer the results of the future import, I can apply it to all of the places where I had to add annotations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah let's go with the future import! After that I think I'll merge. Thanks again for the help here!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just gotten around to doing this. Wherever it simplified things, I added a from __future__ import annotations.

Let me know if there are any issues with it, but I think we're good to go on this! 😄

@@ -1662,7 +1663,7 @@ def test_False_is_not_a_schema_even_if_you_forget_to_check(self):

class TestDraft3Validator(AntiDraft6LeakMixin, ValidatorTestMixin, TestCase):
Validator = validators.Draft3Validator
valid = {}, {}
valid: typing.Tuple[dict, dict] = ({}, {})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take it mypy complains if you don't add these types here? (If not I don't see much value in adding them here for typing the test case fixtures, the real type is in fact the type of schemas, i.e. Union[dict, bool] right now, but possibly some larger union type later, but if it's simpler to add than ignore, then yeah leave them).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, unfortunately. mypy generally complains when it sees unannotated empty dict assignments.
Adding this annotation seemed the simplest way to pacify it.

We could also add # typing: ignore comments. For tests I think it would be appropriate (I am against the use of ignore comments in most cases, since it's just as verbose as an annotation and leaves the type checker with less information).
If we don't want type checking on tests at all, we can add exclude = test_* to the mypy config, and it will exclude all test_* module names. I could do that and revert any changes to test modules, if that is preferable.

@sirosen
Copy link
Member Author

sirosen commented Jan 3, 2022

I'm going to have to make further tweaks. I forgot about how the future import behaves.
x = Union[int, str] is valid in all (current) language versions but x = int | str is only valid in py3.10+ even with the future import.

Sorry for the noise; hopefully I'll have it sorted out shortly.

This is the smallest possible change to get mypy passing on the
jsonschema codebase. The goal of this configuration is to enforce type
annotations anywhere that they appear.
That is, if a method is added to the codebase,

    def foo(x: int) -> str:
        return str(x)

then usages of `foo` will by type checked. If no annotations are
added, `mypy` will not type check functions.

For the most part, this keeps the impact low. The one exceptional case
is the use of `pyrsistent.pmap` as an argument to
`attr.ib(converter=...)`. Unfortunately, it causes `mypy` to
incorrectly deduce the type of the init parameter created by attrs. We
need to "explain the type of init" to mypy by creating a callable with
a concrete type to act as the converter. The callable in question
simply wraps `pmap` with a cast and presents the desired type
information to mypy.
mypy can handle `if sys.version_info` checking better than
`try ... except ImportError`. This is somewhat disappointing, but the
rewrite of these import lines isn't that bad and aligns with the
recommendations of the mypy docs.
In pypy3, the following line is invalid:

    x: Tuple[dict, dict] = {}, {}

but this variant *is* valid:

    x: Tuple[dict, dict] = ({}, {})

Apply this correction where it occurs in test_validators.py
Use of the `__future__` import of annotations allows several niceties,
in particular:
- parametrization of builtin types as generics
- `|` syntax for unions (including `| None` for optionals)

Update to use the future import wherever it improves or simplifies
annotations.

Avoid using new typing features outside of annotations (e.g. in
assignment), which fails on older pythons. This is an unfortunate
wart in the way that the future import works.
Nitpick was failing on

    TYPE_CHECKER: ClassVar[TypeChecker]

It's not clear what to do about this. `TypeChecker` was imported
correctly from `jsonschema._types` and is noted in the doc as
`jsonschema.TypeChecker`. We can't import the `jsonschema` name at
runtime because that would be circular.

To resolve, use `typing.TYPE_CHECKING` to conditionally import
`jsonschema` at type-checking time. This avoids the circular
import but allows us to write

    TYPE_CHECKER: ClassVar[jsonschema.TypeChecker]

As a result, Sphinx correctly builds a cross-reference from the
annotation, the annotation is still accurate, and runtime behavior
is left untouched.
Type annotations can use variable names in order to support easily
named constructs (e.g. `FooType = Union[int, str]; x: FooType`).
However, such variable names are then seen by sphinx autodoc, which
does not evaluate them. As a result, for such a name to avoid tripping
nitpick warnings, these names need to be resolvable.

The simplest resolution is to remove the use of any internal variables
for this purpose (at least where they would be seen by sphinx), and
use the longhand description of types in such cases.
`_format.py` had a few internal variables to make type annotations
shorter to write. However, these can cause issues with sphinx
nitpick. Even though the variables removed in this commit aren't
causing a build failure, removing them is a good precaution.
These blocks only exist to create code which evaluates under mypy
but does not run at runtime. As a result, they are impossible to cover
with `coverage` on a testsuite, and make sense to exclude from
coverage reporting.
@sirosen
Copy link
Member Author

sirosen commented Jan 5, 2022

There were a few issues with the doc build which I think I have sorted out. I did some amends and rebases to try to keep a useful commit history.

631fba1 deals with issues arising from sphinx trying to build a cross-reference for ClassVar[TypeChecker]. Because it can't resolve TypeChecker to jsonschema.TypeChecker, it needs some special hand-holding. The commit message explains in more detail. If there's a better way to give sphinx context, I'm happy to adjust.

17384e7 handles a related but different case (which was not flagged in my local builds because I was running tox under py3.8, and for whatever reason it only shows up on 3.9+). If you write FooType = Union[int, str]; x: FooType, that's valid and means what it looks like. However, when sphinx sees FooType, it will try to cross-reference that, rather than using the value. The easiest solution is to just avoid these kind of type assignments and use the desired type directly, even if it's a little more verbose.

cdc2807 does the same kind of cleanup as 17384e7, but in a location where sphinx was not seeing the variable names. It's purely precautionary.

811bab2 omits type-checking-only code from coverage reporting, to try to get the coverage check to pass. It doesn't seem to have been sufficient, but I think it still makes sense to include.

As I write this, the build is still going and the coverage check is still red, but I expect all of it to pass.

@Julian
Copy link
Member

Julian commented Jan 6, 2022

Awesome! Don't worry too much about the coverage check, I'll have a look. Appreciated.

@Julian Julian merged commit eed6d8b into python-jsonschema:main Jan 6, 2022
@Julian
Copy link
Member

Julian commented Jan 6, 2022

Odd (and worrying) that things pass locally in Sphinx even the way you originally had it, and on the same versions of everything via tox. But I don't have time to diagnose why at the minute... Going to merge. Thanks again for the PR!

@sirosen sirosen deleted the add-mypy-config branch January 6, 2022 19:50
@sirosen
Copy link
Member Author

sirosen commented Jan 6, 2022

Odd (and worrying) that things pass locally in Sphinx even the way you originally had it

I agree.

I'm generally a big fan of the new typing stuff, but this branch seemed to run into almost all of my least-favorite things -- typing-time code in if TYPE_CHECKING, idiosyncratic rules about how to do conditional imports, "future annotations" which the runtime can't handle, ...

Please feel free to call me in to help out if any of this creates problems down the line. And thanks for being receptive to these changes!

@Julian
Copy link
Member

Julian commented Jan 6, 2022

I very well may take you up on that offer, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants