Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Harmonize annotations with python/typeshed#8608 #1027

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
14 changes: 14 additions & 0 deletions jsonschema/_typing.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
"""
Type aliases and utilities to help manage `typing` usage and type annotations.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Definitely support adding this, thanks!)

"""

from collections.abc import Mapping, Sequence
import typing

Schema = Mapping[str, typing.Any]
Copy link
Member

@Julian Julian Dec 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Schema is tricky to nail down for multiple reasons. Its precise type depends on the specific dialect of JSON Schema involved, and (speaking off the top of my head without double checking, which counts for lots of the comments I'm about to make --)

  • applicable_validators means that technically its type is literally Any (or some dependent type), because someone can decide they're inventing a JSON Schema dialect where the number 17 is a schema and means something, and applicable_validators will give the library what it needs to deal with that as a schema. I'm honestly not sure what to do about that.
  • Ignoring the above, the "internal" type to all JSON Schema "official" drafts is -- for later drafts, bool | Mapping[str, typing.Any], and for earlier ones Mapping[str, typing.Any]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm definitely going to have to circle back to try to better grasp what's sensible here.

In line with my other comment about maybe accepting some technical inaccuracy in the annotations as a pragmatic approach, I'm instinctively inclined towards bool | Mapping[...].


JsonObject = Mapping[str, typing.Any]

JsonValue = typing.Union[
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incomplete -- someone can come along and invent validator callbacks which only work if someone deserialized into Decimal, and as long as they ensure they've done so in their loading of JSON, their code works. In other words, I'm somewhat afraid this is really just Any in "real life".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might have a tension here between annotations which are useful vs ones which are completely accurate.

If we annotate things as Any, we ensure that nothing which is correct at runtime would be flagged by a type checker. But I'm concerned that making this accurate with Any will also mean that we can't catch common errors.
My example would be some flask-like framework, and checking validate(request) vs validate(request.json).

Is it okay to suggest that anyone using type checking needs to use typing.cast or type: ignore in these cases?

The type really is Any for a lot of methods like json.load, but I hope my desire to provide a more tightly scoped type is understandable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to think about what I think honestly.

To me, all I am "convinced" is useful of at the minute (and I'm quite convinced it's very useful indeed) is that typing forces documentation to have correctly documented types. That's the reason I'm quite picky about making sure that things which work "accidentally" because we abuse some type internally do not in fact get represented as the type -- because later down the line I will not support those internal abuses if we need to change them for whatever reason. They're not part of the public API, they don't have stability guarantees.

Here, the opposite is true -- not documenting the type here actually means "regressing" something that is indeed public API, and which I very much do intend folks to rely on (by which I don't mean I have no reservations that it will bite me in the ass, but I do mean it's something I think people should consider public API at this point, type annotation or not, and I don't intend to break it).

And I know we don't actually regress, since I'm pretty sure there are tests covering the Any behavior at least in part -- but if the goal (at least the goal in my head) is "be able to say every object in jsonschema has its type documented explicitly", it complicates that goal if the documentation again needs tweaking to broaden types into their real public API.

(So... "I don't know yet" :D)

JsonObject, Sequence[typing.Any], str, int, float, bool, None
]
24 changes: 16 additions & 8 deletions jsonschema/validators.py
Original file line number Diff line number Diff line change
Expand Up @@ -23,13 +23,19 @@
_format,
_legacy_validators,
_types,
_typing,
_utils,
_validators,
exceptions,
)

_UNSET = _utils.Unset()

_ValidatorCallback = typing.Callable[
[typing.Any, typing.Any, _typing.JsonValue, _typing.JsonObject],
typing.Iterator[exceptions.ValidationError],
]

_VALIDATORS: dict[str, typing.Any] = {}
_META_SCHEMAS = _utils.URIDict()
_VOCABULARIES: list[tuple[str, typing.Any]] = []
Expand Down Expand Up @@ -114,13 +120,15 @@ def _store_schema_list():


def create(
meta_schema,
validators=(),
meta_schema: _typing.Schema,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to double check (which I will be lazy and not do here) but Unlike _typing.Schema, meta-schemas (in the library, but possibly also according to the spec, I forget) must be objects I think, not bool | thing, so you'll need two types I suspect?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, that makes sense. I need to think harder about what to do with schemas more broadly, but we now have a clean place where we can define _typing.MetaSchema. I'll make that tweak.

validators: Mapping[str, _ValidatorCallback] | tuple[()] = (),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The public API is just Mapping[str, _ValidatorCallback], not tuple. I know mypy or whatever is unhappy with that, so if that means just adding type: ignore to get it to shut up about the default arg let's do that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I'm cool with that way of handling these!

I had some other ideas which I like less, but another option just occured to me.

With _typing we could do this:

# in the definition
validators: Mapping[str, _ValidatorCallback] = _typing.EmptySequence

# in _typing
EmptySequence = typing.cast(typing.Any, ())

Because Any is a subtype of everything, the rules get wonky, and mypy and other checkers should be okay with this.
That way, _typing.EmptySequence can be used in any location where the empty tuple is used today.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems also fine with me (although I think the fact that () works is more a quirk of dict than a generic empty sequence type, but I'll leave pedantry out until/unless it comes back to rear its head I think).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it help that I also considered

EmptyMap = pyrsistent.freeze({})

?
😁

I just know that this pattern occurs several times in the code, with () as an empty default. So it seems like it might be nice to sweep any messiness associated with it into _typing.

version=None,
type_checker=_types.draft202012_type_checker,
format_checker=_format.draft202012_format_checker,
id_of=_id_of,
applicable_validators=methodcaller("items"),
type_checker: _types.TypeChecker = _types.draft202012_type_checker,
format_checker: _format.FormatChecker = _format.draft202012_format_checker,
id_of: typing.Callable[[_typing.Schema], str] = _id_of,
applicable_validators: typing.Callable[
[_typing.Schema], typing.Iterable[tuple[str, _ValidatorCallback]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This has to do with the above, the type for applicable_validators is really Any -> tuple(...))

] = methodcaller("items"),
):
"""
Create a new validator class.
Expand Down Expand Up @@ -186,7 +194,7 @@ def create(
@attr.s
class Validator:

VALIDATORS = dict(validators)
VALIDATORS: dict[str, _ValidatorCallback] = dict(validators)
META_SCHEMA = dict(meta_schema)
TYPE_CHECKER = type_checker
FORMAT_CHECKER = format_checker_arg
Expand Down Expand Up @@ -339,7 +347,7 @@ def is_valid(self, instance, _schema=None):
if version is not None:
safe = version.title().replace(" ", "").replace("-", "")
Validator.__name__ = Validator.__qualname__ = f"{safe}Validator"
Validator = validates(version)(Validator)
Validator = validates(version)(Validator) # type: ignore[misc]

return Validator

Expand Down