Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow collections.abc.Callable to be used as type in python 3.9 #2519

Merged
merged 10 commits into from Sep 3, 2021

Conversation

daviskirk
Copy link
Contributor

@daviskirk daviskirk commented Mar 13, 2021

Change Summary

Allows using python 3.9 generic alias version of Callable (collections.abc.Callable)

Fixes #2465

Related issue number

Address #2465

Checklist

  • Unit tests for the changes exist
  • Tests pass on CI and coverage remains at 100%
  • Documentation reflects the changes where applicable
  • changes/<pull request or issue id>-<github username>.md file added describing change
    (see changes/README.md for details)


@pytest.mark.parametrize('annotation', [Callable, Callable[[int], int]])
collection_callable_types += [CollectionsCallable, CollectionsCallable[[int], int]]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do this with all other generic aliases (Iterable, Sequence and so on).
However, it probably makes some of the tests less readable and I'm unsure on how much sense that would make: As far as I can tell the "typing" versions are deprecated starting 3.9, so maybe a better approach would be to just to alias them to the "new" generic alias versions for python versions >= 3.9 (except for the cases where they have obviously different behavior like the Callable here).
The old typing versions would still be tested in the old python versions so that should cover almost all cases.

@codecov
Copy link

codecov bot commented Mar 13, 2021

Codecov Report

Merging #2519 (85018eb) into master (7b7e705) will decrease coverage by 0.03%.
The diff coverage is 75.00%.

❗ Current head 85018eb differs from pull request most recent head 069e5b5. Consider uploading reports for the commit 069e5b5 to get more accurate results

@@             Coverage Diff             @@
##            master    #2519      +/-   ##
===========================================
- Coverage   100.00%   99.96%   -0.04%     
===========================================
  Files           25       25              
  Lines         5109     5114       +5     
  Branches      1050     1050              
===========================================
+ Hits          5109     5112       +3     
- Misses           0        2       +2     
Impacted Files Coverage Δ
pydantic/typing.py 98.78% <60.00%> (-1.22%) ⬇️
pydantic/fields.py 100.00% <100.00%> (ø)
pydantic/schema.py 100.00% <100.00%> (ø)
pydantic/validators.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 90080ba...069e5b5. Read the comment docs.

yield none_validator
return
try:
if type_ in NONE_TYPES:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should not write a custom set

class IdentitySet(set):
    def __contains__(self, x):
        return any(x is v for v in self)

and have

NONE_TYPES = IdentitySet((None, NoneType, Literal[None]))

to support all the

whatever in NONE_TYPES

instead. @davidolrik WDYT?

Copy link
Contributor Author

@daviskirk daviskirk Mar 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, checking this correctly everywhere is a good idea. I tried a few variations and the custom class slowed down the benchmarks even in cython builds for me (I think this part is on the benchmark path).

I ended up just having a simple function that checks by the type, which does not seem to impact the cython benchmarks (At least I switched back and forth between the master and the new version, make build, make benchmark-pydantic and couldn't see any noticable impact.

In pure python, there might be a slight advantage to still using the try/except since the except is hardly ever called (only for non-hashable types which isn't the case very often, might even only be callables).
But testing using cython that advantage does not exist (cython probably can't optimize the try/except).

EDIT: BUT, we can't check by identity, because python 3.6 and 3.7 don't handle Literal[None] as an identity, so Literal[None] is Literal[None] turns out to be False there. So I ended up just using the try/except version but wrapping it in a function call.

Here are a few timings of JUST the function (not pydantic benchmarks).

Using tuple and identity comparison (does not work in python 3.6/3.7):

In [10]: %%cython
    ...: from typing import Any, Literal
    ...: NoneType = None.__class__
    ...: 
    ...: 
    ...: NONE_TYPES = (None, NoneType, Literal[None])
    ...: 
    ...: 
    ...: def is_none_type(type_: Any) -> bool:
    ...:     for none_type in NONE_TYPES:
    ...:         if type_ is none_type:
    ...:             return True
    ...:     return False
    ...: 

In [11]: %timeit is_none_type(None)
30.2 ns ± 0.574 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [12]: %timeit is_none_type(6)
33.9 ns ± 0.262 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [13]: %timeit is_none_type({})
50.1 ns ± 11.8 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

Using a set with try except (this is the current version):

In [6]: %%cython
   ...: from typing import Any, Literal
   ...: NoneType = None.__class__
   ...: 
   ...: 
   ...: NONE_TYPES: Set[Any] = {None, NoneType, Literal[None]}
   ...: 
   ...: 
   ...: def is_none_type(type_: Any) -> bool:
   ...:     try:
   ...:         return type_ in NONE_TYPES
   ...:     except TypeError:
   ...:         return False
   ...: 
   ...: 

In [7]: %timeit is_none_type(None)
50 ns ± 12.9 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [8]: %timeit is_none_type(6)
40.9 ns ± 0.546 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [9]: %timeit is_none_type({})
383 ns ± 121 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

For comparison: direct set "contains" (as it was before, but can't handle types like callable and so on), here we probably just save the function call:

In [14]: NONE_TYPES: Set[Any] = {None, NoneType, Literal[None]}

In [15]: %timeit None in NONE_TYPES
23.7 ns ± 0.201 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [16]: %timeit 6 in NONE_TYPES
25 ns ± 0.0713 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

@daviskirk
Copy link
Contributor Author

FYI, the fast api tests fail because sqlalchemy 1.4 was released and I'm assumig fastapi didn't pin their test dependencies

@PrettyWood
Copy link
Member

PrettyWood commented Mar 16, 2021

Yep the little is_none_type is perfect thanks!
As for the CI I opened encode/databases#299 but still need to work on it

Copy link
Member

@PrettyWood PrettyWood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Member

@samuelcolvin samuelcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

otherwise LGTM

pydantic/typing.py Outdated Show resolved Hide resolved
try:
return type_ in NONE_TYPES
except TypeError:
return False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From your benchmarks (and mine), the following seems to be faster:

NONE_TYPES: Tuple[Any, Any, Any] = (None, NoneType, Literal[None])


def is_none_type(type_: Any) -> bool:
    for none_type in NONE_TYPES:
        if type_ is none_type:
            return True
    return False

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my benchmarks:

In [1]: %load_ext cython

In [2]: %%cython
   ...: from typing import Any, Literal, Tuple, Set
   ...: NoneType = None.__class__
   ...: 
   ...: 
   ...: NONE_TYPES_TUPLE: Tuple[Any, Any, Any] = (None, NoneType, Literal[None])
   ...: 
   ...: 
   ...: def is_none_type1(type_: Any) -> bool:
   ...:     for none_type in NONE_TYPES_TUPLE:
   ...:         if type_ is none_type:
   ...:             return True
   ...:     return False
   ...: 
   ...: 
   ...: NONE_TYPES_SET: Set[Any] = {None, NoneType, Literal[None]}
   ...: 
   ...: 
   ...: def is_none_type2(type_: Any) -> bool:
   ...:     try:
   ...:         return type_ in NONE_TYPES_SET
   ...:     except TypeError:
   ...:         return False
   ...: 
   ...: 
   ...: def is_none_type3(type_: Any) -> bool:
   ...:     return type_ is None or type_ is NoneType or type_ is Literal[None]
   ...: 

In [3]: print('None:')
   ...: %timeit is_none_type1(None)
   ...: %timeit is_none_type2(None)
   ...: %timeit is_none_type3(None)
   ...: 
   ...: print('type(None):')
   ...: %timeit is_none_type1(type(None))
   ...: %timeit is_none_type2(type(None))
   ...: %timeit is_none_type3(type(None))
   ...: 
   ...: print('Literal[None]:')
   ...: %timeit is_none_type1(Literal[None])
   ...: %timeit is_none_type2(Literal[None])
   ...: %timeit is_none_type3(Literal[None])
   ...: 
   ...: print('6:')
   ...: %timeit is_none_type1(6)
   ...: %timeit is_none_type2(6)
   ...: %timeit is_none_type3(6)
   ...: 
   ...: print('{}:')
   ...: %timeit is_none_type1({})
   ...: %timeit is_none_type2({})
   ...: %timeit is_none_type3({})
   ...: 
None:
49.5 ns ± 0.727 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
65.4 ns ± 1.14 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
43.3 ns ± 0.775 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
type(None):
99.5 ns ± 1.52 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
118 ns ± 1.86 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
92.3 ns ± 1.83 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
Literal[None]:
337 ns ± 5.8 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.08 µs ± 16.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
582 ns ± 11.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
6:
52.6 ns ± 0.762 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
64.5 ns ± 1.48 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
313 ns ± 2.64 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
{}:
65 ns ± 1.05 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
458 ns ± 3.09 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
323 ns ± 1.59 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

I have absolutely no idea why is_none_type3 is so much slower than is_none_type1, but it does seem to be.

Copy link
Contributor Author

@daviskirk daviskirk May 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@samuelcolvin sorry, I'm not sure if we're on the same page here.
I would also like to use the tuple version (is_none_type1 in your comment),

But this

EDIT: BUT, we can't check by identity, because python 3.6 and 3.7 don't handle Literal[None] as an identity, so Literal[None] is Literal[None] turns out to be False there. So I ended up just using the try/except version but wrapping it in a function call.

was the reason I didn't want to do it (not sure if you saw that part of my comment).

What I ended up doing now is to add another version switch (python < 3.8). Makes the code a little less readable, but I guess it's not the end of the world and is easy to remove when 3.7 goes out of scope. Should get us the best of both worlds (fastest in 3.8 and 3.9 but still works in 3.6/3.7).
Hope that's an ok solution.

@samuelcolvin
Copy link
Member

also I noticed coverage is not complete, please update.

daviskirk added a commit to daviskirk/pydantic that referenced this pull request May 9, 2021
* Apply review comments on pydantic#2519
* Add different implementations depending on python version
* Add tests for is_none_type
@daviskirk
Copy link
Contributor Author

please review

@PrettyWood PrettyWood dismissed samuelcolvin’s stale review September 3, 2021 22:15

As @daviskirk you're both on the same page. Faster implementation is used when possible. It will be easy for v2 to drop the slow implementation with drop of 3.6, 3.7 maybe other versions

@PrettyWood PrettyWood merged commit 3d28759 into pydantic:master Sep 3, 2021
jpribyl pushed a commit to liquet-ai/pydantic that referenced this pull request Oct 7, 2021
…ntic#2519)

* Allow collections.abc.Callable to be used as type in python 3.9

* Add is_none_type as function to check none types by identity

* Modify `typing.is_none_type` to work in python 3.6 and 3.7

* Add tests for none types in typing.py

* Apply review comments on pydantic#2519
* Add different implementations depending on python version
* Add tests for is_none_type

* Add change entry

* Fix field info repr

* Remove unneeded try/except for python < 3.8

* Add comment explaining alternative is_none_type implementation

* fix: typo

Co-authored-by: PrettyWood <em.jolibois@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

python 3.9 generic alias Callable from collections module does not behave correctly in generics
3 participants