-
Hi, I have written a custom protocol class intended to check one-dimensional, indexable, array-like objects (read: list, tuple, np.array, pd.Series...). It looks like this:

from typing import TypeVar, Protocol, Iterable, runtime_checkable

T = TypeVar("T")

@runtime_checkable
class Sequence(Protocol[T]):
    def __iter__(self) -> Iterable[T]: ...
    def __getitem__(self, item) -> T: ...
    def __len__(self) -> int: ...

When using beartype, I can just check the class and it correctly checks whether the required dunder methods are present, but is it possible to also check the contents of the sequence, i.e. the type T?

is_bearable([1, 2], Sequence)  # Passes, good
is_bearable([1, 2], Sequence[int])  # Passes, good
is_bearable([1, 2], Sequence[str])  # Passes, but I want it to fail!

Also: this protocol would also pass any string, because str has the required dunder methods. Any way to avoid that?

is_bearable("test", Sequence)  # Passes, but I want it to fail!

Thanks in advance :)
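For anyone reproducing this without beartype installed, here is a minimal sketch of the same behaviour using only the standard library's isinstance(). It assumes a renamed protocol, SequenceLike (a hypothetical name chosen here so it does not shadow collections.abc.Sequence), and two illustrative helper functions that are not part of any library. It shows that a runtime-checkable protocol check only looks for the presence of the three methods, which is why a string passes, and that plain isinstance() refuses subscripted protocols altogether.

```python
from typing import Iterable, Protocol, TypeVar, runtime_checkable

T = TypeVar("T")


@runtime_checkable
class SequenceLike(Protocol[T]):
    """Hypothetical stand-in for the protocol in the question."""

    def __iter__(self) -> Iterable[T]: ...
    def __getitem__(self, item) -> T: ...
    def __len__(self) -> int: ...


def structural_matches(obj: object) -> bool:
    # Only checks that the three dunder methods exist; item types are ignored.
    return isinstance(obj, SequenceLike)


def plain_isinstance_rejects_subscription() -> bool:
    # isinstance() raises TypeError for subscripted generics such as
    # SequenceLike[int], so the item type cannot be checked this way.
    try:
        isinstance([1, 2], SequenceLike[int])
    except TypeError:
        return True
    return False


print(structural_matches([1, 2]))               # True
print(structural_matches("test"))               # True: str has all three methods
print(structural_matches(42))                   # False
print(plain_isinstance_rejects_subscription())  # True
```

Any check of the item type therefore has to come from a separate validator layered on top of the structural check.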
-
So, I love issues like this. To answer your poignant question, @tvdboom...

The answers are... "sort of" and "it's complicated." First, let's talk @beartype. Technically, @beartype could try to implicitly detect which of the 25 existing standard abstract base classes (ABCs) defined by the collections.abc module your protocol satisfies. Understandably, @beartype doesn't do that. Instead, @beartype embraces Python's "Explicit is better than implicit" maxim by requiring you to explicitly declare which standard ABCs your protocol satisfies. In theory, that would be trivial. Just subclass your protocol from the collections.abc.Sequence ABC:

from collections.abc import Sequence as SequenceABC
from typing import TypeVar, Protocol, Iterable, runtime_checkable

T = TypeVar("T")

@runtime_checkable
class Sequence(SequenceABC, Protocol[T]):
    def __iter__(self) -> Iterable[T]: ...
    def __getitem__(self, item) -> T: ...
    def __len__(self) -> int: ...

Of course, that doesn't work. Why? Because the current implementation of typing.Protocol refuses to let a protocol subclass anything that isn't itself a protocol:

Traceback (most recent call last):
  File "/home/leycec/tmp/mopy.py", line 11, in <module>
    class Sequence(SequenceABC, Protocol[T]):
  File "<frozen abc>", line 106, in __new__
  File "/usr/lib/python3.11/typing.py", line 2094, in __init_subclass__
    raise TypeError('Protocols can only inherit from other'
TypeError: Protocols can only inherit from other protocols, got <class 'collections.abc.Sequence'>

"...uhh. wat?", you may now be thinking. Yes, it is all true. The private typing._PROTO_ALLOWLIST dictionary explicitly permits protocols to subclass only a handful of standard ABCs:

# In the standard "typing.py" module bundled with Python:
_PROTO_ALLOWLIST = {
    'collections.abc': [
        'Callable', 'Awaitable', 'Iterable', 'Iterator', 'AsyncIterable',
        'Hashable', 'Sized', 'Container', 'Collection', 'Reversible',
    ],
    'contextlib': ['AbstractContextManager', 'AbstractAsyncContextManager'],
}

Noticeably, Sequence is not on that list. In fact, there's even a well-known Python issue about this very topic.

Everything is Bad and Here's the Proof

One obvious "solution" would be to just monkey-patch the typing._PROTO_ALLOWLIST dictionary to also permit Sequence:

from beartype.door import is_bearable
from collections.abc import Sequence as SequenceABC
from typing import TypeVar, Protocol, Iterable, runtime_checkable
import typing

typing._PROTO_ALLOWLIST = {
    'collections.abc': [
        'Callable', 'Awaitable', 'Iterable', 'Iterator', 'AsyncIterable',
        'Hashable', 'Sized', 'Container', 'Collection', 'Reversible',
        'Sequence',
    ],
    'contextlib': ['AbstractContextManager', 'AbstractAsyncContextManager'],
}

T = TypeVar("T")

@runtime_checkable
class Sequence(SequenceABC[T], Protocol[T]):
    def __iter__(self) -> Iterable[T]: ...
    def __getitem__(self, item) -> T: ...
    def __len__(self) -> int: ...

print(is_bearable([1, 2], Sequence[int]))
print(is_bearable([1, 2], Sequence[str]))

...which prints:

True  # <-- good. this is good.
True  # <-- GAH!! BURN MY EYES!!

Not Everything is Bad and Here's the Proof

Ugh. Let's try a different tack. What if we replace the generic SequenceABC[T] superclass with concrete SequenceABC[int] and SequenceABC[str] superclasses?

from beartype.door import is_bearable
from collections.abc import Sequence as SequenceABC
from typing import TypeVar, Protocol, Iterable, runtime_checkable
import typing

typing._PROTO_ALLOWLIST = {
    'collections.abc': [
        'Callable', 'Awaitable', 'Iterable', 'Iterator', 'AsyncIterable',
        'Hashable', 'Sized', 'Container', 'Collection', 'Reversible',
        'Sequence',  # <-- We didn't make crazy. We just fix crazy.
    ],
    'contextlib': ['AbstractContextManager', 'AbstractAsyncContextManager'],
}

T = TypeVar("T")

@runtime_checkable
class SequenceInt(SequenceABC[int], Protocol[T]):
    def __iter__(self) -> Iterable[T]: ...
    def __getitem__(self, item) -> T: ...
    def __len__(self) -> int: ...

@runtime_checkable
class SequenceStr(SequenceABC[str], Protocol[T]):
    def __iter__(self) -> Iterable[T]: ...
    def __getitem__(self, item) -> T: ...
    def __len__(self) -> int: ...

print(is_bearable([1, 2], SequenceInt))
print(is_bearable([1, 2], SequenceStr))

...which prints:

True  # <-- good. this is good.
False  # <-- YES!! YES!!!! BY ODIN, WE DID SOMETHING THAT WORKS!!!!!!

Technically, that appears to work. Pragmatically, we had to monkey-patch private internals of the typing module to get there.
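The allowlist behaviour described above can be seen without monkey-patching anything. The following is a minimal sketch using only the standard library: SizedProtocol and sequence_protocol_error() are hypothetical names introduced for illustration. It shows that a protocol may subclass an allowlisted ABC such as Sized, while subclassing collections.abc.Sequence raises the TypeError quoted in the traceback.

```python
from collections.abc import Sequence as SequenceABC, Sized
from typing import Protocol, runtime_checkable


# "Sized" is on the allowlist, so a protocol may inherit from it directly.
@runtime_checkable
class SizedProtocol(Sized, Protocol):
    def __len__(self) -> int: ...


def sequence_protocol_error() -> str:
    """Return the error raised when a protocol subclasses Sequence."""
    try:
        # "Sequence" is NOT on the allowlist, so class creation fails.
        class BadProtocol(SequenceABC, Protocol):
            def __len__(self) -> int: ...
    except TypeError as exc:
        return str(exc)
    return ""


print(isinstance([1, 2], SizedProtocol))  # True
print(isinstance(3, SizedProtocol))       # False
print(sequence_protocol_error())
```

The exact wording of the error message may differ between Python versions, so the sketch prints it rather than asserting on it.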
-
Yup. Absolutely. Unfortunately, this one is on NumPy and Pandas – both of which have long-standing open issues about this exact topic: The issue with Pandas is mostly backward compatibility. The The issue with NumPy is a little more subtle. Even if NumPy devs had wanted to originally subclass the
And so on and so on. That said...

I Totally Get What You're Trying to Do Here

I get it, @tvdboom. It's actually a really smart idea – and one that NumPy devs themselves have floated before:
Basically, your

What's really needed is for somebody to just do this already. This is easier said than done. None of the people who cared about this did this, which is concerning. Still, the core idea has merit. Somebody who is not me should author a new PEP standardizing a new SequenceBase ABC in collections.abc:

# In "collections.abc":
from typing import TypeVar

T = TypeVar("T")

class SequenceBase(Collection[T]):
    @abstractmethod
    def __iter__(self) -> Iterable[T]: ...

    @abstractmethod
    def __getitem__(self, item) -> T: ...

    @abstractmethod
    def __len__(self) -> int: ...

Both

I am stroking my chin thoughtfully while peering into the distance. 🤔
...yeah. That pretty much sucks, doesn't it? This is a surprisingly non-trivial topic. The issue, really, is that all of these third-party numeric frameworks like NumPy and Pandas reinvented the wheel rather than playing nicely with one another and with the Python standard library. From my experience, SymPy is the only one that gets this right. SymPy containers and scalars conform to the standard Python ABCs and just behave as expected out-of-the-box. Serenity now, NumPy and Pandas!
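To make the "minimal sequence ABC" idea concrete, here is a rough runtime-only sketch. Everything in it is an assumption for illustration: MinimalSequence is a hypothetical name (not the proposed standard class), and FakeArray is a stand-in for a third-party container that implements the three methods without subclassing collections.abc.Sequence. The __subclasshook__ follows the same pattern the standard ABCs like Sized use.

```python
from abc import ABC, abstractmethod
from collections.abc import Sequence


class MinimalSequence(ABC):
    """Hypothetical ABC requiring only __iter__, __getitem__ and __len__."""

    @abstractmethod
    def __iter__(self): ...

    @abstractmethod
    def __getitem__(self, item): ...

    @abstractmethod
    def __len__(self): ...

    @classmethod
    def __subclasshook__(cls, subclass):
        # Accept any class defining all three methods somewhere in its MRO.
        if cls is MinimalSequence:
            required = ("__iter__", "__getitem__", "__len__")
            if all(
                any(name in base.__dict__ for base in subclass.__mro__)
                for name in required
            ):
                return True
        return NotImplemented


class FakeArray:
    """Stand-in for a third-party container unrelated to collections.abc."""

    def __init__(self, data):
        self._data = list(data)

    def __iter__(self):
        return iter(self._data)

    def __getitem__(self, item):
        return self._data[item]

    def __len__(self):
        return len(self._data)


arr = FakeArray([1, 2, 3])
print(isinstance(arr, Sequence))         # False: never registered as a Sequence
print(isinstance(arr, MinimalSequence))  # True: structurally sufficient
print(isinstance(3, MinimalSequence))    # False
```

Note that a plain str also satisfies this sketch, so it does nothing for the string problem raised in the original question.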
-
...heh. Yes, that is appalling. You're not alone in that dark realization either. The idea here is that you'd define a __beartype_violation_message__() class method on your protocol:

class SequenceProtocol(Protocol[T]):
    def __iter__(self) -> Iterable[T]: ...
    def __getitem__(self, item) -> T: ...
    def __len__(self) -> int: ...

    @classmethod
    def __beartype_violation_message__(cls, hint: object, obj: object) -> str:
        if not isinstance(obj, cls):
            return 'not a sequence'

        # Child type hint subscripting this protocol (e.g., "int" for the
        # parent type hint "SequenceProtocol[int]").
        item_hint = hint.__args__[0]

        for item_index, item in enumerate(obj):
            if not is_bearable(item, item_hint):
                return f'item {item_index} {repr(item)} not {repr(item_hint)}'

Something like that, maybe? Naturally, nothing like that currently exists. This is why cats cry. 😿
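Since the hook above does not exist, the same message-building logic can at least be sketched as a free-standing function. This is an illustration under stated assumptions: describe_violation() is a hypothetical helper, and it uses plain isinstance() against a concrete class instead of is_bearable(), so it only handles simple item types such as int or str.

```python
from typing import Optional

_REQUIRED_METHODS = ("__iter__", "__getitem__", "__len__")


def describe_violation(obj: object, item_type: type) -> Optional[str]:
    """Return a message for the first problem found, or None if obj is fine."""
    # Structural check: the object's class must define all three methods.
    if not all(hasattr(type(obj), name) for name in _REQUIRED_METHODS):
        return "not a sequence"

    # Item check: report the first item that is not an instance of item_type.
    for index, item in enumerate(obj):
        if not isinstance(item, item_type):
            return f"item {index} {item!r} not {item_type!r}"

    return None


print(describe_violation([1, 2], int))    # None
print(describe_violation([1, "a"], int))  # item 1 'a' not <class 'int'>
print(describe_violation(42, int))        # not a sequence
```

Returning None for the success case keeps the helper usable both as a predicate and as a message source.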
-
Oh – and I belatedly realized that you can actually get static type-checking support from mypy and other static type checkers by branching on typing.TYPE_CHECKING:

from typing import TYPE_CHECKING

class SequenceProtocol(Protocol[T]):
    def __iter__(self) -> Iterable[T]: ...
    def __getitem__(self, item) -> T: ...
    def __len__(self) -> int: ...

if TYPE_CHECKING:
    SequenceProtocolOf = SequenceProtocol
else:
    class SequenceProtocolOf(object):
        '''
        Type hint factory class dynamically creating and returning new
        ``Annotated[SequenceProtocol[...], ...]`` type hints, subscripted by the
        passed type.

        Parameters
        ----------
        X : object
            Arbitrary child type hint with which to subscript the
            :class:`SequenceProtocol` protocol.

        Returns
        ----------
        Annotated
            ``Annotated[SequenceProtocol[X], ...]`` type hint validating that *all*
            items of this sequence satisfy this child type hint.
        '''

        @classmethod
        def __class_getitem__(cls, X: object) -> Annotated:
            return Annotated[SequenceProtocol[X], Is[
                lambda lst: all(is_bearable(i, X) for i in lst)]]

It's rough stuff, but that's just how we roll in the QA trenches.
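The factory trick above, a class whose __class_getitem__() returns an Annotated hint carrying a validator, does not depend on beartype as such. Here is a stripped-down sketch of the mechanism using only the standard library. The names Check, ListOf and conforms() are hypothetical and merely play the roles of beartype's Is[...] validator, the factory class and is_bearable(); this is not how beartype implements any of them.

```python
from dataclasses import dataclass
from typing import Annotated, Callable, get_args, get_origin


@dataclass(frozen=True)
class Check:
    """Hypothetical validator stored as Annotated metadata."""

    predicate: Callable[[object], bool]


class ListOf:
    """Hypothetical factory: ListOf[int] builds an Annotated hint."""

    @classmethod
    def __class_getitem__(cls, item_type: type):
        return Annotated[
            list,
            Check(lambda seq: all(isinstance(i, item_type) for i in seq)),
        ]


def conforms(obj: object, hint: object) -> bool:
    """Tiny checker: test the base type, then run every attached Check."""
    if get_origin(hint) is Annotated:
        base, *metadata = get_args(hint)
        return isinstance(obj, base) and all(
            meta.predicate(obj) for meta in metadata if isinstance(meta, Check)
        )
    return isinstance(obj, hint)


print(conforms([1, 2], ListOf[int]))  # True
print(conforms([1, 2], ListOf[str]))  # False
print(conforms((1, 2), ListOf[int]))  # False: a tuple is not a list
```

The design point is that the item check lives in the metadata rather than in the base type, so the structural check and the content check stay independent.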
-
Hi, another question that would make the solution mentioned before even better. In your answer:

class SequenceProtocolOf(object):
    '''
    Type hint factory class dynamically creating and returning new
    ``Annotated[SequenceProtocol[...], ...]`` type hints, subscripted by the
    passed type.

    Parameters
    ----------
    X : object
        Arbitrary child type hint with which to subscript the
        :class:`SequenceProtocol` protocol.

    Returns
    ----------
    Annotated
        ``Annotated[SequenceProtocol[X], ...]`` type hint validating that *all*
        items of this sequence satisfy this child type hint.
    '''

    @classmethod
    def __class_getitem__(cls, X: object) -> Annotated:
        return Annotated[SequenceProtocol[X], Is[
            lambda lst: all(is_bearable(i, X) for i in lst)]]

print(is_bearable([1, 2], SequenceProtocol))
print(is_bearable([1, 2], SequenceProtocolOf[int]))
print(is_bearable([1, 2], SequenceProtocolOf[str]))
print(is_bearable([[1, 2], [3, 4]], SequenceProtocolOf[SequenceProtocolOf[int]]))
print(is_bearable([[1, 2], [3, 4]], SequenceProtocolOf[SequenceProtocolOf[str]]))

I didn't notice the first print statement actually checks for the SequenceProtocol class rather than SequenceProtocolOf.

What I would like is
-
OMG!!! I can't believe this actually works... but it does. The CPython devs responsible for the standard typing module are very strict and typically try to lock down monkey-patching shenanigans like this. Looks like they forgot to prepare for @beartype:

class SequenceProtocol(Protocol[T]):
    '''
    Type hint factory class dynamically creating and returning new
    ``Annotated[SequenceProtocol[...], ...]`` type hints, subscripted by the
    passed type.

    Parameters
    ----------
    X : object
        Arbitrary child type hint with which to subscript the
        :class:`SequenceProtocol` protocol.

    Returns
    ----------
    Annotated
        ``Annotated[SequenceProtocol[X], ...]`` type hint validating that *all*
        items of this sequence satisfy this child type hint.
    '''

    def __iter__(self) -> Iterable[T]: ...
    def __getitem__(self, item) -> T: ...
    def __len__(self) -> int: ...

    @classmethod
    def __class_getitem__(cls, X: object) -> Annotated:
        return Annotated[cls, Is[
            lambda lst: all(is_bearable(i, X) for i in lst)]]

print(is_bearable([1, 2], SequenceProtocol))
print(is_bearable([1, 2], SequenceProtocol[int]))
print(is_bearable([1, 2], SequenceProtocol[str]))
print(is_bearable([[1, 2], [3, 4]], SequenceProtocol[SequenceProtocol[int]]))
print(is_bearable([[1, 2], [3, 4]], SequenceProtocol[SequenceProtocol[str]]))

...which prints:

True
True
False
True
False

🤣 The only issue, of course, is that this doesn't actually do what you want. It just looks like it does what you want. Although standard Python sequences like

To get this to work for all possible third-party sequence types, we would somehow need to exclude our __class_getitem__() method from the protocol API:

# Looks pretty weirdo, bro.
from beartype.typing import omit_method_from_protocol_api

class SequenceProtocol(Protocol[T]):
    def __iter__(self) -> Iterable[T]: ...
    def __getitem__(self, item) -> T: ...
    def __len__(self) -> int: ...

    @omit_method_from_protocol_api  # <-- so weirdo, so rando
    @classmethod
    def __class_getitem__(cls, X: object) -> Annotated:
        return Annotated[cls, Is[
            lambda lst: all(is_bearable(i, X) for i in lst)]]

😭 We are now bumping up against the finite limit of bugs in the @beartype and Python codebases.
-
I think the string case is still unresolved:

is_bearable("teststring", SequenceProtocol)  # -> True

Also, unfortunately, the accepted answer is still not working as we thought it did. It works for list, tuple and pd.Series, but not for NumPy arrays.

import numpy as np

from beartype.door import is_bearable
from beartype.vale import Is
from beartype.typing import Annotated, TypeVar, Protocol, Iterable

T = TypeVar("T")

class SequenceProtocol(Protocol[T]):
    def __iter__(self) -> Iterable[T]: ...
    def __getitem__(self, item) -> T: ...
    def __len__(self) -> int: ...

class SequenceProtocolOf(object):
    '''
    Type hint factory class dynamically creating and returning new
    ``Annotated[SequenceProtocol[...], ...]`` type hints, subscripted by the
    passed type.

    Parameters
    ----------
    X : object
        Arbitrary child type hint with which to subscript the
        :class:`SequenceProtocol` protocol.

    Returns
    ----------
    Annotated
        ``Annotated[SequenceProtocol[X], ...]`` type hint validating that *all*
        items of this sequence satisfy this child type hint.
    '''

    @classmethod
    def __class_getitem__(cls, X: object) -> Annotated:
        return Annotated[SequenceProtocol[X], Is[
            lambda lst: all(is_bearable(i, X) for i in lst)]]

print(is_bearable(np.array([1, 2]), SequenceProtocol))
print(is_bearable(np.array([1, 2]), SequenceProtocolOf[int]))
print(is_bearable(np.array([[1, 2], [3, 4]]), SequenceProtocolOf[SequenceProtocolOf[int]]))

returns
-
Yup. That's exactly it. I should probably escalate this to an official @beartype bug. There's not much we can do about the official
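On the string question specifically, one blunt workaround is to pair the structural check with an explicit exclusion of text and byte strings. The sketch below is only an illustration of that idea using the standard library: SequenceLike and is_non_string_sequence() are hypothetical names, and the choice to reject both str and bytes is an assumption about what "not a string" should mean here.

```python
from typing import Iterable, Protocol, TypeVar, runtime_checkable

T = TypeVar("T")


@runtime_checkable
class SequenceLike(Protocol[T]):
    """Hypothetical stand-in for the protocol discussed in this thread."""

    def __iter__(self) -> Iterable[T]: ...
    def __getitem__(self, item) -> T: ...
    def __len__(self) -> int: ...


def is_non_string_sequence(obj: object) -> bool:
    # Structural check first, then an explicit veto for str and bytes,
    # which both define all three methods.
    return isinstance(obj, SequenceLike) and not isinstance(obj, (str, bytes))


print(is_non_string_sequence([1, 2]))  # True
print(is_non_string_sequence((1, 2)))  # True
print(is_non_string_sequence("test"))  # False
print(is_non_string_sequence(b"raw"))  # False
print(is_non_string_sequence(42))      # False
```

The same predicate could presumably be attached to a hint through a validator such as the Is[...] pattern used earlier in the thread, though that combination is not tested here.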