[Feature Request] Support for os.PathLike #314

Open
kaparoo opened this issue Dec 5, 2023 · 8 comments

Comments

@kaparoo

kaparoo commented Dec 5, 2023

TL;DR

How can I make this code work?

@beartype
def for_any_pathlike(path: str | os.PathLike[str]):
    ...

for_any_pathlike("awesome_text.txt")  # it works!
for_any_pathlike(pathlib.Path("useful_data.csv"))  # raises `beartype.roar.BeartypeDecorHintNonpepException` 

Content

There is a type alias StrPath defined in typeshed as

StrPath = str | os.PathLike[str]

Because typeshed is not accessible at runtime, I tried to define StrPath myself and use it with @beartype like this:

from os import PathLike
from pathlib import Path

from beartype import beartype

StrPath = str | PathLike[str]

@beartype
def foo(path: StrPath) -> None:
    print(f"{path=!s}")

foo("test.py")  # "test.py"
foo(Path("test.py"))  # raises `beartype.roar.BeartypeDecorHintNonpepException`

The function foo worked as expected for "test.py" (a string), but it raised beartype.roar.BeartypeDecorHintNonpepException for Path("test.py"), which (at least under mypy) is treated as PathLike[str]:

beartype.roar.BeartypeDecorHintNonpepException: Function __main__.foo() parameter "path" type hint os.PathLike[str] either PEP-noncompliant or currently unsupported by @beartype.

I presume this is because os.PathLike provides its generic interface as follows:

GenericAlias = type(list[int])

class PathLike(abc.ABC):
    ...
    __class_getitem__ = classmethod(GenericAlias)

Therefore I hope beartype supports ABCs like os.PathLike. Currently, I use StrPath = str | Path and there is no problem in 99% of my cases, but it would be better to base it on os.PathLike.
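
To make the mechanism above concrete, here is a minimal sketch that inspects what subscripting os.PathLike produces at runtime. It assumes Python 3.9 or newer and uses only the standard library; it is an illustration, not beartype's own detection code.

```python
import os
import types
import typing
from pathlib import Path

# Subscripting os.PathLike goes through __class_getitem__, which simply
# builds a types.GenericAlias object; no new class is created.
hint = os.PathLike[str]

alias_type = type(hint)           # the C-based types.GenericAlias
origin = typing.get_origin(hint)  # the unsubscripted ABC, os.PathLike
args = typing.get_args(hint)      # the subscript, (str,)

# The unsubscripted ABC is an ordinary isinstanceable class.
path_is_pathlike = isinstance(Path("test.py"), os.PathLike)

# The subscripted alias is not isinstanceable: isinstance() rejects it.
try:
    isinstance(Path("test.py"), hint)
    alias_is_isinstanceable = True
except TypeError:
    alias_is_isinstanceable = False

print(alias_type, origin, args)
print(path_is_pathlike, alias_is_isinstanceable)
```

So a runtime checker cannot pass os.PathLike[str] straight to isinstance(); it has to look at the origin and the subscript separately.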

@leycec
Member

leycec commented Dec 6, 2023

...heh. Oh, os.PathLike – the red-headed stepchild of the typing world. Officially, the semantics of os.PathLike[...] have yet to be standardized; since there's no existing standard, everyone just makes up their own interpretation of what subscripting os.PathLike actually means. Unofficially, mypy currently enforces the following ad-hoc constraints on os.PathLike subscriptions:

S = TypeVar('S', bound=Union[str, bytes])
class PathLike(Generic[S], extra=os.PathLike): # or Protocol[S] in future
    def __fspath__(self) -> S:
        ...

Given that, @beartype can effectively type-check a type hint os.PathLike[T] as follows:

  • An arbitrary object obj satisfies a type hint os.PathLike[T] if and only if:
      isinstance(obj, os.PathLike) and
      isinstance(obj.__fspath__(), T)

That is to say:

  1. obj must be an instance of os.PathLike (i.e., must define an __fspath__() dunder method).
  2. The value returned by calling the obj.__fspath__() dunder method must be an instance of the same type T subscripting the original type hint os.PathLike[T].
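
The two steps above can be sketched in plain Python as follows. The helper name satisfies_pathlike and the toy BytePath class are hypothetical, introduced only for illustration; this is not how @beartype implements the check internally. Note that the sketch calls __fspath__() on the object, which assumes that call is cheap and free of side effects.

```python
import os
from pathlib import Path


def satisfies_pathlike(obj: object, path_type: type) -> bool:
    """Hypothetical helper: deep-check obj against os.PathLike[path_type]."""
    # Step 1: obj must define __fspath__(). The os.PathLike subclass hook
    # accepts any class that defines this dunder method.
    if not isinstance(obj, os.PathLike):
        return False
    # Step 2: the value returned by __fspath__() must be a path_type.
    return isinstance(obj.__fspath__(), path_type)


class BytePath:
    """Toy path-like whose __fspath__() returns bytes rather than str."""

    def __fspath__(self) -> bytes:
        return b"test"


print(satisfies_pathlike(Path("test.py"), str))  # True
print(satisfies_pathlike(BytePath(), str))       # False
print(satisfies_pathlike(BytePath(), bytes))     # True
print(satisfies_pathlike("test.py", str))        # False: str has no __fspath__()
```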

Complications Arise, Yo

Of course, there are complications. The os.PathLike generic is way too permissive at runtime. Notably, os.PathLike allows itself to be subscripted by arbitrary type hints that make no sense: e.g.,

>>> from os import PathLike
>>> PathLike[list[int]]
PathLike[list[int]]  # <-- uhh. wut? this makes no sense, Python. *facepalm*

Supporting os.PathLike subscriptions thus also requires @beartype to handle erroneous subscriptions that make no semantic sense. Indeed, I think there are only four valid parametrizations of os.PathLike:

  • os.PathLike[str], implying isinstance(obj, os.PathLike) and isinstance(obj.__fspath__(), str).
  • os.PathLike[bytes], implying isinstance(obj, os.PathLike) and isinstance(obj.__fspath__(), bytes).
  • os.PathLike[typing.AnyStr], which reduces to just os.PathLike, implying just isinstance(obj, os.PathLike). The child type hint typing.AnyStr is semantically meaningless.
  • os.PathLike[typing.Any], which reduces to just os.PathLike, implying just isinstance(obj, os.PathLike). The child type hint typing.Any is semantically meaningless. Unsurprisingly, this is basically a duplicate of the prior subscription.
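
One way to picture the reduction described in the list above is the sketch below. The function reduce_pathlike_hint is a hypothetical name, and raising TypeError for meaningless subscriptions is an assumption about what a checker could do, not a description of what @beartype does.

```python
import os
import typing


def reduce_pathlike_hint(hint):
    """Hypothetical helper: return the type that __fspath__() must return
    for this os.PathLike hint, or None when any return type is acceptable."""
    # Unsubscripted os.PathLike places no constraint on __fspath__().
    if hint is os.PathLike:
        return None
    if typing.get_origin(hint) is not os.PathLike:
        raise TypeError(f"{hint!r} is not an os.PathLike type hint")

    args = typing.get_args(hint)
    if len(args) != 1:
        raise TypeError(f"{hint!r} must be subscripted by exactly one argument")
    (arg,) = args

    # The two subscriptions that carry real meaning.
    if arg is str or arg is bytes:
        return arg
    # The two subscriptions that reduce to plain os.PathLike.
    if arg is typing.Any or arg is typing.AnyStr:
        return None
    # Everything else (e.g., os.PathLike[list[int]]) makes no sense.
    raise TypeError(f"{hint!r} is not a meaningful subscription")


print(reduce_pathlike_hint(os.PathLike[str]))
print(reduce_pathlike_hint(os.PathLike[bytes]))
print(reduce_pathlike_hint(os.PathLike[typing.AnyStr]))
print(reduce_pathlike_hint(os.PathLike[typing.Any]))
```

Because CPython accepts any subscript, the rejection of nonsense such as os.PathLike[list[int]] has to live in the checker itself.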

That's it. I think? Still, that's pretty annoying. That's yet more bureaucratic minutiae that @beartype has to manually implement, because CPython refused to do its job here.

"Urgh!"*

Definitely Feasible, Bro

That's... definitely feasible. Rejoice, everyone. Sadly, I also kinda have no volunteer time to do this. It doesn't help that CPython itself fails to validate os.PathLike[...] subscriptions and just lets anybody subscript os.PathLike by literally anything. 🤦

For now, would you just mind dropping the subscription [str]? That is, would you mind just doing this:

StrPath = str | PathLike

That should absolutely work out-of-the-box without requiring any changes to @beartype itself.

> Therefore I hope beartype supports ABCs like os.PathLike.

@beartype definitely supports ABCs like os.PathLike – and always has, thankfully. It's just subscripting ABCs with arbitrary type hints like os.PathLike[str] and os.PathLike[list[int]] where complications arise. None of that has been standardized. So, @beartype has to just make up its own semantics and pretend that it knows what it's talking about. </sigh>

@kaparoo
Author

kaparoo commented Dec 7, 2023

Thank you for your kind comment @leycec!

> For now, would you just mind dropping the subscription [str]? That is, would you mind just doing this:
>
> StrPath = str | PathLike

The StrPath I want is either str or an object with a __fspath__ method that returns str, so it seems a little different from what you suggested. 😅

import os
from pathlib import Path
from beartype import beartype

StrPath = str | os.PathLike

@beartype
def foo(path: StrPath):
    print(os.fspath(path))

foo("test")  # it works (obviously)
foo(Path("test"))  # okay

class MyPath:
    def __init__(self, path):
        self.path = path

    def __fspath__(self):
        return self.path
    
foo(MyPath(b"test"))  # it also works...

So I implemented a custom PathLike, following the definition from the mypy issue:

import os
from pathlib import Path
from typing import Protocol, TypeVar, runtime_checkable

from beartype import beartype

T = TypeVar("T", str, bytes)

@runtime_checkable
class PathLike(Protocol[T]):
    def __fspath__(self) -> T:
        ...

StrPath = str | PathLike[str]

@beartype
def foo(path: StrPath):
    print(os.fspath(path))

foo("test")
foo(Path("test"))

# foo(b"test")  # `foo` blocked `bytes` (finally!)

class BytePath:
    def __init__(self, path: str):
        self.path = bytes(path, encoding="UTF-8")

    def __fspath__(self) -> bytes:
        return self.path

foo(BytePath("test"))  # no..! it works!

I know that a @runtime_checkable Protocol only checks that its methods exist, not their signatures or return types.
Is there any way to solve this without extra validators in beartype?
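
The existence-only behavior can be reproduced with the standard library alone. In the sketch below, the protocol name SupportsFspath is hypothetical and chosen to avoid shadowing os.PathLike; no beartype is involved.

```python
from typing import Protocol, TypeVar, runtime_checkable

T = TypeVar("T", str, bytes)


@runtime_checkable
class SupportsFspath(Protocol[T]):
    def __fspath__(self) -> T: ...


class BytePath:
    def __fspath__(self) -> bytes:
        return b"test"


# isinstance() against the unsubscripted protocol only asks whether the
# object has an attribute named __fspath__, so BytePath passes.
structural_match = isinstance(BytePath(), SupportsFspath)

# The subscripted protocol cannot be passed to isinstance() at all.
try:
    isinstance(BytePath(), SupportsFspath[str])
    subscripted_ok = True
except TypeError:
    subscripted_ok = False

print(structural_match)  # True
print(subscripted_ok)    # False
```

The [str] subscript is therefore invisible to a plain isinstance() check, which is consistent with foo(BytePath("test")) being accepted above.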

@leycec
Member

leycec commented Dec 8, 2023

Yuppers. The custom Protocol approach also totally works. Clever workaround there. Sadly...

> Is there any way to solve this without extra validators in beartype?

Tragically, not at the moment. When @beartype fails to do what you want out-of-the-box, extra validators save you from @beartype's personal failings. Validators save you from us.

Thankfully, extra validators would allow you to both have your typing cake and eat it too. Since there are only two meaningful subscriptions of PathLike (i.e., PathLike[str] and PathLike[bytes]), you only need two beartype validators to make this magic happen. In fact, since you don't even care about PathLike[bytes], that's just one beartype validator. Pure static type-checkers like mypy and pyright will continue to support this as expected. Moreover, this approach avoids the need to define your own custom runtime Protocol subclass. It's simpler, faster, and simply magical.

Behold! PathLikeStr, arise like our Bengal cat from a crusty-eyed 14-hour slumber in our cat tree:

from beartype.vale import Is
from os import PathLike
from typing import TYPE_CHECKING, Annotated

# PathLike[str], but actually supported by everybody.
if TYPE_CHECKING:
    PathLikeStr = PathLike[str]
else:
    PathLikeStr = Annotated[PathLike, Is[
        lambda path_like: isinstance(path_like.__fspath__(), str)]]

# This is the way.
StrPath = str | PathLikeStr

Now, let's prove that this actually does what @leycec promises it does:

# Import boring stuff.
>>> from beartype import beartype
>>> from os import fspath
>>> from pathlib import Path

# Prove that good always prevails.
>>> @beartype
... def foo(path: StrPath) -> str:
...     return fspath(path)
>>> foo("test")
'test'  # <-- good.
>>> foo(Path("more test"))
'more test'  # <-- GOOD.

# Now for the coup de grace.
>>> class BytePath(object):
...     def __init__(self, path: str):
...         self.path = bytes(path, encoding="UTF-8")
... 
...     def __fspath__(self) -> bytes:
...         return self.path
>>> foo(BytePath("ugly test"))
Traceback (most recent call last):  # <-- Suck it, "BytePath". Just suck it.
  File "/home/leycec/tmp/mopy.py", line 31, in <module>
    foo(BytePath("ugly test"))
  File "<@beartype(__main__.foo) at 0x7f3acf351c60>", line 32, in foo
beartype.roar.BeartypeCallHintParamViolation: Function __main__.foo() parameter path=<__main__.BytePath object at 0x7f3acec014c0> violates type hint typing.Union[str, typing.Annotated[os.PathLike, Is[lambda path_like: isinstance(path_like.__fspath__(), str)]]], as <class "__main__.BytePath"> <__main__.BytePath object at 0x7f3acec014c0>:
* Not str.
* <class "__main__.BytePath"> <__main__.BytePath object at 0x7f3acec014c0> violates validator Is[lambda path_like: isinstance(path_like.__fspath__(), str)]:
    False == Is[lambda path_like: isinstance(path_like.__fspath__(), str)].

Typing cake: it tastes delicious. 😋

@kaparoo
Author

kaparoo commented Dec 8, 2023

Oh... this is the best way in the current situation...
Okay. I'll take this 🥲
Thanks @leycec ! 🙇

@kaparoo kaparoo closed this as completed Dec 8, 2023
@leycec
Member

leycec commented Dec 8, 2023

You're most welcome. As a gentle reminder to myself to eventually implement this properly for everybody, would you mind if I quietly reopen this feature request? With any luck, I'll tackle stuff like this in 2024. Let the hype train begin! 🎆

@leycec leycec reopened this Dec 8, 2023
@kaparoo
Author

kaparoo commented Dec 8, 2023

Sure! Thank you for your attention! 🙏

leycec added a commit that referenced this issue Dec 8, 2023
This commit is the first in a commit chain adding support for
**unrecognized subscripted builtin type hints** (i.e., C-based type
hints that are *not* isinstanceable types, instantiated by subscripting
pure-Python origin classes subclassing the C-based `types.GenericAlias`
superclass such that those classes are unrecognized by @beartype and
thus *not* type-checkable as is), en-route to partially resolving both
feature requests #219 kindly submitted by extreme Google X guru
@patrick-kidger (Patrick Kidger) *and* #314 kindly submitted by adorable
black hat ML fiend @kaparoo (Jaewoo Park). Specifically, this commit
adds support for:

* Internally detecting such hints.
* Internally reducing such hints to their unsubscripted origin classes
  (which are almost always pure-Python isinstanceable types and thus
  type-checkable as is).

Naturally, nothing is tested; everything is suspicious.
(*Really boring jam in a boreal jamboree!*)
leycec added a commit that referenced this issue Dec 14, 2023
This commit is the next in a commit chain adding support for
**unrecognized subscripted builtin type hints** (i.e., C-based type
hints that are *not* isinstanceable types, instantiated by subscripting
pure-Python origin classes subclassing the C-based `types.GenericAlias`
superclass such that those classes are unrecognized by @beartype and
thus *not* type-checkable as is), partially resolving both feature
requests #219 kindly submitted by extreme Google X guru @patrick-kidger
(Patrick Kidger) *and* #314 kindly submitted by adorable black hat ML
fiend @kaparoo (Jaewoo Park). Specifically, this commit finalizes
working support for shallowly type-checking PEP-noncompliant type hints
bundled with the `typeshed` -- including:

* `os.PathLike[...]` type hints.
* `weakref.ref[...]` type hints.

This commit also extensively tests that @beartype now shallowly
type-checks `os.PathLike[...]` type hints. (*Unitary Unitarians are IT units!*)
@leycec
Member

leycec commented Dec 14, 2023

Partially resolved by fe5b23d. @beartype's upcoming 0.17.0 release (...to be released before Santa Bear Claws gives us all the video games) shallowly type-checks os.PathLike[...] type hints.

In fact, @beartype 0.17.0 shallowly type-checks all weirdo non-standard type hints in Python's typeshed. This includes os.PathLike[...], weakref.ref[...], and even more hideous monsters as yet unknown to man and beaver alike. @beartype 0.17.0 no longer raises exceptions when confronted with this sort of typing gore.

This means that your above verbose definition of PathLikeStr can be reduced to this terse one-liner:

from beartype.vale import Is
from os import PathLike
from typing import Annotated

# PathLike[str], but actually supported by everybody.
PathLikeStr = Annotated[PathLike[str], Is[  # <-- much simpler, much terser, much gooder
    lambda path_like: isinstance(path_like.__fspath__(), str)]]

# This is the way.
StrPath = str | PathLikeStr

Rejoice! Santa Bear Claws has heard your prayers. 🎅
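
Why the validator is still worth keeping under shallow checking can be sketched with the standard library alone. The helpers shallow_check and deep_check_str are hypothetical stand-ins, assuming that "shallow" means checking only the unsubscripted origin class; they are not @beartype's implementation.

```python
import os
import typing
from pathlib import Path


def shallow_check(obj: object, hint) -> bool:
    """Hypothetical stand-in for shallow checking: ignore the subscript
    and test only against the unsubscripted origin class."""
    origin = typing.get_origin(hint) or hint
    return isinstance(obj, origin)


def deep_check_str(obj: object) -> bool:
    """Shallow check plus the same predicate as the validator above."""
    return shallow_check(obj, os.PathLike[str]) and isinstance(
        obj.__fspath__(), str
    )


class BytePath:
    def __fspath__(self) -> bytes:
        return b"ugly test"


print(shallow_check(Path("test"), os.PathLike[str]))  # True
print(shallow_check(BytePath(), os.PathLike[str]))    # True: subscript ignored
print(deep_check_str(Path("test")))                   # True
print(deep_check_str(BytePath()))                     # False
```

Under that assumption, a bare os.PathLike[str] hint would let BytePath through, and the extra predicate on __fspath__() is what rejects it.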

@kaparoo
Author

kaparoo commented Dec 15, 2023

Brilliant! I miss Santa already 🎅

leycec added a commit that referenced this issue Dec 16, 2023
This commit is the next in a commit chain adding support for
**unrecognized subscripted builtin type hints** (i.e., C-based type
hints that are *not* isinstanceable types, instantiated by subscripting
pure-Python origin classes subclassing the C-based `types.GenericAlias`
superclass such that those classes are unrecognized by @beartype and
thus *not* type-checkable as is), partially resolving both feature
requests #219 kindly submitted by extreme Google X guru @patrick-kidger
(Patrick Kidger) *and* #314 kindly submitted by adorable black hat ML
fiend @kaparoo (Jaewoo Park). Specifically, this commit extensively
tests that @beartype now shallowly type-checks `weakref.ref[...]` type
hints. (*Sanguinary exsanguination extinguishes a distinguished canary!*)
leycec added a commit that referenced this issue Dec 16, 2023
This commit is the next in a commit chain adding support for
**unrecognized subscripted builtin type hints** (i.e., C-based type
hints that are *not* isinstanceable types, instantiated by subscripting
pure-Python origin classes subclassing the C-based `types.GenericAlias`
superclass such that those classes are unrecognized by @beartype and
thus *not* type-checkable as is), partially resolving both feature
requests #219 kindly submitted by extreme Google X guru @patrick-kidger
(Patrick Kidger) *and* #314 kindly submitted by adorable black hat ML
fiend @kaparoo (Jaewoo Park). Specifically, this commit resolves a
Python 3.9-specific issue concerning `weakref.ref[...]` type hints.
Notably, the `weakref.ref` class improperly advertises itself as a
builtin type under *only* Python 3.9. @beartype now correctly detects
and circumvents this misadvertisement. (*Considerable concern sidles idly by Admiral Admirable!*)