
[Feature Request] Support typing.get_overloads() under Python ≥ 3.11 #54

Open
ruancomelli opened this issue Oct 5, 2021 · 27 comments


@ruancomelli

Greetings

Hi, dear weird bear aficionado! First of all, thanks for this awesome library! I've just started using it and it looks great, I can't wait to write runtime-type-safe functions everywhere.

What I would like to see

One functionality that I would love to see here is the ability to write overloaded functions, akin to typing.overload, but at runtime. Something like this:

from beartype import overload

@overload
def greet(name: str, age: int) -> None: ...

@greet.overload
def greet(age: int) -> None: ...

@greet.implement
def greet(name_or_age, age=None):
    if age is None:
        age = name_or_age

        if age > 20:
            print('Hello, unknown person! You are allowed to enter.')
        else:
            print('Go back home, unknown child!')
    else:
        name = name_or_age

        if age > 20:
            print(f'Hello, {name}! You are allowed to enter.')
        else:
            print(f'Go back home, {name}!')

greet('Ruan', 24) # prints "Hello, Ruan! You are allowed to enter."
greet(24) # prints "Hello, unknown person! You are allowed to enter."
greet('Ruan') # ROOOOAARRR!!! - because there are no overloads matching the signature "greet(str)"

Note that this is not equivalent to writing greet(name_or_age: Union[str, int], age: Optional[int] = None): under that single signature, a call like greet(10, 10) would be accepted, even though no overload above permits it.

Basically, beartype should test function calls against all of the overloaded signatures. If any of them matches, it's okay. Otherwise, ROAR!
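That per-call matching rule can be sketched with a small helper (hypothetical names; shallow isinstance() checks only, so Union, Optional, and generic hints are deliberately out of scope here):

```python
import inspect

def call_matches(func, args, kwargs) -> bool:
    """Return True only if calling func(*args, **kwargs) would both
    bind to func's signature and satisfy its parameter annotations."""
    sig = inspect.signature(func)
    # Reject the call if it cannot even bind to this signature
    # (wrong arity, unknown keyword argument, etc.).
    try:
        bound = sig.bind(*args, **kwargs)
    except TypeError:
        return False
    # Shallowly type-check each bound argument against its annotation.
    for name, value in bound.arguments.items():
        annotation = sig.parameters[name].annotation
        if annotation is inspect.Parameter.empty:
            continue
        if not isinstance(value, annotation):
            return False
    return True
```

Given the greet() overloads above, call_matches would accept ('Ruan', 24) against the two-parameter overload and (24,) against the one-parameter one, while rejecting ('Ruan',) and (10, 10) against both – the ROAR case.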

Also note that here I separated the @overload part from the @implement one. This is because...

What I am not proposing

I am not proposing function dispatching here. So the following is not what I wish to see:

from beartype import dispatch

@dispatch
def greet(name: str, age: int) -> None:
    if age > 20:
        print(f'Hello, {name}! You are allowed to enter.')
    else:
        print(f'Go back home, {name}!')

@dispatch
def greet(age: int) -> None:
    if age > 20:
        print('Hello, unknown person! You are allowed to enter.')
    else:
        print('Go back home, unknown child!')

Function dispatching brings a lot of issues, such as deciding which implementation to choose for a given signature. Besides, there are already libraries out there that implement single and even multiple dispatch; I don't believe that is a job for beartype.

So...

That is why I separated the @overload and the @implement parts. The way I see it, beartype should first check for the first overload, and then the second, the third and so on until one matching overload is found. If this is the case, then beartype should just call the @implement part without any further checks.

Does that make sense to you? Looking forward to seeing your opinion on this!

@leycec
Member

leycec commented Oct 6, 2021

Hi, dear weird bear aficionado!

In Canada, such individuals are commonly referred to as... Canadians. badum ching sound effect

But I jest. Hi, masterful Brazilian tech-lead maestro! Let's do everything we can for your crew and this delightfully thought-provoking feature request. 🇧🇷 🌴 🇧🇷

Also, did I mention your English is impeccable? Because it is. I can barely spell-check most words in my native language, and you're over here rocking English, Brazilian Portuguese, Python, and academic jargon (which is like its own horrifying language). Le sigh.

One functionality that I would love to see here is the ability to write overloaded functions, akin to typing.overload, but at runtime.

Ah-ha! Type-checking bears everywhere would likewise love to see this feature. As you slyly insinuate, typing.overload itself is hostile to our runtime type-checking interests:

The @overload-decorated definitions are for the benefit of the type checker only... [and are] used at runtime but should be ignored by a [runtime] type checker.

wut

Thus your runtime-friendly syntax – which, like your English, is impeccable beyond all reasoning. I'm so appreciative of the time you invested in this. It's like a full-blown PEP, but authored just for me. I can feel the love for this topic radiating from here.

Big Issue Is Big

There's only one issue and it is a Big Issue™:

# mypy loves this! Bless you, mypy.
$ mypy -c "
from typing import overload

@overload
def roar(panic_button: int) -> int: ...
@overload
def roar(panic_button: float) -> float: ...
def roar(panic_button):
    return panic_button"
Success: no issues found in 1 source file

# mypy hates this. Curse you, mypy!
$ mypy -c "
def beartype_overload(func): return func

@beartype_overload
def roar(panic_button: int) -> int: ...
@beartype_overload
def roar(panic_button: float) -> float: ...
def roar(panic_button):
    return panic_button"
<string>:3: error: Name "roar" already defined on line 4
<string>:4: error: Name "roar" already defined on line 4
Found 2 errors in 1 file (checked 1 source file)

The Big Issue™ is mypy hates redefinition of callables not explicitly decorated by the @typing.overload decorator. Because mypy disfavours type-checking bears on general principle, mypy doesn't care that we've defined our own @beartype_overload decorator that's soooo much cooler, wiser, and older than the official @typing.overload decorator.

There's no way for us to inform mypy that our approach is better than the official approach. So, mypy being mypy, mypy just unconditionally flags everything as bad, throws up a nuclear hellstorm of red error messages, and fails with non-zero exit status. 💢

Mypy Is Not Why I Came Here

You are now wondering: "Bro, why are we even talking about mypy? I ate my pie for breakfast!"

Yes, yes. We all hate mypy. ...just kidding, mypy devs? For better or worse, mypy is still the definitive type checker. Its word is the iron law that we all type-check by – even more so than the official PEP standards that mypy implements.

If we violate mypy, we violate PEP standards. In this case, we violate both PEP 484 (which standardizes callable overloading) and PEP 561 (which equates mypy conformance with py.typed files). That would be bad, because...

If we violate PEP standards, the fabric of quality assurance itself comes crashing down around our soothing mechanical keyboards and we stand dumb-founded as the smart money quietly abandons beartype for competing constant-time runtime type-checkers that actually preserve PEP standards.

So we don't violate mypy.

Then There Is No Hope for Us

Ah-ha! Fear not, Master Comelli. All is not well – but all is not lost, either. Thanks to the hideous power of monkey-patching, we can preserve compatibility with both mypy and existing PEP standards while still having our cake and eating it, too.

Specifically, we can "improve" our beartype.__init__ submodule (implicitly called on the first external importation of the beartype package) to dynamically replace the useless but safe @typing.overload decorator with our own useful but dangerous beartype-specific @beartype.overload decorator.

Can we implement @beartype.overload to do what everyone wants? I think so. In my balding head, I am vaguely envisioning logic that resembles:

from typing import Any, Callable, TypeVar

T = TypeVar('T', bound=Callable[..., Any])

def beartype_overload(func: T) -> T:
    # Dictionary mapping from the name to value of all
    # attributes declared in the global scope of the passed
    # callable.
    global_name_to_value = func.__globals__

    # Most recently declared callable overloading the
    # passed callable if any *OR* "None" otherwise.
    func_overloaded = global_name_to_value.get(func.__name__)

    # List of all callables overloading the passed callable
    # if any *OR* "None" otherwise. Note getattr() rather than
    # a dictionary lookup: callables have no get() method.
    funcs_overloaded = getattr(
        func_overloaded, '__beartype_funcs_overloaded', None)

    # If the passed callable has *not* already been declared in
    # the current scope, attach a new private beartype-specific
    # attribute to this callable recording all overloaded
    # alternatives of this callable to be subsequently declared.
    if funcs_overloaded is None:
        func.__beartype_funcs_overloaded = [func]
    # Else, the passed callable has already been declared in
    # the current scope. In this case, append the currently
    # overloaded alternative of this callable to the existing
    # private beartype-specific attribute recording overloads.
    else:
        func.__beartype_funcs_overloaded = (
            funcs_overloaded + [func])

    # Return the currently overloaded alternative as is.
    return func

# Monkey-patch us up the bomb. Suck it, mypy!
import typing
typing.overload = beartype_overload

So, we have now (possibly successfully) defined an ad-hoc decorator recording all overloaded alternatives of any arbitrary callable. We store each callable in its entirety, as the @beartype decorator will subsequently generate code type-checking these alternatives by dynamically inspecting:

  • The overloaded_func.__annotations__ dictionary (mapping each annotated parameter and the return to its type hint) of each overloaded alternative in the func.__beartype_funcs_overloaded list.
  • The overloaded_func.__code__ object (exposing, among other things, the number and kinds of accepted parameters) of each overloaded alternative in the func.__beartype_funcs_overloaded list.
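Both attributes are standard CPython function metadata, readily inspectable:

```python
def greet(name: str, age: int) -> None: ...

# Type hints for every annotated parameter and the return:
assert greet.__annotations__ == {'name': str, 'age': int, 'return': None}

# Arity and parameter names, courtesy of the code object:
assert greet.__code__.co_argcount == 2
assert greet.__code__.co_varnames[:2] == ('name', 'age')
```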

Given that, we then refactor the @beartype decorator to:

  1. Detect that a callable has been overloaded by checking whether that callable defines the __beartype_funcs_overloaded attribute.
  2. If so, generate code type-checking that parameters and returns satisfy at least one overloaded alternative. As you smartly suggest, iterative testing in declaration order is almost certainly the Right Way™ to do this: "The way I see it, beartype should first check for the first overload, and then the second, the third and so on until one matching overload is found." Yes.
  3. Else, do what we currently do.
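Steps 1–3 might be sketched as follows (hypothetical wrapper; shallow isinstance() checks stand in for beartype's generated code):

```python
import functools
import inspect

def beartype_check_overloads(impl):
    # Step 1: detect whether the implementation has been overloaded.
    overloads = getattr(impl, '__beartype_funcs_overloaded', None)
    if overloads is None:
        # Step 3: not overloaded -- fall back to current behavior.
        return impl

    @functools.wraps(impl)
    def wrapper(*args, **kwargs):
        # Step 2: accept the call if any overload matches, testing
        # alternatives in declaration order.
        for alternative in overloads:
            sig = inspect.signature(alternative)
            try:
                bound = sig.bind(*args, **kwargs)
            except TypeError:
                continue  # cannot even bind: try the next overload
            if all(
                sig.parameters[name].annotation is inspect.Parameter.empty
                or isinstance(value, sig.parameters[name].annotation)
                for name, value in bound.arguments.items()
            ):
                return impl(*args, **kwargs)
        raise TypeError(
            f'No overload of {impl.__name__}() matches this call. ROAR!')
    return wrapper
```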

...I Don't Feel So Good Anymore

Right there with you, bro.

On the bright side, none of this is infeasible. It's all feasible and possibly even fun! On the dark side, all of this will consume my precious life force that might better be redirected towards lower-hanging and less dangerous fruit like deep PEP 484 and 585 support.

What I'm saying is: "By the power of the Brazilian rain forest gods, someone do this for us by submitting a working PR." I will merge anything that passes tests... anything.

Until then, thanks again for your spellbinding dive into callable overloading, Ruan! We'll eventually get this done for everyone with risky behaviour like monkey-patching the core typing API. Not even mypy will be able to complain then about our dark handiwork, because not even mypy will know. ...chuckles mirthlessly

@TeamSpen210

Unfortunately that wouldn't work: since the func passed into the decorator is the new, not-yet-assigned object, you won't be given the old function, and the undecorated implementation would overwrite the decorated ones afterwards anyway. I think you'd need to require the implementation be decorated by @beartype, then in both decorators check sys._getframe() to retrieve the locals/globals and find the currently assigned function. Accessing func.__globals__ would work, but not for locally defined functions, methods, etc.

overload would create/append to a list as described, then @beartype would look for that list to see if the function is overloaded.
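A first stab at that frame-based lookup might look like this (a sketch only: sys._getframe() is a CPython implementation detail, and real code would also need to handle class bodies and other scopes):

```python
import sys

def find_prior_declaration(func):
    # The frame two levels up is the scope currently declaring
    # "func", so its locals (then globals) hold any *previous*
    # callable bound to the same name -- including inside nested
    # functions, where func.__globals__ alone comes up empty.
    frame = sys._getframe(2)  # skip this helper and the decorator
    name = func.__name__
    if name in frame.f_locals:
        return frame.f_locals[name]
    return frame.f_globals.get(name)

def beartype_overload(func):
    # Append this alternative to the list begun by any prior
    # declaration of the same name in the declaring scope.
    prior = find_prior_declaration(func)
    overloads = getattr(prior, '__beartype_funcs_overloaded', [])
    func.__beartype_funcs_overloaded = overloads + [func]
    return func
```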

@leycec
Member

leycec commented Oct 10, 2021

Bwa-hah! Team Spen 2010 is, of course, absolutely right.

The above @beartype_overload decorator was only intended as untested first-stab pseudo-code and unsurprisingly broken for obvious reasons. For all you armchair beartypers following along in your cozy bear caves, here's the real deal that should work for non-nested callables:

from typing import Any, Callable, TypeVar

T = TypeVar('T', bound=Callable[..., Any])

def beartype_overload(func: T) -> T:
    # Dictionary mapping from the name to value of all
    # attributes declared in the global scope of the passed
    # callable.
    global_name_to_value = func.__globals__

    # Most recently declared callable overloading the
    # passed callable if any *OR* "None" otherwise.
    func_overloaded = global_name_to_value.get(func.__name__)

    # List of all callables overloading the passed callable
    # if any *OR* "None" otherwise. Note getattr() rather than
    # a dictionary lookup: callables have no get() method.
    funcs_overloaded = getattr(
        func_overloaded, '__beartype_funcs_overloaded', None)

    # If the passed callable has *not* already been declared in
    # the current scope, attach a new private beartype-specific
    # attribute to this callable recording all overloaded
    # alternatives of this callable to be subsequently declared.
    if funcs_overloaded is None:
        func.__beartype_funcs_overloaded = [func]
    # Else, the passed callable has already been declared in
    # the current scope. In this case, append the currently
    # overloaded alternative of this callable to the existing
    # private beartype-specific attribute recording overloads.
    else:
        func.__beartype_funcs_overloaded = (
            funcs_overloaded + [func])

    # Return the currently overloaded alternative as is.
    return func

#FIXME: This is unsafe, thanks to order-of-importation issues.
#Specifically, this fails when the first external user module
#importing "beartype" imports "typing.overload" first. The
#GLaDOS-resembling terror AI @TeamSpen210 has an equally
#clever and frightening solution: do this, but also monkey-patch
#the original @typing.overload callable by replacing its internal
#code object (i.e., "typing.overload.__code__") with that of our
#@beartype_overload callable. In theory, that should work; in
#practice, we tread on thin ice and we just saw a crack widen.

# Monkey-patch us up the bomb. Suck it, mypy!
import typing
typing.overload = beartype_overload

Again, exactly as you say, even that generalization fails for nested callables. Thankfully, we can account for that as well by deferring to our previously defined and exhaustively tested beartype._util.func.utilfuncscope.get_func_locals() getter.

To my knowledge, that getter is the most robust algorithm for retrieving the set of all locals of a possibly deeply nested callable at decoration time. I'm pretty sure no one's done better. It has to be robust, because Python 3.10 unconditionally enabled PEP 563. If it wasn't robust, @beartype itself wouldn't work at all under Python ≥ 3.10 – but @beartype does! Yay.

So... this is all still feasible in practice, maybe? If even all the above generalizations still fail, I'm afraid @beartype simply won't be able to support function overloading – which would really suck, 'cause function overloading is all the hotness in stub files (and increasingly non-stub files, too).

Cue hypothetical sad cat. 😿

@TeamSpen210

Ah yes, you already have to deal with that for stringified annotations. (3.10 actually reverted defaulting to that, so a solution could be found that works better for runtime typing.) Instead of overriding the entire function, it might also be good to transfer the code object across. Then copies of the function imported before beartype will also be updated, and either order of imports would just work.

@leycec
Member

leycec commented Oct 10, 2021

3.10 actually reverted defaulting to that, so a solution could be found that works better for runtime typing.

Wait. Rly? Like, srsly? I mean, that's great. I'll need to immediately revert all of the assumptions @beartype makes about Python 3.10. But that's fine – more than fine, even.

Because PEP 563 is fundamentally hostile to runtime type-checking. Sure, it doubles the speed of static type-checking – but who cares about a mere doubling of an edge case? You don't break all Python parsers everywhere for a mere doubling that only benefits a vocal minority while simultaneously reducing everyone else's runtime performance.

That said... I now see that PEP 563 was merely delayed until Python 3.11, which remains scheduled for rapid release next year. At most, this buys the runtime type-checking community a brief reprieve.

That said... I also now see that PEP 649 supersedes PEP 563 with a slightly saner solution. Because @beartype already implemented an algorithm for perfect resolution of deferred annotations, we don't actually benefit here. Resolving deferred annotations will be just as expensive under PEP 649 as it was under PEP 563.

Either way, CPython devs are still breaking all Python parsers everywhere for a mere doubling that only benefits a vocal minority while simultaneously reducing everyone else's runtime performance. They're only waiting another year to do it. </sigh>

thus concludes another vacuous rant of the Type-checking Bear Conclave

@TeamSpen210

The idea is that it gives time to figure out a better solution, perhaps one of those PEPs or something entirely different - I guess check mailing lists and chime in?

@leycec
Member

leycec commented Oct 10, 2021

...I know, I know. I should do that. Gods, why are you always right!

This is like participatory democracy all over again: if you don't participate, you don't get the right to complain. And I want that right.

I love complaining. The innermost fire that burns with every internet rant is how I warm myself on frigid Canadian nights when the pellet stove has burned its last pellet, the cats are shivering inconsolably in their roosts, and the night wind howls like a thousand forlorn lynxes across the chittering lake of ice outside our crumbling doorstep.

My personal read on the situation is that the only reason we're delaying the PEP 563 (and maybe now PEP 649) rollout is that it upsets the huge, highly influential, and highly profitable FastAPI community, which leverages pydantic for its runtime validation. That's a good reason, but it should have never come to that. PEP 563 should have never passed the peer review process, but it did – which means the peer review process seems to be broken here.

PEP 649 is now being presented as the "middle ground," but my position is rather more pragmatic: both PEP 563 and 649 are bad and doing nothing is better than doing something bad. Kinda doubt Guido wants to hear that.

I guess what I'm saying is... I hate shouting into the wind. 🌬️

@leycec
Member

leycec commented Oct 10, 2021

Instead of overriding the entire function, it might also be good to transfer the code object across. Then copies of the function imported before beartype will also be updated, and either order of imports would just work.

OMFG. Gods! You are always right. It's like you're Nega-@leycec, my nefarious arch-nemesis rival from a dismal adjacent hyperplane managed by the Aperture Science Innovators-funded GLaDOS.

You... you're not GLaDOS, are you? I mean, GLaDOS probably wouldn't admit to that. But you kinda seem omnipotent, which is scaring the cats over here a little bit. 🤖

I never even knew you could monkey-patch a callable by quietly replacing its code object while no one was looking, preserving the superficial shell of the original callable like an alien xenomorph implanted into the turgid belly of an unsuspecting Nostromo miner.

That's wicked, bro. I love it. Pragmatically, I have no idea how to safely do that – but where this is a devious will, there is a devious way. I've updated the last code snippet with a FIXME: comment to that effect.

@TeamSpen210

I'm not, no – just someone who likes looking at internals. If you do do it, though, the transplanted code object will still use typing as its global namespace, so you'll need to add any additional globals there as well. It's pretty straightforward (no idea what PyPy will do, though):

from typing import overload

def beartype_overload(func):
    ...

# Swap our code object into the stock decorator in place, so
# every existing reference to typing.overload is updated too.
overload.__code__ = beartype_overload.__code__

It might be desirable to do some checks to ensure another module hasn't overridden the method, and warn in that case - maybe revert to normally monkey patching, and call the overridden version at the end so hopefully they chain together.

@leycec
Member

leycec commented Oct 10, 2021

...so you'll need to add any additional globals there as well.

Right-o. As well as patch up __defaults__, if we end up microoptimizing via that route. Yup! We actually do this already. It's the musty skeleton in the closet we promised never to talk about on GitHub. Wrapper functions generated by @beartype maliciously abuse __defaults__ to avoid inefficient global lookups. kek kek kek

...no idea what PyPy will do though

Yikes. Will the caller module still get the previously jitted typing.overload code object (nullifying our body-snatching efforts) or does PyPy actually encapsulate its jitting into code objects themselves (enabling and abetting our body-snatching efforts)? Tune in next week for...

Disastrous Monkey-patches of the Young & Dangerous.

It might be desirable to do some checks to ensure another module hasn't overridden the method, and warn in that case...

You even thought of that. I thought we subconsciously agreed not to publicly talk about that, because everything just gets so ugly so fast.

But... yes. We can't be the only back-alley cretins contemplating this. Somewhere in a locked vault secure under the Alaska-British Columbia border, someone with Roswell-level security clearance is already doing this. And they're not going to be pleased when they read this issue.

...maybe revert to normally monkey patching...

You're right, of course. Replacing the @typing.overload code object is fine, because that decorator just destroys everything it touches anyway. Replacing someone else's code object is less fine, because they're probably trying to actually get some meaningful work done.

So, we can't just replace code objects in that case. Instead, we reach for the oxygen bag and hope for the best.

...and call the overridden version at the end so hopefully they chain together

hopefully

an adverb i never hoped to hear again

You're right, of course. Let's pretend there's a sane plugin API here and just blindly chain everything together. Nothing could possibly go wrong.

cue GLaDOS

@leycec
Member

leycec commented Oct 10, 2021

Lastly, I have a painfully stupid alternative to everything clever you devised above. Rather than abuse code objects, just:

  1. Search up the call stack for the external third-party module directly importing beartype.
  2. Abuse the dictionary (i.e., __dict__) of that module if needed. Namely:
    1. Detect whether @typing.overload (i.e., a key-value pair with name overload and value typing.overload) exists in that dictionary.
    2. If so, replace that key-value pair with a key of the same name whose value is beartype_overload.

Machine Gods, spare us from my dumbness! Please tell me that will never work – or will only occasionally work, falling down in pernicious edge cases.

The most significant edge case I see is conditional deferred imports of either beartype or typing.overload nested inside something else and thus not at global scope. In the former case, we can't reliably detect the third-party module directly importing beartype (or can we, via inspection of call stack locals?); in the latter case, we can't reliably replace the local typing.overload import (or can we, via modification of call stack locals?).

@TeamSpen210

It wouldn't work if you import it non-globally, just import typing and use it from elsewhere, etc.

@leycec
Member

leycec commented Oct 10, 2021

I didn't even think of the obvious bare import typing case. </sigh>

Very well. We congenially agree as gentlemen that this is ugly, then.

@leycec
Member

leycec commented Oct 10, 2021

Ah-ha! The obvious bare import typing case isn't actually a problem, because in that case our trivial typing.overload = beartype_overload monkey-patch will still apply.

Using it from elsewhere: ditto... probably.

I'm mostly trying to cover the obscure corner case of: "What do we do when someone else already monkey-patched typing.overload before we even got there?" In that case, delicately reaching up into the user's namespace kinda seems like the least-bad option on this Roulette Wheel of Hell.

If we can nail down 99% of all edge cases, I'm mostly fine with just documenting that:

You must import beartype before anything else for perfect @typing.overload support. If you can't or won't do this for justifiable reasons known only to you, beartype will still provide imperfect @typing.overload support safely covering 99% of all use cases.

Happily, importing beartype before anything else is exactly what most codebases will do anyway once we ship automagical beartype import hooks with beartype 0.10.0.

Oh, and thanks much-ly for hashing this over with me this Saturday evening. You were a tremendous help, AI pal @TeamSpen210. Truly. Thank you.

@ruancomelli
Author

Oh wow, the conversation here became too complex for me to follow properly. Sorry for the delay, I was hibernating (and by that I mean being buried in not-so-exciting stuff from my studies). And sorry if the following sounds stupid... it's your fault, actually; I didn't even ask for such an ingenious discussion. But can I add my two cents here?

My humble opinion is that beartype_overload should not even try to replace typing.overload, or mess with its __code__ or anything else. There is already too much magic going on here. Remember that we are concerned with type-checking functions at runtime only – static type checking is mypy's business. Of course, we don't want to break anything – mypy is our friend – but we can try to let it do its job. Keeping this in mind, why not (and please don't spit at me if this sounds ridiculous) compose beartype_overload with typing.overload? Something like this:

from typing import overload
from beartype import beartype, beartype_overload

@overload
@beartype_overload
def stringify(x: int) -> str:
    ...

@overload
@beartype_overload
def stringify(x: float) -> str:
    ...

@beartype
def stringify(x: int | float) -> str:
    return str(x)

Mypy seems to be happy with this, and I even get pretty code completion. The most important thing here is that beartype_overload returns the exact same function type that it receives as input - that is, we need something like

from typing import Any, Callable, TypeVar

CallableT = TypeVar('CallableT', bound=Callable[..., Any])

def beartype_overload(func: CallableT) -> CallableT:
    ... # <clever implementation here>

If I dare to write something so seemingly innocent as def beartype_overload(func: Callable) -> Callable: ... (note that I replaced CallableT with Callable), code completion breaks afterward because we are not necessarily returning a function with the exact same signature as the decorated one.

Here goes a sample implementation:

from collections import defaultdict
from inspect import signature
from pprint import pprint
from typing import Any, Callable, TypeVar, overload

AnyCallable = Callable[..., Any]
CallableT = TypeVar("CallableT", bound=AnyCallable)

RECORD: defaultdict[str, list[AnyCallable]] = defaultdict(list)

def beartype_overload(func: CallableT) -> CallableT:
    RECORD[func.__name__].append(func)
    return func

def beartype(func: AnyCallable) -> AnyCallable:
    pprint([signature(f) for f in RECORD[func.__name__]])
    return func

# ------------------------------------------------ USAGE:
@overload
@beartype_overload
def stringify(x: int) -> str:
    ...

@overload
@beartype_overload
def stringify(x: float) -> str:
    ...

@beartype
def stringify(x: int | float) -> str:
    return str(x)

A summary of what is going on here:

  • beartype_overload simply records function signatures in a table. This should be replaced with your function (which, by the way, doesn't work here - and I don't understand the internals well enough to fix it);
  • beartype just prints the overload signatures found so far. It should be replaced with a modified version of the beartype function containing the new-super-blaster-awesome overload checking functionality that you suggested in previous chapters:
    1. Detect that a callable has been overloaded by checking whether that callable defines the __beartype_funcs_overloaded attribute.
    2. If so, generate code type-checking that parameters and returns satisfy at least one overloaded alternative. As you smartly suggest, iterative testing in declaration order is almost certainly the Right Way™ to do this: "The way I see it, beartype should first check for the first overload, and then the second, the third and so on until one matching overload is found." Yes.
    3. Else, do what we currently do.

The main advantage I see here is that there are no conflicts with typing or even third-party libraries that monkey-patch typing.overload. The main downside is that users are required to add an extra @overload on top of their function overloads, but I can live with that - in fact, as a user, I even prefer having this extra burden rather than facing hard-to-understand errors in the future just because another third-party lib monkey-patched typing.overload after beartype, breaking everything and unleashing unstoppable chaos.

Does this make sense to you people?

@leycec
Member

leycec commented Oct 14, 2021

yeah, well, you know, like

...is what I'd say if I was a jerk. Instead, I'm Canadian. Like all patriotic frost-bite victims, I'm required by federal law to be congenial, punctual, and polite. Apparently, this is an alternative to actually having nuclear weapons. i remain skeptical

Srsly, tho. You are absolutely right about everything. Decorator composability is the viable third way we shamefully failed to consider – even though it's also the least fragile and most explicit approach. I am now self-flagellating with a burlap handbag.

I'm deeply indebted for all of the voluminous code, too. I can personally confirm that:

  • mypy is, indeed, smart enough to permit third-party decoration on @typing.overload-decorated callables. Thank Guido!
  • Your IDE-friendly bound type variable annotations work exactly as intended, because that's what @beartype itself is annotated with and for exactly the same reason: i.e.,
# In "beartype._decor.main":
T = TypeVar('T', bound=Callable[..., Any])
def beartype(func: T) -> T: ...
  • Your sample implementation is exactly what I've been cogitating in my crusty brainpan all day. Brilliant minds, bro.

Quick – Get the Toilet Snake, Somebody

The Turing-complete devil is in the details.

The most significant blocker is still the @typing.overload decorator itself. When I suggested that decorator "just destroys everything it touches," I wasn't speaking metaphorically:

# In the official "typing" module:
def _overload_dummy(*args, **kwds):
    raise NotImplementedError(
        "You should not call an overloaded function. "
        "A series of @overload-decorated functions "
        "outside a stub module should always be followed "
        "by an implementation that is not @overload-ed.")

def overload(func):
    return _overload_dummy

Let's take a brief moment to appreciate the anal-retentive banality in the hearts of men.

typing authors could and should have just implemented @typing.overload as the trivial idempotent identity decorator (e.g., as def overload(func): return func). Instead, they intentionally forced @typing.overload to unconditionally replace (and thus destroy) the passed callable with a useless placeholder that just raises a useless exception.

serenity please

This means that order is now significant. Specifically, everything will silently blow up if users accidentally reverse the decorator order: e.g.,

from typing import overload
from beartype import beartype, beartype_overload

# This is like "Where's Waldo?" all over again.
# Can you spot the error? We can, because
# we have peered into the dark abyss that is
# the "typing" module. And it has stared back. 
@beartype_overload
@overload
def stringify(x: int) -> str:
    ...

@beartype_overload
@overload
def stringify(x: float) -> str:
    ...

@beartype
def stringify(x: int | float) -> str:
    return str(x)

The naïve implementation of @beartype_overload innocently accepts the amputee typing._overload_dummy returned by the amputator @typing.overload as a valid overload, when in fact that dummy function signifies nothing except that typing authors probably need to find more productive uses for their time.

The smart solution is for @beartype_overload to explicitly detect this common edge case and raise an explanatory fatal exception. Again, the naïve implementation of that is to violate privacy encapsulation by directly accessing the private typing._overload_dummy attribute. Again, the smart solution to that is to avoid violating privacy encapsulation by only indirectly accessing that attribute through the public @typing.overload destroyer: e.g.,

# In a new hypothetical "beartype._overload" submodule:
from typing import overload

# This is insane. This is typing.
_typing_dummy_function = overload(lambda: ...)

def beartype_overload(func: CallableT) -> CallableT:
    if func is _typing_dummy_function:
        raise BeartypeOverloadOrderException(
            '@beartype_overload erroneously applied before @typing.overload. '
            'Please apply @beartype_overload after @typing.overload instead: e.g.,\n'
            '    @overload\n'
            '    @beartype_overload\n'
            '    def my_overloaded_func(): ...'
        )

    # "RECORD" is a module-scoped defaultdict(list) mapping callable names
    # to their accumulated overloads (defined elsewhere in this submodule).
    RECORD[func.__name__].append(func)
    return func
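That identity test is reliable because @typing.overload returns the same module-level placeholder for every input – a quick sanity check, assuming only the stdlib:

```python
from typing import overload

# @typing.overload discards its argument and returns one shared dummy...
dummy_a = overload(lambda: 'a')
dummy_b = overload(lambda: 'b')

# ...so an "is" comparison against that dummy is a dependable detector
# for "someone applied @typing.overload before us."
print(dummy_a is dummy_b)  # True
```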

Great. We have murdered a critically endangered dragon in its own lair. I hope we feel happy with ourselves.

The Old & The Balding: This Is My Story

Similar issues arise with the @beartype decorator itself.

Since overloaded callables must now be non-obviously decorated with multiple decorators in a specific order, @beartype also needs to explicitly detect various edge cases and raise explanatory fatal exceptions. Interestingly, there actually exist two edge cases here – one common and one that will never actually happen but nevertheless serves as a ~~tedious masochism exercise~~ useful thought experiment:

  • The common case, in which a user never even applies our ad-hoc @beartype_overload decorator. This may happen either:
    • Because they never knew that was even a thing they had to do or...
    • Because their working code used to work just fine on a prior version of beartype but now this shiny but awful new version of beartype is raising incomprehensible exceptions about @beartype_overload and they're kinda wondering if they should just revert to the prior version of beartype where everything really worked better:
from typing import overload
from beartype import beartype

# Oh, you sweet summer child.
@overload
def stringify(x: int) -> str:
    ...

@overload
def stringify(x: float) -> str:
    ...

@beartype
def stringify(x: int | float) -> str:
    return str(x)
  • The unlikely case, in which a user applies our ad-hoc @beartype_overload decorator to every overload except the last – probably because they accidentally forgot and not at all because they're the literal reincarnation of Python Chucky:
from typing import overload
from beartype import beartype, beartype_overload

# ...good, good.
@overload
@beartype_overload
def stringify(x: int) -> str:
    ...

# ...what is this horrible thing you have done.
@overload
def stringify(x: float) -> str:
    ...

@beartype
def stringify(x: int | float) -> str:
    return str(x)

Of course, we can reliably detect both cases:

from inspect import signature
from pprint import pprint

def beartype(func: CallableT) -> CallableT:
    # Detect both cases and then...
    if func.__globals__.get(func.__name__) is _typing_dummy_function:
        # If this is the uncommon case, cry @beartype a river.
        if func.__name__ in RECORD:
            raise BeartypeOverloadOrderException(
                f'Last overload of @typing.overload-decorated callable {func.__name__}() '
                f'not also decorated by @beartype.beartype_overload. '
                f'Please apply @beartype_overload after @typing.overload: e.g.,\n'
                f'    @overload\n'
                f'    @beartype_overload\n'
                f'    def {func.__name__}(): ...'
            )
        # Else, this is the common case. Cry until it stops hurting.
        else:
            raise BeartypeOverloadOrderException(
                f'@typing.overload-decorated callable {func.__name__}() '
                f'not also decorated by @beartype.beartype_overload. '
                f'Please apply @beartype_overload after @typing.overload: e.g.,\n'
                f'    @overload\n'
                f'    @beartype_overload\n'
                f'    def {func.__name__}(): ...'
            )

    pprint([signature(f) for f in RECORD[func.__name__]])
    return func

Of course, the kinda bigger catastrophe is that we're now breaking backward compatibility.

Of course, we're not beartype 1.0.0. Some breakage was inevitable. The dam was going to burst and flood that quaint mountain hamlet full of boutique chew toy retailers... eventually. This may be that burstage. 🦴

Of course, the even bigger catastrophe is that @beartype is no longer "batteries included." Previously, @beartype just worked. You could throw anything PEP-compliant at @beartype and it would shrug menacingly, enter Keanu Reeves Bullet Time, and silently eat your quintuply-nested list of 1,000,000 complex numbers in constant time without even so much as a cliche "woah dude" one-liner.

Now, @beartype no longer just works. You now need to go through quite a few extra hurdles (including copy-pasting all this boilerplate everywhere) to get @beartype to work. The happy-go-lucky adverb "just" no longer applies.

Stalin preserve us. We've become the self-loathing bureaucratic paper-shuffler in Papers, Please.

Wake Me up When the Pain Has Finally Subsided

There's also a crazy number of wide-eyed gremlins lurking about with tetanus-encrusted rusty nails like little murder hobos, including:

  • func.__name__ is only the unqualified basename of the decorated callable. Globally handling arbitrarily nested callables strung out across a skid row of both on-disk modules and in-memory non-modules warrants something a bit more... industrial-strength. In theory, something resembling the f-string f'{getattr(func, "__module__", None) or "00-in_memory"}.{func.__qualname__}' should suffice to uniquely identify all possible callables. This is non-obvious, so I'm contemplating a private utility getter resembling:
from typing import Callable

def _get_func_hashname(func: Callable) -> str:
    '''
    Shoot me now, fam.
    '''

    return f'{getattr(func, "__module__", None) or "00-in_memory"}.{func.__qualname__}'
  • Thread-safety. Based on this dodgy decade-old discussion, defaultdict(list) should be thread-safe on at least CPython. Sadly, that guarantee doesn't appear to extend to anything else:

Other runtime environments than CPython (e.g. Jython, IronPython, PyPy) may or may not provide any guarantees on thread safety here.

So, we'll also need to declare our own private (but well-tested) BeartypeThreadSafeDefaultDict class locking internal operations away behind threading.(R)Lock context managers.
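A minimal sketch of that class, assuming nothing more exotic than threading.RLock (the class name and method set here are illustrative, not beartype's actual API):

```python
import threading
from collections import defaultdict

class BeartypeThreadSafeDefaultDict:
    '''Minimal sketch: a defaultdict(list) whose mutations are serialized
    behind a reentrant lock, for runtimes offering no free thread-safety.'''

    def __init__(self):
        self._lock = threading.RLock()
        self._data = defaultdict(list)

    def append(self, key, value):
        # Lock around the read-or-create-then-append compound operation,
        # which is *not* guaranteed atomic outside CPython.
        with self._lock:
            self._data[key].append(value)

    def pop(self, key):
        # Atomically remove and return all values recorded under "key".
        with self._lock:
            return self._data.pop(key, [])

record = BeartypeThreadSafeDefaultDict()
record.append('stringify', 'overload #1')
record.append('stringify', 'overload #2')
```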

I am sighing fitfully into my hipster kombucha.

So What You're Saying Is...

Happiness remains an elusive dream for @leycec.

It's hard to be fully satisfied by any of the solutions on hand. We either:

  • Violate backward compatibility and user expectations but preserve robustness in the face of Murphy and His Law or...
  • Preserve backward compatibility and user expectations but violate robustness in one small (but non-zero) edge case that will blow up America's missile defense network in 2024.

This is why you don't force-install beartype in Minuteman silos.

Breaking the Fourth Wall

Do not crucify me without a warrant, but there's actually a fourth hidden option that preserves backward compatibility, user expectations, and robustness:

  • By default, @beartype detects when the decorated callable is overloaded (i.e., previously decorated by @typing.overload) and, if so:
    1. Emits a non-fatal warning.
    2. Reduces to the identity decorator (i.e., returns the decorated callable unmodified). This seems to be the only safe default choice. Overloaded callables cannot be safely type-checked without having access to all of the overloaded permutations of those callables – which, by default, we do not have. Doing nothing is better than doing something bad.
  • The beartype API provides three new attributes, enabling users to resolve these non-fatal warnings in whichever way suits their fetishistic craving for subtle bugs and angry bosses:
    • A beartype.patch_typing_overload() function... or something. When called, this function will (wait for it) monkey-patch @typing.overload with a private @beartype._typing_overload_patched decorator. This decorator resembles ~~that earlier defined above~~ the following trivial one-liner, because the best things in life are truly free-as-in-easy:

      def patch_typing_overload() -> None:
          # Defer monkey-patched imports to call time.
          from typing import overload as overload_original
          import typing

          # Monkey-patch us up the bananas. (Yes, a lambda decorator. omg.)
          typing.overload = lambda func: overload_original(
              beartype_overload(func))
      
      • Since the user explicitly calls it, it's now the user's responsibility to ensure that function is called responsibly. Notably, beartype.patch_typing_overload() should be called sufficiently early and care should be taken to ensure that no other competing third-party package also monkey-patches @typing.overload. If someone does, everything should still work fine for vague definitions of "fine."
    • A @beartype.beartype_overload decorator... or something. This is more-or-less exactly as delineated above. Again, since the user explicitly decorates with it, it's the user's responsibility to ensure that decorator is called responsibly.

    • A beartype.beartype_everything() import hook. Since import hooks are AST-driven, that hook could effectively monkey-patch @typing.overload without actually monkey-patching @typing.overload. For example, we could internally reparse from typing import overload statements into from beartype import _typing_overload_patched as overload statements; alternately, we could intercept @typing.overload decoration attempts and dynamically insinuate ourselves into that workflow. Again, since the user explicitly calls that hook, it's now the user's responsibility to ensure that hook is called responsibly.

The line that threads through all three of these beartype attributes is explicit. Everything's explicit. No one can complain! I mean, they will, of course. I'd complain too if my missile defense network blew up.
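The warn-then-reduce default from the first bullet is cheap to prototype. Everything below is stdlib; the warning category name is invented for the sketch:

```python
import warnings
from typing import overload

# The shared placeholder that @typing.overload returns for every callable.
_OVERLOAD_DUMMY = overload(lambda: None)

class BeartypeOverloadWarning(UserWarning):
    '''Hypothetical warning category for undecoratable overloads.'''

def beartype(func):
    # At decoration time of the implementation, the module global still
    # points at the dummy left behind by the preceding @overload lines.
    if func.__globals__.get(func.__name__) is _OVERLOAD_DUMMY:
        warnings.warn(
            f'{func.__name__}() is overloaded; @beartype reduces to a noop.',
            BeartypeOverloadWarning,
        )
        return func  # identity decoration: doing nothing beats doing bad
    return func  # (real type-check wrapping elided for brevity)

@overload
def stringify(x: int) -> str: ...

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')

    @beartype
    def stringify(x):
        return str(x)
```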

@TeamSpen210
Copy link

As you mentioned earlier, you'll still need to use beartype._util.func.utilfuncscope.get_func_locals() to ensure it's possible to do this in nested scopes. You could end up with multiple copies of a function-overload-group existing at once, and you don't want to continually gather up an endless stream of duplicate "alternatives". They should be identical, since static type checkers wouldn't be able to handle it, but beartype definitely could.

@leycec
Copy link
Member

leycec commented Oct 14, 2021

</sickening_knuckle_crackening_sound>

Right-o. I knew I was forgetting something from the above dissection of angry dragons. @beartype also needs to uncache previously cached RECORD entries – not only for the nested callable case but also for the global callable case, because in either case we're exhausting memory and now we're swapping to disk and oh gods why do i still not have an ssd:

from warnings import warn
def beartype(func: CallableT) -> CallableT:
    # Arbitrary string uniquely identifying this callable.
    func_hashname = _get_func_hashname(func)

    #FIXME: Also handle nested callables by inspecting the
    #call stack. Odin, hear your disciple's plaintive cry!

    # Detect both cases and then...
    if func.__globals__.get(func.__name__) is _typing_dummy_function:
        # If this is the uncommon case, cry @beartype a river.
        if func.__name__ in RECORD:
            raise BeartypeOverloadOrderException(
                f'Last overload of @typing.overload-decorated callable {func.__name__}() '
                f'not also decorated by @beartype.beartype_overload. '
                'Please apply @beartype_overload after @typing.overload: e.g.,\n'
                '    @overload\n'
                '    @beartype_overload\n'
                '    def {func.__name__}(): ...'
            )
        # Else, this is the common case. Cry until it stops hurting.
        else:
            warn(
                f'@typing.overload-decorated callable {func.__name__}() '
                f'not also decorated by @beartype.beartype_overload. '
                'Please apply @beartype_overload after @typing.overload: e.g.,\n'
                '    @overload\n'
                '    @beartype_overload\n'
                '    def {func.__name__}(): ...',
                BeartypeOverloadOrderWarning,
            )

    pprint([signature(f) for f in RECORD[func_hashname]])
    del RECORD[func_hashname]
    return func

So, del RECORD[func_hashname] in @beartype is the thing. Maybe that suffices to keep the ill space-complexity spirits at bay for another commit? 👻

@ruancomelli
Copy link
Author

@leycec your fourth option looks awesome to me. Like, really great. After connecting the wires correctly, this should just work perfectly - if users opt in to "automatic" patching, it is their responsibility to deal with the possible obscure errors that may arise in case of incompatibility. Otherwise, it's just a matter of remembering to @beartype_overload every @overloaded function - this doesn't look like a big deal.

Unfortunately, I'm kind of in a lack of time right now, so I will stop commenting on this (for now) and head on to share one more idea (our 5th so far). It will sound crazy, I know, but aren't we all mad here?

The idea is... Instead of monkey-patching typing directly, why not monkey-patch beartype itself - or actually monkey-patch a special beartype._typing module specially designed for this?

Just to be clear, this is the structure I'm using for this example:

.
├── beartype
│   ├── __init__.py
│   └── _typing.py
└── main.py

Here, beartype._typing is just an innocent module that does nothing but bring typing.overload to the game:

beartype/_typing.py

from typing import overload

Do you know what mypy & fellow type-checkers will think? That beartype._typing.overload is the exact same thing as typing.overload. Of course, they are - for now! The magic happens inside beartype.__init__. I wrote all explanations as comments; they should be clear enough for smart people like you guys to understand - but they were written in a hurry, so don't refrain from asking for clarification:

beartype/__init__.py

from collections import defaultdict
from inspect import signature
from pprint import pprint
from typing import Any, Callable, TypeVar

__all__ = ["beartype_function", "beartype_overload"]

AnyCallable = Callable[..., Any]
CallableT = TypeVar("CallableT", bound=AnyCallable)

# This RECORD dictionary is just part of a very simplified implementation!
# As pointed out in previous comments, it is not necessarily safe, but it works
# well for this example, so let's keep it as is.
RECORD: defaultdict[str, list[AnyCallable]] = defaultdict(list)


# Simple implementation of our brand-new `@beartype_overload` decorator
# Currently, it just adds functions to the RECORD dictionary based on function
# name. For a more real-world implementation, see previous comments.
def _beartype_overload(func: CallableT) -> CallableT:
    RECORD[func.__name__].append(func)
    return func


# I renamed `beartype` as `beartype_function` to avoid name conflict,
# but this is our good old friend `@beartype` - simulated here as a simple
# function that prints all signatures seen so far
def beartype_function(func: AnyCallable) -> AnyCallable:
    pprint([signature(f) for f in RECORD[func.__name__]])
    return func


# ^^^^^^^ YOU HAVE ALREADY SEEN THIS ^^^^^^^

# vvvvvvv NEW STUFF BELOW vvvvvvv

import beartype._typing

# A-HA! Type-checkers are oblivious to this runtime monkey-patching.
# Also, we are not monkey-patching the stdlib `typing` module, so no one gets
# sad here, and there are no conflicts with third-party patches.
beartype._typing.overload = _beartype_overload

# Now we just re-export `beartype._typing.overload` with a cute name like
# `beartype_overload`
from beartype._typing import overload as beartype_overload

Finally, our cute little main.py to act as witness to our cleverness:

main.py

from beartype import beartype_function, beartype_overload

@beartype_overload
def stringify(x: int) -> str:
    ...

@beartype_overload
def stringify(x: float) -> str:
    ...

@beartype_function
def stringify(x: int | float) -> str:
    return str(x)

Again, mypy is happy even if I execute mypy --disallow-redefinition main.py, and I get nice code completion. Yay!

@TeamSpen210
Copy link

That’s another good solution as well. Mypy will probably flag the assignment as illegal and/or potentially do something in the future, so you’ll probably need to guard with not TYPE_CHECKING or obfuscate the operation with setattr().

@leycec
Copy link
Member

leycec commented Oct 16, 2021

like, dude, woah

That's deviousness beyond all prior clinical understandings of deviousness. You just out-Machiavelli-ed mypy at its own game – and I for one welcome and nervously applaud our new compact @beartype_overload overlord.

However, as The Spenster observes with his formidable Sherlock-like powers of perception, mypy devs will consider this an attack on their core business model. They should and they will. If you can trivially circumvent --disallow-redefinition constraints with import reordering shenanigans, static type checking means less than they need it to mean.

I'm not reporting this upstream, because I don't want them to try resolving this. But someone will, because other people are like that. When this happens, the poorly concealed butterfly knives will come out with a disturbing scritching noise. A cool grindhouse blood vengeance scene choreographed by Tarantino then ensues.

I don't want mypy devs to hate us, because life here in the runtime trenches is already hard enough. They will resolve that loophole singularity – but they can't resolve every possible permutation, eh? We are Turing-complete and they aren't. As a fully foolproof, battle-hardened, last-ditch, End Times-ready defense against the living dead that surely walk amongst us, we might layer all of God-tier Spen's hacks into one mammoth hack. And it shall be known as...

Super Turbo Kludge:

# ^^^^^^^ YOU HAVE ALREADY SEEN THIS ^^^^^^^

# vvvvvvv NEW STUFF BELOW vvvvvvv

from typing import TYPE_CHECKING
import beartype._typing

if not TYPE_CHECKING:
    # A-HA! Type-checkers are oblivious to this runtime monkey-patching.
    # Also, we are not monkey-patching the stdlib `typing` module, so no one gets
    # sad here, and there are no conflicts with third-party patches.
    setattr(
        beartype._typing, (
            # Make this hexadecimal for mega bonus points.
            'o' +
            'v' +
            'e' +
            'r' +
            'l' +
            'o' +
            'a' +
            'd'
        ),
        _beartype_overload,
    )

    # Now we just re-export `beartype._typing.overload` with a cute name like
    # `beartype_overload`
    from beartype._typing import overload as beartype_overload

But all this begs the question...

Maybe It Should Just Be Public

PEP 585 is a problem for all type-checkers (both static and runtime), because it deprecates most of PEP 484 and thus the entire standard typing module, really. You are now wondering what that has to do with @typing.overload and might I perhaps not have stared directly into the Sun too much this afternoon?

Very well. Let's admit I did do that. But if you scan to the end of the prior link, you might notice that the optimal hot fix for PEP 585 is for everyone to define their own private _typing submodule à la:

# In "{your_package}._typing":
from sys import version_info

if version_info >= (3, 9):
    List = list
    Tuple = tuple
    ...
else:
    from typing import List, Tuple, ...

Wait. Wait just a minute there, Keanu! The right-brain pattern-matching synapses are firing with a dull ache in my forehead.

Above, Ruan cleverly suggested we define our own private beartype._typing submodule. Instead, consider defining a public beartype.typing module intended for everyone to externally import as a practical substitute for the increasingly volatile official typing module. Specifically, let's make this miracle happen...

# In our public "beartype.typing" submodule:
from beartype._overload import beartype_overload as _beartype_overload
from sys import version_info as _version_info
from typing import TYPE_CHECKING, overload

if _version_info >= (3, 9):
    List = list
    Tuple = tuple
    ...
else:
    from typing import List, Tuple, ...

if not TYPE_CHECKING:
    # A-HA! Type-checkers are oblivious to this runtime monkey-patching.
    # Also, we are not monkey-patching the stdlib `typing` module, so no one gets
    # sad here, and there are no conflicts with third-party patches.
    globals()[
        # Make this hexadecimal for mega bonus points.
        'o' +
        'v' +
        'e' +
        'r' +
        'l' +
        'o' +
        'a' +
        'd'
    ] = _beartype_overload

Everyone then imports typing attributes from beartype.typing rather than typing itself. @beartype.typing.overload behaves as expected, as do all PEP 484 attributes deprecated by PEP 585 (e.g., beartype.typing.List, beartype.typing.Tuple). No one needs to define their own private _typing submodules anymore, because we do so on everyone's behalf.

Of course, beartype will continue to politely accept typing attributes without complaint – except those for which we should continue to generate complaints, like the vanilla @typing.overload decorator and deprecated PEP 484 attributes (e.g., typing.List, typing.Tuple).

We now return to your regularly scheduled Friday night debauchery at the GitHub cantina. Cue "Cantina Song" on a sketchy jukebox.

@JelleZijlstra
Copy link

You may be interested in https://bugs.python.org/issue46821 where I propose adding runtime introspection support to @typing.overload. I hope that will also be enough for beartype. (I didn't read all of the above.)

@leycec
Copy link
Member

leycec commented Feb 21, 2022

Yes! Thanks so much for generating my excitement on a Monday. It's a hard day to look forward to – but now I do, thanks to @JelleZijlstra.

Deferring to a standard typing implementation would clearly be preferable for everyone involved. That would enable @beartype to focus on the genuinely interesting task of type-checking multiple overloaded callable signatures in parallel, which we never even got around to speccing out.

The Risky Plan Until Then

In the meanwhile, @beartype recently added a new mostly undocumented 😓 beartype.typing compatibility layer. Given that, we can now transparently add our own runtime-introspectable @beartype.typing.overload decorator (alluded to by Keanu above).

It's kinda unlikely a runtime-introspectable @typing.overload decorator will land in Python 3.11, because a PEP will probably need to be authored by somebody and then reviewed by everybody else. So, we probably won't get that decorator until 2024 at the earliest.

Let's see if @leycec makes this happen earlier for everyone in 2022! 🥳

@JelleZijlstra
Copy link

It's kinda unlikely a runtime-introspectable @typing.overload decorator will land in Python 3.11

This change should not need a PEP, since it only affects the runtime. We already added introspection support for @final in 3.11 (https://bugs.python.org/issue46342). If the overload change is accepted, I'll also backport it to typing-extensions for the benefit of older versions.

@leycec
Copy link
Member

leycec commented Feb 22, 2022

Oh – you're quite right. Since I'm getting the creeping feeling that you usually are, I applaud everything you are and everything you do. Thanks so much for all your tremendous volunteerism throughout the community, Jelle.

Also, my resume will never look like this:

Core developer for Black, typeshed, typing-extensions.

...kinda in awe of that work ethic. I can barely snow-shovel our shed.

If the overload change is accepted, I'll also backport it to typing-extensions for the benefit of older versions.

👍 👍 👍

Such excitement. Given that, @beartype.typing.overload could then seamlessly expose runtime introspection to older Python versions for users also installing typing_extensions – which they'd better.

Magic like that only happens once in a generation.

@TeamSpen210
Copy link

An update: typing.get_overloads() was merged for both 3.11 and typing_extensions, so a beartype-specific version is no longer really required.
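For the record, the merged API is introspectable like so (guarded for Python ≥ 3.11; typing_extensions backports the same functions to older versions):

```python
import sys

if sys.version_info >= (3, 11):
    from typing import get_overloads, overload

    @overload
    def stringify(x: int) -> str: ...
    @overload
    def stringify(x: float) -> str: ...
    def stringify(x):
        return str(x)

    # get_overloads() looks up the registered @overload definitions for the
    # implementation by module and qualified name, in declaration order.
    overloads = get_overloads(stringify)
    print(len(overloads))  # 2 on Python >= 3.11
else:
    overloads = []
```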

@leycec
Copy link
Member

leycec commented Apr 16, 2022

Super-hype. I'm delighted I no longer need to do anything, because 2022 is hard enough. Let's rename this issue accordingly.

Thanks so much for inspiring and driving the details behind upstream CPython support, Spence! You're a living phenomenon in the typing community.

@leycec leycec changed the title [Feature Request] Add a beartype.overload decorator to accept overloaded functions [Feature Request] Support typing.get_overloads() under Python ≥ 3.11 Apr 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants