Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request] Transitive type-checking suppression ala @typeguard.suppress_type_checks #336

Open
AlanCoding opened this issue Mar 4, 2024 · 3 comments

Comments

@AlanCoding
Copy link

My objective is to add runtime type-checking to a library in ansible/django-ansible-base#191

https://beartype.readthedocs.io/en/latest/faq/#how-do-i-not-type-check-something

To spell this out, I have a test test_hash_serializer_data_difference, and this test calls a method hash_serializer_data. The test passes DATA to this method, which is a dict when an object of some other class was expected.

@nobeartype
def test_hash_serializer_data_difference():
    """Test hashing different data changes the hash"""
    assert hash_serializer_data(DATA, DataSerializer) != hash_serializer_data({**DATA, **{"id": 4567}}, DataSerializer)

That still fails in the call to hash_serializer_data

I didn't write this test. I honestly don't even want to think about what this test does. What is important is that:

  • I am sure this method hash_serializer_data will call another method which also has type checking
  • the test is "wrong" passing DATA as a dict, but presumably has some useful means that it is pursuing
  • I'm left with a solid 9 cases like this, where the test calls are wrong. Even if I might be able to fix this test, the others are very likely unfixable tests.

The docs section I link says:

This the common use case for temporarily blacklisting a callable or class.

This seems consistent with the fact that I did @nobeartype and it still failed, because the test method was allowed to take incorrectly typed args, but the methods the test called are not allowed to do this. I also used the other @no_type_check and got same result.

Request: don't check anything under a context manager

I want to turn off all type checking for an entire test. I still want to run the tests, because presumably, there is a reason someone added them. Turning off type checking for these 9 tests would allow me to turn type checking on for the other 700-some tests in the library.

I think typeguard works for this, see ansible/django-ansible-base#177, @suppress_type_checks seems to ignore type checks globally for the duration of the test.

@leycec
Copy link
Member

leycec commented Mar 5, 2024

...wow. This is extraordinarily fascinating. The takeaway is that the real world is brutal, huh? You're just trying to death march through a weekend's worth of open-source crunch without blowing an O-ring. Meanwhile, @beartype is oh-snapping in the air and shooting you down with sassy snark like:

"Oh, no you didn'!"

What I'm trying to say here is that you want to do is understandable – yet non-trivial. Lying to static type-checkers like mypy and pyright is a given; they hallucinate about everything, so users need some means of lying to them. Lying to runtime type-checkers like @beartype is another matter altogether; they mostly don't hallucinate, so users ideally should avoid trying to lie to them. Ideally.

That said... this is the real world. Idealism doesn't matter much when 9 friggin' unit tests lay in a shattered heap of rubber tires that is currently on fire.

Let's start from the beginning.

My objective is to add runtime type-checking to a library in ansible/django-ansible-base#191

🥳

I didn't write this test. I honestly don't even want to think about what this test does.

We've all been there. I too know this gut-wrenching pain. I didn't even get a T-shirt for my agony.

the test is "wrong" passing DATA as a dict, but presumably has some useful means that it is pursuing

...is it, though? "Wrong" is definitely being air-quoted here. The test previously passed. Presumably, the test knew what it was doing back then. If the test otherwise still passes in the absence of runtime type-checking, surely the test still knows what it is doing?

Alternatives to suppressing @beartype at test-time include:

  1. Lightly refactoring the original signature of hash_serializer_data() to unconditionally accept dictionaries as well: e.g.,
from beartype.typing import Dict, Union

def hash_serializer_data(data_arg: Union[SomeClass, Dict[str, object]], ...) -> ...: ...
  1. Lightly refactoring the original signature of hash_serializer_data() to conditionally accept dictionaries only when testing with pytest: e.g.,
from beartype.typing import Dict, Union
from sys import modules

# If tests are currently running, allow tests to pass dictionaries below.
# Else, prohibit dictionaries from being passed below.
DataArgHint = (
    Union[SomeClass, Dict[str, object]] if 'pytest' in modules else SomeClass)

def hash_serializer_data(data_arg: DataArgHint, ...) -> ...: ...
  1. Lightly refactoring hash_serializer_data() to conditionally disable type-checking only when testing with pytest: e.g.,
from beartype import beartype, BeartypeConf, BeartypeStrategy
from beartype.typing import Dict, Union
from sys import modules

# If tests are currently running, disable type-checking; else, don't.
maybebeartype = beartype(conf=BeartypeConf(strategy=(
    BeartypeStrategy.O0 if 'pytest' in modules else BeartypeStrategy.O1)))

@maybebeartype
def hash_serializer_data(data_arg: Union[SomeClass, Dict[str, object]], ...) -> ...: ...

Honestly, I'd go with the second option. You're still runtime type-checking, which is good. You're no longer trying to lie to the runtime type-checker, which is good. You're explicitly declaring exactly what you're doing to the next New Guy™ five cubicles down, which is good.

But let's say you hate all of those things. You will bend @beartype to your iron will! 💪

This seems consistent with the fact that I did @nobeartype and it still failed, because the test method was allowed to take incorrectly typed args, but the methods the test called are not allowed to do this. I also used the other @no_type_check and got same result.

Exactly. You've got it exactly. Decorators like @nobeartype and @no_type_check only apply to the exact class or callable being decorated. They don't transitively apply to all other classes or callables called by the class or callable being decorated. Emoji cat cries. 😿

You're actually the first user to request this. You're definitely not wrong to want a "temporarily halt all runtime type-checking for the duration of this context manager, unit test, or whatevahs" button. This is a fascinating (and thus valid) use case. But... @beartype probably can't give you what you want. You're in uncharted and shark-infested waters. I will now wade in with some bloody chum.

The issue, really, is that unit tests aren't isolated. Unit tests all run in the same Python interpreter. It's likely that one or more previously run unit tests in your test suite have already imported the hash_serializer_data() method called by your test_hash_serializer_data_difference() test. If that's not the case now, that could certainly be the case in the future. But the point I'm failing to make is that hash_serializer_data() has already been defined and then presumably redefined by @beartype to perform runtime type-checking.

That's already happened. Now, you want that to unhappen. That's hard. In other words, what you really want is your test_hash_serializer_data_difference() test to:

  1. Dynamically unload (i.e., un-import) the module defining hash_serializer_data(). This is hard. Officially, Python doesn't support module unloading. Unofficially, various frameworks (especially web frameworks like Django) have implemented various ad-hoc "hot module reloading" functionality. Sometimes it works. Sometimes it doesn't. But unloading modules from pytest unit tests in a reliable, robust, and portable manner is basically infeasible. It is a hard conundrum.
  2. Temporarily disable @beartype in your test_hash_serializer_data_difference() test. In theory, this is easy. Just call beartype_this_package(conf=BeartypeConf(strategy=O0)) or whatevahs, right?
  3. Re-import the module defining hash_serializer_data() with @beartype disabled. Again, this is easy.

It's the module unloading part that's hard. Icky stuff, honestly. You might temporarily get that to sorta kinda maybe work under various edge cases – but it'll definitely explode when you least expect it at 3:43AM on Friday evening. That's not where you want to be in life.

I think typeguard works for this, see ansible/django-ansible-base#177, @suppress_type_checks seems to ignore type checks globally for the duration of the test.

...heh. I have been educated. You're amazing – and so is typeguard, frankly. I'd actually never heard of @typeguard.suppress_type_checks before. That's a mesmerising API he's conjured up there.

Technically, @beartype could implement a similar API. Pragmatically, @beartype can't. Why? Efficiency. @beartype soft-guarantees pseudo-realtime behaviour with no measurable overhead – or overhead less than ~1µs (i.e., one microsecond, one millionth of a second). For all intents and purposes, ~1µs means "no measureable overhead." That's an extreme but extremely valuable constraint.

typeguard is the exact opposite. typeguard is comparatively much less constrained. typeguard isn't quite as obsessed with efficiency as @beartype, which gives it a lot more leeway to get away with this sort of thing.

Unlike typeguard, @beartype dynamically generates optimally efficient type-checking code at decoration time. Implementing a comparable @beartype.suppress_type_checks API would require that @beartype wrap all generated type-checks in some sort of if conditional detecting whether that type-check should be short-circuited by suppression from a parent test. But 99.99% of users would never short-circuit or enable test suppression. So, that if conditional would just needlessly slow down everyone else in the world for an edge case that nobody ever asked for before.

Still... it's an intriguing idea. If test suppression could (hypothetically speaking) be implemented without impacting non-test performance, that would be a clear win for everyone and something worth doing.

Let me consider this as I stroke my beard stubble. Still, it's probably best to just repair the test and/or type hint to accurately reflect what's actually going on. 🤔

@leycec leycec changed the title Feature question: How do I NOT type check _anything_ for duration of a function [Feature Request] Transitive type-checking suppression ala @typeguard.suppress_type_checks Mar 5, 2024
@TeamSpen210
Copy link

There is another place beartype could put the check - in get_hint_object_violation() (or just before calling it). Then regular usage would be unaffected, but illegal calls permitted by this suppression would have the penalty of failing the checks, then continuing. This potential API should probably be rarely used, since it's kinda bad practice to be violating your type hints. So it's probably fine to pay that small penalty.

Alternatively, another approach would be to change the way this feature would function. Instead of suppressing all errors, it'd require you to pass in the specific callable you want to suppress. Beartype would use its internal knowledge to generate a new function which just skips all the type checking. It could then swap out the __code__ value of the function for the duration of this context manager. It might even be possible to share this no-op code object, if the number/ order of closure cells is always the same.

@leycec
Copy link
Member

leycec commented Mar 6, 2024

Galaxy Brain @TeamSpen210 for the win. @TeamSpen210, where have you been all my coding life!?!?

I no longer need to think or cogitate or ruminate or ponder the Big Questions™. Instead, I summon @TeamSpen210 into every discussion I'm failing at... somehow. That's the hard part. Summoning @TeamSpen210 is a non-deterministic process. I don't how it happens. The magic, although miraculous, is unreliable at best. Summoning questions include:

All these questions will remain unanswered. I beg and I plead. Yet, all I get is @TeamSpen210 spontaneously erupting from the thin membrane that separates worlds and solving everything.

Back to this question before I black out from the excitement.

There is another place beartype could put the check - in get_hint_object_violation() (or just before calling it). Then regular usage would be unaffected, but illegal calls permitted by this suppression would have the penalty of failing the checks, then continuing.

That's... ingenious. How did you think of that? How did you even know what get_hint_object_violation() is? Who are you, really!? I'm pretty sure you're GitHub Batman at this point. Please don't hunt me down. Your identity is safe with me, unless I just publicly disclosed it live on GitHub.

We're agreed this is egregiously clever – but it's also not quite that simple. Okay, it's pretty simple – but I'd better document exactly what I need to do here for my future self when he gets back to this in twelve years. Consider this stupendous example:

from beartype import beartype, BeartypeConf

@beartype(conf=BeartypeConf(is_debug=True))
def i_heart_cats(cat_scratch_fever: str) -> int:
    return cat_scratch_fever

Given that, @beartype currently generates type-checking code that looks something like this: WARNING: this is super-boring.

(line 0001) def i_heart_cats(
(line 0002)     *args,
(line 0003)     __beartype_get_violation=__beartype_get_violation, # is <function get_func_pith_violation at 0x7fbc9b543d80>
(line 0004)     __beartype_conf=__beartype_conf, # is "BeartypeConf(is_debug=True)"
(line 0005)     __beartype_func=__beartype_func, # is <function i_heart_cats at 0x7fbc9b954900>
(line 0006)     **kwargs
(line 0007) ):
(line 0008)     # Localize the number of passed positional arguments for efficiency.
(line 0009)     __beartype_args_len = len(args)
(line 0010)     # Localize this positional or keyword parameter if passed *OR* to the
(line 0011)     # sentinel "__beartype_raise_exception" guaranteed to never be passed.
(line 0012)     __beartype_pith_0 = (
(line 0013)         args[0] if __beartype_args_len > 0 else
(line 0014)         kwargs.get('cat_scratch_fever', __beartype_get_violation)
(line 0015)     )
(line 0016) 
(line 0017)     # If this parameter was passed...
(line 0018)     if __beartype_pith_0 is not __beartype_get_violation:
(line 0019)         # Type-check this parameter or return against this type hint.
(line 0020)         if not isinstance(__beartype_pith_0, str):
(line 0021)             __beartype_violation = __beartype_get_violation(
(line 0022)                 func=__beartype_func,
(line 0023)                 conf=__beartype_conf,
(line 0024)                 pith_name='cat_scratch_fever',
(line 0025)                 pith_value=__beartype_pith_0,
(line 0026)             )
(line 0027) 
(line 0028)             raise __beartype_violation
(line 0029)     # Call this function with all passed parameters and localize the value
(line 0030)     # returned from this call.
(line 0031)     __beartype_pith_0 = __beartype_func(*args, **kwargs)
(line 0032) 
(line 0033)     # Noop required to artificially increase indentation level. Note that
(line 0034)     # CPython implicitly optimizes this conditional away. Isn't that nice?
(line 0035)     if True:
(line 0036)         # Type-check this parameter or return against this type hint.
(line 0037)         if not isinstance(__beartype_pith_0, int):
(line 0038)             __beartype_violation = __beartype_get_violation(
(line 0039)                 func=__beartype_func,
(line 0040)                 conf=__beartype_conf,
(line 0041)                 pith_name='return',
(line 0042)                 pith_value=__beartype_pith_0,
(line 0043)             )
(line 0044) 
(line 0045)             raise __beartype_violation
(line 0046)     return __beartype_pith_0

The point to my future self is that the get_hint_object_violation() getter underlying the __beartype_get_violation() calls above unconditionally returns an exception. We have two choices here:

  1. As @TeamSpen210 suggests, we could just avoid calling __beartype_get_violation() entirely when test suppression is enabled. This approach produces elegant code and is definitely what I'd do if this code wasn't dynamically generated. But... this code is dynamically generated. Detecting test suppression from within this code would require unconditionally passing a new "hidden" (i.e., optional unpassed) __beartype_is_suppressed parameter whose value is a is_beartype_suppressed() tester function defined elsewhere. That's kinda awkward. I'm unclear at the moment whether adding hidden parameters to a function marginally increases the time cost of calling that function or space cost of storing that function. I could profile that, but... I'm lazy. It's probably fine. Let's do something else, though. Just 'cause.
  2. As @TeamSpen210 suggests, we could internally refactor get_hint_object_violation() to detect test suppression and respond "accordingly," where "accordingly" means "do something profoundly insane that doesn't make any sense." I like this approach. No new hidden parameters. Only insane and undocumentable behaviour. For example, I could lightly refactor get_hint_object_violation() to return None when test suppression is enabled. This is the way to spaghetti code. Therefore, I will do this.

Since 2. is the way, @beartype will need to generate slightly expanded type-checking code that looks something like this:

(line 0001) def i_heart_cats(
(line 0002)     *args,
(line 0003)     __beartype_get_violation=__beartype_get_violation, # is <function get_func_pith_violation at 0x7fbc9b543d80>
(line 0004)     __beartype_conf=__beartype_conf, # is "BeartypeConf(is_debug=True)"
(line 0005)     __beartype_func=__beartype_func, # is <function i_heart_cats at 0x7fbc9b954900>
(line 0006)     **kwargs
(line 0007) ):
(line 0008)     # Localize the number of passed positional arguments for efficiency.
(line 0009)     __beartype_args_len = len(args)
(line 0010)     # Localize this positional or keyword parameter if passed *OR* to the
(line 0011)     # sentinel "__beartype_raise_exception" guaranteed to never be passed.
(line 0012)     __beartype_pith_0 = (
(line 0013)         args[0] if __beartype_args_len > 0 else
(line 0014)         kwargs.get('cat_scratch_fever', __beartype_get_violation)
(line 0015)     )
(line 0016) 
(line 0017)     # If this parameter was passed...
(line 0018)     if __beartype_pith_0 is not __beartype_get_violation:
(line 0019)         # Type-check this parameter or return against this type hint.
(line 0020)         if not isinstance(__beartype_pith_0, str):
(line 0021)             __beartype_violation = __beartype_get_violation(
(line 0022)                 func=__beartype_func,
(line 0023)                 conf=__beartype_conf,
(line 0024)                 pith_name='cat_scratch_fever',
(line 0025)                 pith_value=__beartype_pith_0,
(line 0026)             )
(line 0027) 
(line 0028)             # If test suppression is *NOT* currently enabled, raise an exception.  # <-- NEW MADNESS
(line 0029)             if __beartype_violation is not None:
(line 0030)                 raise __beartype_violation
(line 0031)             # Else, test suppression is enabled. In this case, silently ignore this violation.
...

Trivial. No efficiency concerns. Only one new line of generated code. Mild refactoring of get_hint_object_violation(). Sure, there's a bit of steely-eyed madness here. But I for one welcome undocumentable, undebuggable, and unmaintainable refactorings that increase technical debt beyond the event horizon of volunteerism.

Thus, I praise @TeamSpen210. 👏 🌝

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants