Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request] Generalize beartype.claw.beartype_this_package() to transparently support unpackaged scripts as an internal fallback #320

Open
komodovaran opened this issue Feb 1, 2024 · 6 comments

Comments

@komodovaran
Copy link

komodovaran commented Feb 1, 2024

My setup is as follows:

- /
  |-- src/
  |   |-- __init__.py
  |   |-- script.py
  |   |-- lib.py
  |
  |-- test/
      |-- test_script.py

And I run my script (using PyCharm, which does not really like -m it seems) the script-y way as

python src/script.py

Additionally, my sys.path contains the full path to / and /src so I can go wild with imports. Whether this is kosher or not is beside the point. The thing is that it works. I can import lib into script, and I can import both script and lib into test_script without breaking any qualified names anywhere.

I tried to add the beartype magical hooks everywhere to see if it would make an aging codebase compatible out of the box. In the init. In the file. I also tried with both src.script, src. Yet none of it works.

...Except if I run the script like python -m script from a separate terminal. This is certainly doable but it's enough friction that a bunch of developers with already-configured hotkeys in their editor would just... not use beartype. We're talking decades of accumulated muscle memory here.

So what can I do? Is this a feature request that hasn't been raised before or am I trying to do the impossible?

@leycec
Copy link
Member

leycec commented Feb 2, 2024

...heh. Painting outside the lines, huh? At this point, most Python projects would probably start screeching that "you're doing it wrong," your team is morally bankrupt, and everyone should feel bad.

Thankfully, we disagree. This is @beartype – the Fightin' Man's Type-checker. We strongly support your rebelliousness here. I myself live in a claptrap shed in the Canadian wilderness. I voluntarily run Gentoo Linux. It don't get more fightin' man. Of course, it's not simply fightin' man ethics; you're also grappling with crippling technical debt and exploding maintenance burden. You can't refactor, because PyCharm hates you, your coworkers grimace every time the new guy commits well-intended but ultimately broken improvements, and like not being homeless.

Thankfully, I've also tested @beartype against your real-world use case. Congratulations to all of us! @beartype behaves as expected and fully supports what you want to do. We now breath a feels-good sigh of relief. 😌

You are now thinking:

"...the hell you say? I tried literally everything. I threw the leaky kitchen sink at friggin' @beartype. This is why I can confidently say that @beartype sucks. It's broken. Also, I hate it."

Let @beartype Massage Those Worries Away

So, I replicated your ad-hoc-yet-awesome project structure with random gibberish of my own devising, like so:

# In "src/script.py":
def _hack_sys_path() -> None:
    '''
    Explicitly register all files and subdirectories of the root directory
    containing this top-level ``setup.py`` script to be importable modules and
    packages (respectively) for the remainder of this Python process if this
    directory has yet to be registered.
    '''

    # Avert thy eyes, purist Pythonistas!
    import os, sys

    # Absolute dirname of this directory inspired by this StackOverflow answer:
    #     https://stackoverflow.com/a/8663557/2809027
    setup_dirname = os.path.join(
        os.path.dirname(os.path.realpath(__file__)), '..')

    # If the current PYTHONPATH does *NOT* already contain this directory...
    if setup_dirname not in sys.path:
        # Print this registration.
        print(f'Registering project "{setup_dirname}" for importation...')

        # Append this directory to the current PYTHONPATH.
        sys.path.append(setup_dirname)

# Kludge us up the bomb.
_hack_sys_path()

# Subject this package to beartype-based runtime-static type-checking.
from beartype.claw import beartype_package
beartype_package('src')

# Import type-checked nonsense. Then call type-checked nonsense.
from src.lib import broke_af
print(broke_af())

And then:

# In "src/lib.py":
def broke_af() -> int:
    return "Broke af func ain't so dope af."

That's it. Our src/script.py (A) adds the src package to sys.path, (B) type-checks the src package with @beartype, and (C) imports and calls a type-checked function defined in your src.lib submodule. The type-checking violation from @beartype speaks for itself:

$ python src/script.py  # <-- do it, @beartype. do it for johnny.
Registering project "/home/leycec/tmp/pogo/src/.." for importation...
Traceback (most recent call last):
  File "/home/leycec/tmp/pogo/src/script.py", line 45, in <module>
    print(broke_af())
          ^^^^^^^^^^
  File "<@beartype(src.lib.broke_af) at 0x7fd27c6589a0>", line 24, in broke_af
beartype.roar.BeartypeCallHintReturnViolation: Function src.lib.broke_af() return
"Broke af func ain't so dope af." violates type hint <class 'int'>, as str
"Broke af func ain't so dope af." not instance of int.

We see that our sys.path hack succeeded and that @beartype successfully type-checked your src.lib.broke_af() function by raising a violation when that function returns an unexpected string rather than an expected integer.

No One Asked for This Monologue

The core takeaways here are:

  • Deferred type-checking. Python import hooks (like beartype.claw.beartype_package()) only apply to subsequently imported modules (like src.lib). They do not apply to the current module (like src/script.py). That's not @beartype's fault; that's just how Python import hooks work, I'm afraid. This means that you'll probably want to refactor most of the contents of your src/script.py script out into a new private submodule (like src.lib or perhaps even a new src.__main__ submodule or whatevah) that you then subsequently import from src/script.py. Your src/script.py script is really masquerading as a src.__init__ submodule. Which brings us to...
  • Your src.__init__ submodule is mostly useless. This is sad. The whole raison d'être for your src.__init__ submodule is that Python implicitly imports src.__init__ on the first importation of anything from your src package. But you've chosen (wisely, of course) to circumvent Python package standards. So, Python isn't doing that for you. Instead, you have to do all of the stuff you'd normally do in src.__init__ in your src/script.py script. This includes installing @beartype import hooks. Since src.__init__ is only generating confusion here, a reasonable decision would just be to move everything from src.__init__ into src/script.py; lastly, add a banner comment to src.__init__ that that submodule is intentionally empty and everybody should see src/script.py instead. Sadness continues.
  • beartype_this_package() is also useless. Because you call src/script.py as a script rather than as a module residing in a package, there is no "this package." Ergo, beartype_this_package() fails with an unreadable exception beartype.roar._BeartypeUtilCallFrameException: Callable "__main__.<module>()" package not found (i.e., top-level module or script "__main__" resides outside any parent package). Instead, you have to call beartype_package() with the explicit package name 'src'.

Painting outside the lines is, indeed, fraught with pain. Personally, I'd:

  1. Shift the guts of src/script.py into a new src.__main__ submodule.
  2. Gradually encourage everyone to transition to running your script as python -m src.

That's probably the optimal long-term solution, honestly. Why? It preserves backward compatibility in the short term (good) while also preserving standard Python package semantics in the long term (gooder) and avoiding bothersome issues like this for your team forevermore (goodest).

Thanks for All the Fishy Code

This isn't so much a feature request or outstanding issue as it is an AMA on the intersection of Python packaging, scripting, and import hooks. That's not necessarily a bad thing, of course. This is a fascinating conundrum. But it does kinda lie outside the purview of community standards and norms. Most Pythonistas would regard this sort of thing as a code smell, code debt, and devtool antipattern.

Thanks for the intriguing puzzle, @komodovaran. Let me know if you've got any other questions, concerns, or compulsions to break @beartype over your knee. Until then, let's quietly close this to prevent anyone else from getting any similarly bright ideas. 😉

@leycec leycec closed this as completed Feb 2, 2024
@leycec
Copy link
Member

leycec commented Feb 20, 2024

You were totally right, @komodovaran! @beartype was communicating poorly with everybody about this. I hang my head and sigh publicly. 😮‍💨

More importantly, I did something about this. Exception messages raised by the beartype.claw.beartype_this_package() import hook now acknowledge your super-valid use case. They even offer helpful advice on how to rectify this. Behold: an intimidating mountain of words.

$ python3.12 src/script.py
beartype.roar.BeartypeClawHookUnpackagedException: Top-level script
"/home/leycec/tmp/pogo/src/script.py" resides outside package structure.
Consider calling another "beartype.claw" import hook. However, note that only
other modules will be type-checked. "/home/leycec/tmp/pogo/src/script.py" itself
will remain unchecked. All business logic should reside in submodules
subsequently imported by "/home/leycec/tmp/pogo/src/script.py": e.g.,
    # Instead of this at the top of "/home/leycec/tmp/pogo/src/script.py"...
    from beartype.claw import beartype_this_package  # <-- you are here
    beartype_this_package()                          # <-- feels bad

    # ...pass the basename of the "src/" subdirectory explicitly.
    from beartype.claw import beartype_package  # <-- you want to be here
    beartype_package("src")  # <-- feels good

    from src.main_submodule import main_func  # <-- still feels good
    main_func()                   # <-- *GOOD*! "beartype.claw" type-checks this
    some_global: str = 0xFEEDFACE  # <-- *BAD*! "beartype.claw" ignores this
This has been a message from your friendly neighbourhood bear.

You've probably already resolved this on your end – or just abandoned @beartype altogether. Still, @beartype now tries a bit harder to help. Thanks so much for getting us started on this. 😅

leycec added a commit that referenced this issue Feb 20, 2024
This commit significantly improves exceptions raised by the
`beartype.claw.beartype_this_package()` import hook for common use
cases, resolving both issue #320 kindly submitted by Copenhagen
superstar @komodovaran (Johannes Thomsen) and forum thread #330 kindly
submitted by long-time `typing` fiend @JWCS (Jude). Specifically,
`beartype_this_package()` now raises exception messages resembling
the following when called directly from:

* Top-level scripts:

  ```python
  beartype.roar.BeartypeClawHookUnpackagedException: Top-level script
  "/home/leycec/tmp/src/script.py" resides outside package structure.
  Consider calling another "beartype.claw" import hook. However, note that only
  other modules will be type-checked. "/home/leycec/tmp/src/script.py" itself
  will remain unchecked. All business logic should reside in submodules
  subsequently imported by "/home/leycec/tmp/src/script.py": e.g.,
      # Instead of this at the top of "/home/leycec/tmp/src/script.py"...
      from beartype.claw import beartype_this_package  # <-- you are here
      beartype_this_package()                          # <-- feels bad

      # ...pass the basename of the "src/" subdirectory explicitly.
      from beartype.claw import beartype_package  # <-- you want to be here
      beartype_package("src")  # <-- feels good

      from src.main_submodule import main_func  # <-- still feels good
      main_func()                   # <-- *GOOD*! "beartype.claw" type-checks this
      some_global: str = 0xFEEDFACE  # <-- *BAD*! "beartype.claw" ignores this
  This has been a message from your friendly neighbourhood bear.
  ```

* Top-level modules:

  ```python
  Top-level module "main.py" resides outside package structure but was
  *NOT* directly run as a script. "beartype.claw" import hooks require
  that modules either reside inside a package structure or be directly
  run as scripts. Since neither applies here, you are now off the deep
  end. @beartype no longer has any idea what is going on, sadly.
  Consider directly decorating classes and functions by the
  @beartype.beartype decorator instead: e.g.,
      # Instead of this at the top of "main"...
      from beartype.claw import beartype_this_package  # <-- you are here
      beartype_this_package()                          # <-- feels bad
  \n'
      # ...go old-school like it's 2017 and you just don't care.
      from beartype import beartype  # <-- you want to be here
      @beartype  # <-- feels good, yet kinda icky at same time
      def spicy_func() -> str: ...  # <-- *GOOD*! @beartype type-checks this
      some_global: str = 0xFEEDFACE  # <-- *BAD*! @beartype ignores this, but what can you do
  For your safety, @beartype will now crash and burn.
  ```

Unrelatedly, this commit also revises our ReadTheDocs (RTD)-hosted FAQ
entry on `pytest-beartype` to document configuration via standard
`pyproject.toml` files, resolving issue #327 kindly submitted by
spaghetti-loving Bay Area pasta guru @jamesbraza (James Braza).

(*Watery warts on a tart rotisserie!*)
@JWCS
Copy link

JWCS commented Mar 15, 2024

@leycec @komodovaran
Well, this Minnesotan winter was much too short, and we have an excess supply of anti-brainmelt.
Behold!

  • Note the immediate usage I tested this on was a dockerized application, such that all sources were in directories like
    • /app/lib1/__init__.py, /app/scripts/__init__.py
    • And with ENV PYTHONPATH=/app, no individual path mangling is needed

Convert yonder given executable /app/scripts/ex.py from:

#!/usr/bin/env python
from lib1 import lib1_fn
from scripts import helper
some_global: str = 0xFEEDFACE
def main():
    print("Where was the beartype!?")
if __name__ == '__main__':
    import sys
    main(*sys.argv)
    # Note, this is implicit: sys.exit(0)

To the incredibly scandalous ordering of:

#!/usr/bin/env python
if __name__ == '__main__':
    # This self-import is req to trigger ../__init__.py, and beartype!
    from scripts import ex as this_module
    import sys
    this_module.main(*sys.argv)
    sys.exit(0) # Implicit in the other case
from lib1 import lib1_fn
from scripts import helper
some_global: str = 0xFEEDFACE
def main():
    print("Note, this shouldn't run, bc of the beartype exception on some_global")

Oh, did I mention the __init__.py is the same as what's on the beartype tin?

#!/usr/bin/env python
# scripts/__init__.py
from beartype.claw import beartype_this_package
beartype_this_package()

Because I saw the above solution... and nah fam

@leycec
Copy link
Member

leycec commented Mar 16, 2024

WOAH. You have just entered the event horizon of typing, @JWCS. Beyond this point of no return, the new guy at work has inserted his white-knuckled fist into his mouth and is now contemplating living under a bridge. Congrats.

Seriously, though. How did you even think of that "special" darkness? If self-importing the script you're currently running isn't the Python Highway to Hell™, I don't know what is. And I know what is. This is like the @beartype codebase all over again.

Because I saw the above solution... and nah fam

😆 😄 🥲 😢 😭

I'm tempted to officially document this somewhere. Then I remember that I would be personally responsible for destroying all Python scripting. I slowly back away from the edge. Gibbering whispers can be heard from the yawning abyss below. ...hurk gurgle...

@JWCS
Copy link

JWCS commented Mar 16, 2024

It was pretty easy:

  • Importing works great
  • I could import into another main file... but thats an extra file
  • You can check if you're imported or running by name==main
  • We're already doing that
  • Drop a solution, run for the lakes, hope the coworkers never catch on, get paid to do so

For my projects, the ratio of scripts to helper files is 50/50, so I'll probably be rolling this out everywhere 😬
Thankfully the boon of actually having beartype check half of what I'm doing outweighs the small dissonance of style.

But, you speak of event horizons? Truly eldritch implementations? DID HE ASK FOR MORE
There's the potential feasibility that this functionality could be implemented in a
from beartype.claw import beartype_this_executable
beartype_this_executable(__file__, __name__)
With a rough workings like: (stolen from importlib example)

# But, we could further do away with the args by invoking `inspect`
def beartype_this_executable(file_path, module_name):
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    module = importlib.util.module_from_spec(spec)
    module.__name__ = '__main__'
    spec.loader.exec_module(module)
    sys.exit(0)

But yeah, a "For now here are the 'ok-ish' options" bit of documentation might be helpful; this is a bit buried (yet existing).

@leycec
Copy link
Member

leycec commented Mar 20, 2024

It was pretty easy:

proceeds to list five hard bullet points

Drop a solution, run for the lakes, hope the coworkers never catch on, get paid to do so

"Drop a solution, run for the lakes" <-- quoted for truth

The next time my wife begs me to do something about the rotting food and dish piles, my one-liner is preprepared.

But, you speak of event horizons? Truly eldritch implementations? DID HE ASK FOR MORE

I know not what I speak of! I recant! I dissemble! No, not that! Anything but... the eldritch abominaeeeeeeeeeeeeeee— 💀

There's the potential feasibility that this functionality could be implemented in a...

Actually, that's ingenious. Let's reopen this as a new feature request to improve the fetid intelligence of beartype_this_package(). In theory, beartype_this_package() should be able to internally incorporate the black magic of beartype_this_executable() by incorporating the diabolical automation you suggested to "further do away with the args by invoking inspect."

beartype_this_package() would then have two internal modes of operation:

  1. The current package-oriented mode.
  2. If that fails, fallback to the script-oriented mode outlined by your beartype_this_executable().

I see no fault with black magic. Let thy minions of darkness arise, Overlord @JWCS! Geh heh heh heh.

@leycec leycec reopened this Mar 20, 2024
@leycec leycec changed the title Automagic for Python's script runner? [Feature Request] Generalize beartype.claw.beartype_this_package() to transparently support unpackaged scripts as an internal fallback Mar 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants