Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix reloader for modules run as scripts #1336

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

lmmarsano
Copy link

Modules run as scripts with python -m flag crash for run(reloader=True), eg,

$ python -m servertestpackage.module wsgiref 8000 --reload
Traceback (most recent call last):
  File "/home/LOM0227/project/bottle/test/servertestpackage/module.py", line 3, in <module>
    import servertestpackage.absolute
ModuleNotFoundError: No module named 'servertestpackage'

bottle.py generates its own command missing -m and using the module's file path to spawn the child process, which breaks the import system.
To see the failures, checkout the previous commit in this branch & run the tests.

git checkout @^
python -m unittest test.test_server

Through some guesswork, these commits attempt to reproduce the original command, so that doesn't happen.

I'm unsure this is the best way to get the original command.
Is there a better way?

prepare test fixtures consisting of a package and module with relative & absolute import
test scripting server runs with these variations
- from package or from module
- with or without reloader
the server code is based off servertest.py
python doesn't have an obvious way to reveal the original command and module name, so we check the __main__ module for __package__ and infer it's a module running as a script unless the value is None, which implies a file path was provided
modules running as scripts are invoked with -m flags
packages running with scripts have base filename __main__.py
from this we recreate an invocation
@defnull
Copy link
Member

defnull commented Mar 26, 2021

I don't see a bug here that needs fixing. Let me quote https://docs.python.org/3/tutorial/modules.html#intra-package-references

Note that relative imports are based on the name of the current module. Since the name of the main module is always __main__, modules intended for use as the main module of a Python application must always use absolute imports.

So, relative imports are supposed to not work. And if servertestpackage cannot be found as a module, then the import path is not correct? I'm not sure if I understand the problem.

@lmmarsano
Copy link
Author

Are you saying that code that runs fine without reloader should crash with reloader?
These crashes are caused by the reloader not giving child processes the same command it received to start the main module.

The specs are more authoritative PEP 366, and that bit of advice in the tutorial only applies to unsupported versions of Python.
Since then, __package__ was invented to support explicit relative imports for the -m switch.

This will allow relative imports to work correctly from main modules executed with the -m switch.

Moreover, the semantics of -m differ from running a script directly.
-m imports the containing package & inserts the current directory (instead of the directory containing the main module/script) into sys.path (PEP 338).
Therefore, a correct reloader shouldn't treat loading the main module with or without -m interchangeably.
However,

bottle/bottle.py

Line 3671 in f9b1849

args = [sys.executable] + sys.argv

does: it runs a child process without -m using the main module's file path, resulting in these failures.

@defnull
Copy link
Member

defnull commented Mar 27, 2021

Are you saying that code that runs fine without reloader should crash with reloader?

No, I was saying that "I'm not sure if I understand the problem.". Thanks for the clarifying explanation.

So, it looks like python is lying to us when called with -m, because sys.executable and sys.args do not contain the actual executable or arguments. I did not know that. If I understand that correctly, this is not an issue for single-file modules, because python3 -m module and python3 module.py both work as expected. It is only an issue for packages with a __main__.py because: a) if called directly, the __package__ variable is not set and relative imports do not work and b) sys.path is initialized differently.

$ python3 -mpkg
__name__ =  __main__
__package__ =  pkg
sys.path =  ['/tmp', ...]
sys.executable =  /usr/bin/python3
sys.argv =  ['/tmp/pkg/__main__.py']

$ python3 /tmp/pkg/__main__.py
__name__ =  __main__
__package__ =  None
sys.path =  ['/tmp/pkg',  ...]
sys.executable =  /usr/bin/python3
sys.argv =  ['/tmp/pkg/__main__.py']

In short: If we detect a -m invocation (if __name__ == '__main__' and __package__ is not None) then we should replace sys.argv[0] with ["-m", __package__] to get the expected behavior. Correct?

@defnull
Copy link
Member

defnull commented Mar 27, 2021

Nah, that would have been too easy. Bottle needs to check __package__ of the __main__ module if it is not itself the main module. So your more complex workaround is really necessary. Hmm.

@defnull
Copy link
Member

defnull commented Mar 27, 2021

Sooo... would that work?

args = [sys.executable] + sys.argv
if getattr(sys.modules.get('__main__'), '__package__', None):
    # If a package was loaded with `python -m`, then `sys.argv`
    # is wrong and needs fixing in some cases. See #1336
    args[1:1] = ["-m", sys.modules['__main__'].__package__]

Reasoning: There is always a __main__ module. If its __package__ is None (direct call) or empty (-m with a single-file module) then no fixing is needed. Calling the script directly is fine. If it has a value, replacing python3 /path/to/package/__main__.py with python3 -m package restores the desired behavior.

Edit: This is basically what you did, but a bit shorter. Did I miss some edge-case here?

@lmmarsano
Copy link
Author

It is only an issue for packages with a __main__.py

Close: non-package modules are affected, too.
That's where I originally encountered the issue.
If you checkout the previous commit & run the tests, then you'll see 2 failures:

  • package reload (ie, script __main__.py)
  • module reload (script basename matches module name)

I included the package test case for thoroughness.

In the module case, I'd think you need the full name: __package__ would only provide the parent package's.

Even in the single file case, sys.path is affected, and my knowledge isn't exhaustive.
If the script doesn't operate on information that depends on the -m switch, then the behavior with or without it is the same.

About the reasoning, I'd try to confirm all assumptions in the specs, and test them even then.

  • __main__ always exists: Complete Python programs appears to support that
  • __package__: Import-related module attributes covers the module cases, and PEP 366 covers the filename case ('When the main module is specified by its filename, then the __package__ attribute will be set to None.')
    • filename: __package__ is None
    • top-level module: __package__ == ''
    • package: __package__ is the package's name
    • submodule: __package__ is the parent package's name

In light of this, my approach misses the top-level case (-m toplevel).
The function I wrote could be replaced with a constant computed at most once.
It also feels like a kludge.
I'd prefer retrieving the original call: however, Python doesn't seem to offer an obvious way, and I haven't dug too deeply.

these will support top-level module script test cases
arrange package script to import submodule
new test cases check the top-level module name is prepared correctly
module tests cases are renamed submodule for clarity
refactor function to return main script arguments
@defnull
Copy link
Member

defnull commented Jul 7, 2021

It's been a while and I (accidentally) pushed the (broken) partial solution I did a while ago to master now :/

My current best guess for a complete solution is this monster:

def _fixed_argv():
    main = sys.modules.get('__main__')
    package = getattr(main, '__package__', None)
    if package is None: # python3 /path/to/file.py
        return sys.argv

    filepath = getattr(main, '__file__', sys.argv[0])
    module = os.path.splitext(os.path.basename(filepath))[0]
    if module == '__main__': # python3 -m package
        return ["-m", package] + sys.argv[1:]
    if package == '': # python3 -m module
        return ["-m", module] + sys.argv[1:]
    # python3 -m package.module
    return ["-m", package + "." + module] + sys.argv[1:]

Lots of code for an edge case that was never really supported. And even with that, it still breaks if the app changes the working directory (as the test server scripts do). Perhaps we should drop support for -m completely and tell users to either run scripts by path, or use bottle.py --reload package.module or python3 -m bottle --reload package.module. These both work, because bottle is always a runnable file and never a package.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants