Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-87533: Expand pickle importing to support non-package C-modules #119152

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

ringohoffman
Copy link

@ringohoffman ringohoffman commented May 18, 2024

There have been recurring issues with PyModule_Create modules in PyTorch. When trying to serialize attributes of these C-modules, pickle fails to import the C-module because they are not a packages.

This is the current issue that brought this to my attention:

The existing hack to get around this has been to insert the C-module into sys.modules in order to enable pickle to find them. Instead of relying on this hack, we can change pickle's approach to loading, which is currently equivalent to import package.c_module.

Instead, we can do what equates to from package import c_module when the module is not a package. This is nice because:

  1. does not care if c_module is a package or not
  2. is fully backward compatible with the previous approach
  3. slots in nicely to the fromlist parameter of __import__, which we are already using to load modules in pickle (albeit a bit strangely since we aren't currently using the return value of __import__, which can also be tied us not currently using fromlist).

We still retrieve the module from sys.modules instead of by attribute on its parent module when it is a package. It can lead to some rather unexpected behavior when trying to manipulate sys.modules to control available packages as we do in some existing tests.

Copy link

cpython-cla-bot bot commented May 18, 2024

All commit authors signed the Contributor License Agreement.
CLA signed

@bedevere-app
Copy link

bedevere-app bot commented May 18, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

There have been recurring issue with PyModule_Create modules in PyTorch; when trying to serialize attributes of these C-modules, pickle fails to import the C-module because it is not a package

This is the current issue that brought this to my attention: pytorch/pytorch#126154

The existing hack to this issue has been to insert the C-module into sys.modules in order to enable pickle to find them: https://github.com/pytorch/pytorch/pull/38136/files#diff-d7e90d0f94b43db763b44fba679a5c1b4cabe3668aaf34f2aee07de8e2d1b2faR524-R528

Instead of relying on this hack, we can change `pickle`'s  approach to loading, which is currently equivalent to `import package.c_module`; instead, we could do `from package import c_module`, which 1) does not care if `c_module` is a package or not 2) is fully backward compatible with the previous approach and 3) slots in nicely to the `fromlist` parameter of `__import__`, which we are already using to load modules in `pickle`
@ringohoffman ringohoffman force-pushed the picklable-c-module-attributes branch from 9b8377e to 52a2a20 Compare May 18, 2024 19:30
@bedevere-app
Copy link

bedevere-app bot commented May 18, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@bedevere-app
Copy link

bedevere-app bot commented May 18, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@ringohoffman ringohoffman force-pushed the picklable-c-module-attributes branch from f095f11 to 1831907 Compare May 18, 2024 20:08
@bedevere-app
Copy link

bedevere-app bot commented May 18, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@bedevere-app
Copy link

bedevere-app bot commented May 18, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@ringohoffman
Copy link
Author

ringohoffman commented May 18, 2024

Definitely some pretty interesting stuff going on in: https://github.com/python/cpython/blob/main/Lib/test/test_datetime.py

Testing outside of this file, I can't seem to reproduce this behavior so I think the errors are specific to whatever is going on in test_datetime.py.

EDIT: Each of fast_tests and pure_tests pass independently. I am pretty sure something funky is happening with the manipulation of sys.modules and import_fresh_module.

Also remove incorrect notes and unnecessary steps
…rent

Otherwise you might set a value for the module in sys.modules, but have it go unused and instead be accessed through the parent
@ringohoffman
Copy link
Author

ringohoffman commented May 22, 2024

@serhiy-storchaka can I ask for your opinion on this? I see you have been one of the most active contributors to pickle for some time now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant