Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python snapshots: Only load dynamic libraries that are needed #1872

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

hoodmane
Copy link
Contributor

@hoodmane hoodmane commented Mar 20, 2024

Rather than actually preloading all libraries, we just preallocate space for them. There is a function called getMemory that determines the location of the dynamic libraries and nothing else so we can patch this to ensure that the libraries get allocated in their dedicated location if they are loaded at all. This allows us to ensure that their metadata always lands in the right spot. We also make sure to load all the libraries in the correct order so that they end up in the correct spot in the function pointer table.

There is also the possibility that someone could use ctypes and mess everything up. ctypes also doesn't work with our snapshots before this PR. It could be fixed by patching libffi to record the trampoline address and function table slot into the DSO_METADATA and then recreate them the same way when restoring the snapshot. We'd also need to record the function table base for each loaded library. Once we do all this, we should be safe again...

@hoodmane hoodmane marked this pull request as ready for review March 20, 2024 23:00
@hoodmane hoodmane requested review from a team as code owners March 20, 2024 23:00
@dom96
Copy link
Collaborator

dom96 commented Mar 21, 2024

Since this breaks the memory snapshot format, it's a good time to add a version header to the memory snapshots so we can more easily support multiple versions of them.

@dom96
Copy link
Collaborator

dom96 commented Mar 21, 2024

I also wonder if we could use capnp for the memory snapshot in JS

@hoodmane hoodmane force-pushed the hoodmane/prealloc-dynlibs branch 2 times, most recently from 6d4f003 to 585386e Compare March 21, 2024 13:40
Nonfunctional change. Rename reportError.js into util.js and move
simpleRunPython there. Also run formatter.
Rather than actually preloading all libraries, we just preallocate space for
them. There is a function called `getMemory` that determines the location of the
dynamic libraries and nothing else so we can patch this to ensure that the
libraries get allocated in their dedicated location if they are loaded at all.
This allows us to ensure that their metadata always lands in the right spot.

We still have to make sure to load all the libraries in the correct order. There
is also the possibility that someone could use ctypes and mess everything up.
Though ctypes doesn't work with our snapshots before this PR. I guess it could
be fixed by patching libffi to record the trampoline address and function
table slot into the DSO_METADATA and then recreate them the same way when
restoring the snapshot. We'd also need to record the function table base for
each loaded library. Once we do all this, we should be safe again...
@jasnell jasnell added the python Issues/PRs relating to Python Workers label Apr 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
python Issues/PRs relating to Python Workers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants