FRAME_OWNED_BY_CSTACK breaks greenlet introspection; can CPython help? #113037

Open
oremanj opened this issue Dec 13, 2023 · 2 comments
Labels
type-bug An unexpected behavior, bug, or error

Comments

oremanj commented Dec 13, 2023

Bug report

Bug description:

The popular greenlet package implements cooperative multitasking by moving parts of the C stack around. The active greenlet has all of its stack in the expected place, but a suspended greenlet might have spilled part of its stack to the heap in order to allow the active greenlet to use the same region of stack. For many years, this has worked fine in practice because storage on the stack is generally not reachable from Python objects on the heap. The introduction of interpreter frames stored on the C stack in #96319 broke this assumption: under Python 3.12 and later, if you can get hold of a frame object from a suspended greenlet, you can crash the interpreter by following f_back links until you reach one that would traverse an entry frame. A workaround was added in python-greenlet/greenlet@40646dc, but it only protects the innermost greenlet frame (which is the easiest one to access, since greenlets provide a gr_frame attribute to retrieve it) by severing its link with the rest of the greenlet stack when the greenlet is suspended. This hampers the ability to understand what a suspended greenlet is doing, and it doesn't even completely resolve the crash, because there are other ways to obtain a non-innermost greenlet frame.
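For illustration, the kind of introspection at stake is an ordinary walk along f_back links. The sketch below uses only the standard library and walks the currently running stack, which is safe; the helper names (walk_stack, inner, outer) are made up for this example. The problem described above would arise when the same loop starts from a frame that belongs to a suspended greenlet, which is not reproduced here.

```python
import sys


def walk_stack(frame):
    """Collect function names by following f_back until the chain ends."""
    names = []
    while frame is not None:
        names.append(frame.f_code.co_name)
        # On Python 3.12+ this attribute access is the step that may touch
        # an entry frame stored on the C stack.
        frame = frame.f_back
    return names


def inner():
    # sys._getframe() is CPython-specific; it returns the running frame.
    return walk_stack(sys._getframe())


def outer():
    return inner()


names = outer()
```

The first two entries of names are the frames for inner and outer; whatever called outer follows them.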

I filed python-greenlet/greenlet#388 against greenlet to discuss ways greenlet could work around the C-stack-based interpreter frames. None of the options are really palatable; they all involve taking new dependencies on CPython internals, as well as some tradeoff between unsoundness (exposing frame objects whose f_back attribute will crash the interpreter when accessed) and poor performance (needing to walk the stack on every greenlet suspend/resume). I'm wondering if there's anything that could be done on the CPython side to better support this use case.

The easiest solution from greenlet's perspective would be to just not store interpreter frames on the C stack. It appears feasible to store the entry frames on the per-thread frame stack instead; to maintain stack discipline, the entry frame for evaluating an owned-by-thread frame would need to be allocated before the owned-by-thread frame, but that doesn't look like a blocker (in fact both could be allocated simultaneously). Another option would be to use a single static interpreter frame object for all entry frames, and to store their previous pointers (the only portion that definitely needs to vary from one entry frame to the next) on a new per-thread stack. Since entry frames return using a different bytecode instruction than non-entry frames, this wouldn't introduce additional branching in the eval loop, only in frame introspection (the f_back getter, etc.).
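To make the second option more concrete, here is a toy Python model of it. This is not CPython's actual data layout: Frame, ENTRY, ThreadState, entry_previous, enter_eval and call are all hypothetical names invented for the sketch. It only models walking from the current top frame; how an arbitrary mid-stack frame would find its slot in the side stack is left open.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Frame:
    name: str
    previous: Optional["Frame"] = None


# One shared sentinel stands in for every entry frame.
ENTRY = Frame("<entry>")


@dataclass
class ThreadState:
    # Hypothetical per-thread stack holding what would otherwise be each
    # entry frame's own `previous` pointer.
    entry_previous: List[Optional[Frame]] = field(default_factory=list)
    current: Optional[Frame] = None

    def enter_eval(self, name: str) -> None:
        """Model a C-level call into the eval loop."""
        self.entry_previous.append(self.current)
        self.current = Frame(name, previous=ENTRY)

    def call(self, name: str) -> None:
        """Model a Python-to-Python call with no entry frame in between."""
        self.current = Frame(name, previous=self.current)

    def walk(self) -> List[str]:
        """Walk from the top frame, resolving ENTRY via the side stack."""
        names = []
        depth = len(self.entry_previous)
        frame = self.current
        while frame is not None:
            if frame is ENTRY:
                # The extra branch lives here, in introspection only.
                depth -= 1
                frame = self.entry_previous[depth]
            else:
                names.append(frame.name)
                frame = frame.previous
        return names


ts = ThreadState()
ts.enter_eval("main")
ts.call("helper")
ts.enter_eval("callback")
ts.call("inner")
stack_names = ts.walk()
```

In this model nothing reachable from a Frame lives on the C stack, which is the property greenlet would need.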

Another category of potential solution would still keep entry frames on the C stack, but would store enough information in the interpreter frame object under evaluation that it would be able to skip its entry-frame parent without accessing any portion of it. The easiest approach here would be to add a new previous_heap pointer (name for discussion purposes only) which is like previous but skips entry frames. That would increase the size of the interpreter frame structure, which might not be acceptable. If taking that size bump is OK, then the rest of the solution is trivial: just make f_back follow previous_heap instead of previous.
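A toy Python model of the previous_heap idea might look like the following. It is a sketch under assumptions, not the real interpreter frame struct: InterpFrame, push and the is_entry flag are hypothetical, and only the previous, previous_heap and f_back names come from the discussion above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class InterpFrame:
    name: str
    is_entry: bool = False
    previous: Optional["InterpFrame"] = None       # may be an entry frame
    previous_heap: Optional["InterpFrame"] = None  # never an entry frame


def push(name, parent, is_entry=False):
    """Create a frame on top of `parent`, computing previous_heap once."""
    if parent is None:
        heap_parent = None
    elif parent.is_entry:
        # Reuse the answer the entry frame already computed for itself.
        heap_parent = parent.previous_heap
    else:
        heap_parent = parent
    return InterpFrame(name, is_entry, parent, heap_parent)


def f_back(frame):
    """Introspection follows previous_heap, so entry frames are never read."""
    return frame.previous_heap


main = push("main", None)
entry = push("<entry>", main, is_entry=True)  # would live on the C stack
callback = push("callback", entry)
```

Here callback.previous still points at the entry frame, so the eval loop is unchanged, while f_back(callback) returns main without reading the entry frame at all.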

Maybe someone who's more familiar with interpreter internals than I am can come up with an option that's better than any of these. But it would be really useful for greenlet if we could somehow eliminate the recently-introduced requirement to access the C stack in the course of walking the Python stack. Thanks for your consideration.

CPython versions tested on:

3.12, 3.13, CPython main branch

Operating systems tested on:

Linux

@oremanj oremanj added the type-bug An unexpected behavior, bug, or error label Dec 13, 2023
@terryjreedy
Member

@markshannon Shim frames are an issue for greenlets.

@oremanj
Author

oremanj commented Dec 17, 2023

For reference, I submitted a PR to greenlet that resolves this on their side: python-greenlet/greenlet#393

I think it will work robustly enough, at least until the next big change to how frames are represented, but it's taking a number of dependencies on CPython internals so I'd still like to explore any possible upstream changes that would make this easier to maintain.
