Experiment: Allows to introspect Python modules from cdylib #3977
Thanks for moving this forward! The idea of using custom data sections is new to me. I see the upsides of it, though I am slightly worried by the extra complexity of needing to worry about linker details in yet another way.
I agree that having the macros generate file(s) is unlikely to be the right solution 👍
For the library which converts the metadata to In fact, I wonder if for option (3) explored here then the using a test to generate and update the
If we agree that using a test to update stubs is a good solution, then I think the choice between runtime code like (2) and data segments like (3) is probably just influenced by whatever is easier for us to implement. We might even be able to swap back and forth between these two options as an implementation detail as we learn. |
Yes! Having played a bit with both, runtime code like (2) is way easier (the difference between this MR and #2454 is quite significant). If I try to summarize the pros of each approach: Approach 3: add introspection to the cdylib and let maturin write the stubs on build:
Approach 4: use a test to write/update stubs:
|
You make a very good point that automated tests to update a One thing that I expect is that Overall I don't have a good sense of whether option 3 or 4 is better. In an ideal world we might offer both options. Which one do you think would meet your needs better at present? Maybe we start by implementing that and we learn a lot by doing so! |
I would love to avoid having people handwrite customizations in stubs, because that makes automatically updating the stubs when the Rust code changes very hard. Imho automatically updating stubs to reflect changes in the Rust code is the main value proposition of auto-generating stubs in the first place. An idea: add entry points in the PyO3 macros to extend the stubs. For example (rough idea, not sure about the actual details):
#[pymodule]
#[py(extra_stub_content = "
K = TypeVar('K')
V = TypeVar('V')
")]
mod module {
    #[pyclass(stub_parent = [collections::abc::Mapping::<K,V>])]
    struct CustomDict;
    #[pymethods]
    impl CustomDict {
        #[pyo3(signature = (key: K), return_type = V | None)]
        fn get(key: &Bound<'_, PyAny>) -> Option<Bound<'_, PyAny>> {
        }
    }
}
would generate
K = TypeVar('K')
V = TypeVar('V')
class CustomDict(collections.abc.Mapping[K,V]):
    def get(key: K) -> V | None: ...
This way stubs would stay auto-generated but can be improved by the author. A possible way to mix options 3 and 4:
|
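To make the proposal above more concrete, here is a minimal sketch of how collected metadata could be rendered into stub text. Everything in it is an assumption made for illustration: the `Module`, `Class`, and `Method` structs and the `render_stubs` function are hypothetical and are not PyO3 types. The sketch also assumes instance methods and therefore always emits a `self` parameter.

```rust
// Hypothetical data model: illustrative only, not PyO3's introspection types.

/// One method of a class, with annotations already expressed as Python type strings.
struct Method {
    name: &'static str,
    /// Arguments as (name, annotation) pairs, excluding `self`.
    args: Vec<(&'static str, &'static str)>,
    return_type: &'static str,
}

/// One class, with optional parent classes used only in the stub.
struct Class {
    name: &'static str,
    parents: Vec<&'static str>,
    methods: Vec<Method>,
}

/// One module, with author-supplied lines copied verbatim into the stub.
struct Module {
    extra_stub_content: Vec<&'static str>,
    classes: Vec<Class>,
}

/// Renders the module as the text of a `.pyi` stub file.
fn render_stubs(module: &Module) -> String {
    let mut out = String::new();
    // Author-supplied content comes first, e.g. type variable declarations.
    for line in &module.extra_stub_content {
        out.push_str(line);
        out.push('\n');
    }
    for class in &module.classes {
        out.push_str("class ");
        out.push_str(class.name);
        if !class.parents.is_empty() {
            out.push('(');
            out.push_str(&class.parents.join(", "));
            out.push(')');
        }
        out.push_str(":\n");
        if class.methods.is_empty() {
            // An empty class body still needs a placeholder.
            out.push_str("    ...\n");
        }
        for method in &class.methods {
            // Assumption: every method is an instance method, so `self` comes first.
            let mut params = vec!["self".to_string()];
            params.extend(method.args.iter().map(|(n, t)| format!("{}: {}", n, t)));
            out.push_str(&format!(
                "    def {}({}) -> {}: ...\n",
                method.name,
                params.join(", "),
                method.return_type
            ));
        }
    }
    out
}
```

Because the author-supplied lines are stored next to the generated metadata, regenerating the stubs after a Rust change would not lose the customizations.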
Agreed that having the proc macros be able to collect all the necessary information would be nice. I think only time will tell whether they can meet all user needs! I'm slightly wary of coupling to cc @messense do you see any concerns with adding this to So what are the next steps here? Do you want me to start reviewing this code, or will you push more first? Regarding the data sections, I happened to hear yesterday that UniFFI's proc macros can do something similar about shipping definitions in the shared library, so it might be interesting to look at / ask them how that was implemented. |
No concern, I think a |
Thank you!
Yes! My hope is to cover as many as possible.
Additionally to
Thank you! I'm going to have a look at it.
I think the current draft already shows the relevant direction; a very high-level code review to check whether it's going in the right direction would be welcome. Maybe wait for me to have a look at UniFFI, as I might change this MR a bit if I find interesting things there. Thank you! |
This is very exciting, looking forward to being able to generate type stubs! Currently we have this lengthy and hard to maintain Python script for doing so, which we have to update by hand: https://github.com/Chia-Network/chia_rs/blob/main/wheel/generate_type_stubs.py This would be a major improvement. Happy to help out however I can (testing, implementation, whatever) as time allows, to hopefully get this out the door 😄 |
@Rigidity Thank you! I plan to work on this MR to get the basics done. Then there will be a lot of features to add incrementally on top of it (support for all PyO3 features...), so help with coding and testing will be very welcome! |
Sorry for the very long reaction delay (a lot of priorities + vacations).
I had a look at UniFFI; they basically use the same approach as us: embedding the metadata in the binary and then parsing the binary using the same @davidhewitt If you have time, could you have a quick look at the MR to see if the overall design is going in a good direction? If yes, I will fix the many shortcuts I took and get the MR ready for review. |
This is a prototype for discussion. The code is not clean, it is buggy, and only a few features are implemented.
A missing piece in the story listed in #2454 is how tools like Maturin move the introspection information generated by PyO3 into the type stub files included in the built wheels.
I see three approaches for it:
pyo3-macros generate a file with the stubs after having processed all macros of the crate. This has the advantage of being self-contained in the crate but falls short in cases like Python classes declared in one crate but exposed in another: there is no guarantee that the proc macros of a crate and its dependencies are compiled in the same process, or that proc macros will still be able to write files in the future (for example with the proposal to run them in a WASM sandbox).
__pyo3_stubs_MODULE_NAME function. However, for the build system to execute it, a compatible Python interpreter must be present to link with, and a compatible CPU or VM to run it, making generation during cross-compilation very hard. I guess it's what Python Interface (.pyi) generation and runtime inspection #2454 was heading toward.
Architecture:
pyo3_data0 section that contains a JSON "chunk" with the introspection data. Code in pyo3-macros-backend/src/introspection.rs. I had to do some bad hack to generate the segments properly via Rust static elements.
PYO3_INTROSPECTION_ID constants, allowing the code building the JSON chunk to get the global id of e.g. a class C via C::PYO3_INTROSPECTION_ID. This allows chunks to refer to other chunks easily (e.g. to list module elements). A bad hack is used to generate the ids (TypeId::of would have been a nicer approach but is not const on Rust stable yet).
0 at the end of pyo3_data0 is a version number, allowing breaking changes in the introspection format.
pyo3-introspection crate parses the binary using goblin (library also used by Maturin), fetches all the pyo3_data0 segments (only implemented for macOS Mach-O in this experiment), and builds nice data structures.
pyo3-introspection crate would implement a to_stubs function converting the data structures to type stubs.
pyo3-introspection has an integration test doing introspection of the pyo3-pytests library.
Experiment limitations (i.e. TODO if we want to move this forward):
static elements is awful.
FromPyObject::type_input into an associated constant or a const function, and similarly for IntoPy::type_output. This is mandatory in order to use them in the static values added to the generated binary.
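
The architecture described above can be illustrated with a small, self-contained sketch. It is not the code from this PR: the `Introspect` trait, the JSON layout, the id value, and the `total_len` / `concat` helpers are all assumptions chosen for illustration. It shows why ids and type information need to be constants: a `static` placed in a custom section can only be initialized from values known at compile time. The section naming assumes Mach-O on macOS (`segment,section`) and a bare section name elsewhere.

```rust
// Hypothetical sketch: the trait, JSON schema, and ids are illustrative, not PyO3's API.
// Note: `unsafe(link_section = ...)` needs Rust 1.82+; older toolchains use the bare form.

/// Compile-time metadata attached to each introspected Rust type.
trait Introspect {
    /// Global id that other chunks use to refer to this item.
    const PYO3_INTROSPECTION_ID: &'static str;
    /// Python-side name; an associated constant so `static` initializers can read it.
    const TYPE_NAME: &'static str;
}

struct CustomDict;

impl Introspect for CustomDict {
    const PYO3_INTROSPECTION_ID: &'static str = "0";
    const TYPE_NAME: &'static str = "CustomDict";
}

/// Total byte length of all parts, computed at compile time.
const fn total_len(parts: &[&str]) -> usize {
    let mut len = 0;
    let mut p = 0;
    while p < parts.len() {
        len += parts[p].len();
        p += 1;
    }
    len
}

/// Concatenates the parts into a fixed-size byte array at compile time.
const fn concat<const N: usize>(parts: &[&str]) -> [u8; N] {
    let mut out = [0u8; N];
    let mut pos = 0;
    let mut p = 0;
    while p < parts.len() {
        let bytes = parts[p].as_bytes();
        let mut i = 0;
        while i < bytes.len() {
            out[pos] = bytes[i];
            pos += 1;
            i += 1;
        }
        p += 1;
    }
    assert!(pos == N, "array length must match the total length of the parts");
    out
}

// The class chunk carries its own id and name.
const CLASS_PARTS: &[&str] = &[
    r#"{"type":"class","id":""#,
    CustomDict::PYO3_INTROSPECTION_ID,
    r#"","name":""#,
    CustomDict::TYPE_NAME,
    r#""}"#,
];

// The module chunk refers to the class chunk through the class id.
const MODULE_PARTS: &[&str] = &[
    r#"{"type":"module","name":"module","members":[""#,
    CustomDict::PYO3_INTROSPECTION_ID,
    r#""]}"#,
];

// `#[used]` keeps the statics even though no Rust code reads them.
#[used]
#[cfg_attr(target_os = "macos", unsafe(link_section = "__DATA,pyo3_data0"))]
#[cfg_attr(not(target_os = "macos"), unsafe(link_section = "pyo3_data0"))]
static CLASS_CHUNK: [u8; total_len(CLASS_PARTS)] = concat(CLASS_PARTS);

#[used]
#[cfg_attr(target_os = "macos", unsafe(link_section = "__DATA,pyo3_data0"))]
#[cfg_attr(not(target_os = "macos"), unsafe(link_section = "pyo3_data0"))]
static MODULE_CHUNK: [u8; total_len(MODULE_PARTS)] = concat(MODULE_PARTS);
```

A tool reading the built library would then locate the section by name and parse each chunk as JSON, resolving the `members` ids against the ids of the other chunks. If `TYPE_NAME` were a trait method instead of an associated constant, it could not be used in these `static` initializers, which is the limitation noted for `FromPyObject::type_input` and `IntoPy::type_output`.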