async fn tracking issue #1632

Open
davidhewitt opened this issue May 25, 2021 · 43 comments · Fixed by #3540

Comments

@davidhewitt
Member

davidhewitt commented May 25, 2021

This issue is a placeholder issue which I'd like to use to keep track of what we need to do to support async fn for #[pyfunction] (and #[pymethods]).

While this would be a great feature to have, I think anything which we merge into PyO3 core should be simple to use, performant, and (ideally) async-runtime agnostic.

Users have started researching the design space of how possible cross-language implementations could be written. The most advanced candidate is pyo3-asyncio. See the guide as a great example of how to use it: https://pyo3.rs/latest/ecosystem/async-await.html

At the moment I suggest we allow some time for that and other crates to mature, and once we understand performance trade-offs / difficulties etc. we can consider upstreaming something here.

As this is supported by third-party crates I think we can afford patience for now.

@awestlake87
Contributor

pyo3-asyncio still needs some time and development to mature. There are currently a few things that I think need to be addressed:

  • Documentation and testing for pyo3-asyncio within native modules - seems like many users need this to get started
  • Cleaner initialization - better asyncio configuration, potentially some AlreadyInitialized errors, etc.
  • Stream <-> async generator conversions - my first draft had some performance concerns
  • Cleaner error handling - no complaints yet, but I have a few nitpicks here and there

I've been taking a bit of a passive approach to this library lately to see what users need it for, and it seems like these are the main concerns right now.

@JohnScience

For the time being, using async conditionally can be enabled with qualifier_attr.

@wyfo
Contributor

wyfo commented Oct 2, 2023

I've just released pyo3-async (see reddit).
Its approach is completely different from pyo3-asyncio, as it is runtime-agnostic on the Rust side (on the Python side, it needs to know the runtime to yield the correct objects, but the runtime doesn't need to be running to instantiate coroutines/async generators).
I've also added support for trio, as it's quite popular (and supporting it was simple enough, unlike curio for example).
Last but not least, it supports async generators (simply reusing its coroutine implementation for __anext__), and cancellation.

Adding a macro for async fn is on my TODO list, but I think I have to discuss it here first, to see whether the implementation could be good enough to be mentioned in the documentation.

EDIT: This crate is just a one-week POC (actually the first time I've ever used PyO3). It lacks advanced tests, documentation examples, benchmarks, etc. I just wanted to explore how the event loops can interoperate, and because it worked well, I've written this crate to submit it for your feedback.

@wyfo
Contributor

wyfo commented Oct 9, 2023

I've run some quick benchmarks to compare pyo3-async and pyo3-asyncio. The results are as expected:

  • for wrapping an already-ready future, the performance isn't comparable (more than 10x faster, and the gap can be even larger if the event loops are busy). This is because pyo3-async doesn't spawn a task on either the Python or the Rust side, so a successful poll is done synchronously within the Python await operation, which becomes synchronous too. On the other side, pyo3-asyncio will always be asynchronous.
  • you can also wrap a spawned tokio task in a pyo3-async coroutine to emulate pyo3-asyncio's behavior, and then you get comparable performance.

Here is a simple example illustrating the first point:

use pyo3::prelude::*;
use pyo3_async::asyncio;

#[pymodule]
fn example(_py: Python<'_>, m: &PyModule) -> PyResult<()> {
    m.add_class::<Sender>()?;
    Ok(())
}

#[derive(Clone)]
#[pyclass]
struct Sender(tokio::sync::mpsc::Sender<()>);

impl Sender {
    async fn future(self) -> PyResult<()> {
        self.0.send(()).await.ok();
        Ok(())
    }
}

#[pymethods]
impl Sender {
    #[new]
    fn __new__(capacity: usize) -> Self {
        let (tx, mut rx) = tokio::sync::mpsc::channel(capacity);
        pyo3_asyncio::tokio::get_runtime().spawn(async move { while rx.recv().await.is_some() {} });
        Self(tx)
    }

    fn send_async(&self) -> asyncio::Coroutine {
        asyncio::Coroutine::from_future(self.clone().future())
    }

    fn send_asyncio<'py>(&self, py: Python<'py>) -> PyResult<&'py PyAny> {
        pyo3_asyncio::tokio::future_into_py(py, self.clone().future())
    }
}

# Python benchmark script
import asyncio
import example  # built with maturin
import time

CAPA = 100
N = 10000

async def send_async():
    sender = example.Sender(CAPA)
    start = time.perf_counter()
    for _ in range(N):
        await sender.send_async()
    print("pyo3-async", time.perf_counter() - start)

async def send_asyncio():
    sender = example.Sender(CAPA)
    start = time.perf_counter()
    for _ in range(N):
        await sender.send_asyncio()
    print("pyo3-asyncio", time.perf_counter() - start)
    
for _ in range(3):
    asyncio.run(send_async())
    asyncio.run(send_asyncio())
    print("===============")

P.S. I'm currently implementing a pyasync macro, and I've hit some interesting issues with GIL-bound parameters, but I think I've found an elegant solution (with a bit of unsafe behind it).

@wyfo
Contributor

wyfo commented Oct 10, 2023

I've published a new release of pyo3-async with #[pyfunction]/#[pymethods] support!

@davidhewitt Would you mind taking a look at this crate? Could it be worth mentioning in the PyO3 documentation?

@davidhewitt
Member Author

@wyfo sorry for the slow reply, this looks like a very interesting development! I'm reading through the code today and will let you know what I think ASAP!

@davidhewitt
Member Author

Overall I think it looks like a great crate, though I would suggest you add some CI to test with multiple Python / OS versions. Hopefully our testing in CI here means that you don't have any platform-specific quirks, but I wouldn't guarantee that.

This would definitely be worth a mention in the PyO3 documentation. I also suspect there are common parts which could be shared with pyo3-asyncio (and both projects may benefit from sharing what they can for interoperability's sake).

At the moment I think the key API reason which would prevent pyo3-async from being brought into the main PyO3 package is that you have a couple of #[pyclass] types to represent the various coroutines and generators on the Python side. The downside of this is that these end up as unique types in each Python extension module compiled using pyo3-async. Apart from PanicException, which exists out of necessity (and which we have discussed a lot), we have avoided adding #[pyclass] types into PyO3 itself.

This prompted me to ask the CPython core devs in the Python discourse how they might want us to deal with these #[pyclass] types. Let's see what they say.

A couple of more specific thoughts:

  • It's very nice to see a runtime-agnostic solution to how to bind Rust and Python futures.
  • The fact you need to re-wrap the PyO3 macros is unfortunate though also understandable. I wonder if there's something which can be done to avoid the need for you to do this? At least from a cursory scan of the implementation you're not overly coupled to the PyO3 macro implementations.

@wyfo
Contributor

wyfo commented Oct 13, 2023

CI is indeed on the TODO list; my first goal was to write the code to illustrate the potential of this approach.

However, I don't think there are common parts with pyo3-asyncio, because the two crates don't operate at the same level, as pyo3-asyncio fully relies on runtimes. But both return a Python object or a trait implementation, so nothing should prevent working with both crates at the same time.

Actually, I did not know about this #[pyclass] issue, and that's indeed unfortunate. Funnily enough, pyo3-asyncio has the same issue, but on the Python-to-Rust side, while my crate is affected on the Rust-to-Python side.
Well, the answers to your post on the Python forum are interesting, particularly the Cython hack. It may indeed be an option for the PyO3 coroutine implementation (but I'm still hoping that CPython will fix the coroutine ABI in the future).

For the macro part, I hesitated between a single #[pyasync] macro decorating both PyO3 #[pyfunction] and #[pymethods], but in the end I preferred re-wrapping them. I'm not really coupled to their implementation though; the only thing I need to know is whether there is a #[pyo3(name = "...")] attribute, because PyO3 doesn't support having two of them.
In fact, another design decision was whether to generate an additional function returning the coroutine (to be decorated with #[pyfunction], or placed inside the #[pymethods] impl) while keeping the original async one, or to modify the function in place. Because I don't like modifying code in macros, I chose the first solution, but the generated function then needs a #[pyo3(name = "...")] attribute to match the original (if it doesn't already have one); see the sketch below.
Also, the goal was to showcase a possible integration in PyO3, so re-wrapping was more suitable. But a #[pyasync] thing would be ok for me too.
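
For illustration, here is roughly what that generated pair could look like (an illustrative sketch, not the actual macro output; fetch/fetch_py are made-up names, reusing the asyncio::Coroutine::from_future constructor shown earlier):

use pyo3::prelude::*;
use pyo3_async::asyncio;

// The original async fn is kept untouched...
async fn fetch(url: String) -> PyResult<String> {
    Ok(url)
}

// ...and an additional #[pyfunction] returning the coroutine is generated,
// with #[pyo3(name = "...")] so Python still sees the original name.
#[pyfunction]
#[pyo3(name = "fetch")]
fn fetch_py(url: String) -> asyncio::Coroutine {
    asyncio::Coroutine::from_future(fetch(url))
}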

@davidhewitt
Member Author

Yes, if Cython has set a precedent with its Coroutine types, it seems reasonable to me for us to do something similar. @adamreichold and I have talked a lot about this kind of problem in the past, see #3073. At the time we were leaning towards a pyo3_runtime package, but maybe there are strengths to both approaches for different parts of the problem.

Maybe if we can sketch out a summary of all the pieces that we'd need to add to PyO3 to support async fn, and also what the limitations of our support would be, we could make a decision on whether we now have enough design knowledge to be confident in adding this feature.

Main concerns I'd have would be on:

  • interactions between the GIL and async code
  • any performance limitations which we know about in this design
  • how to ensure PyO3 core remains async runtime agnostic
  • how applications already using pyo3-asyncio can build upon PyO3's async fn

If it all fits together well, maybe with this latest development we get to a point where async fn sugar is supported in PyO3 and pyo3-asyncio can build upon that to add advanced helpers and async runtime integrations? (And we also perhaps bring pyo3-asyncio under the pyo3 org, if @awestlake87 wants that.)

@wyfo would you be interested in collecting that together, as this is freshest in your mind? (And if we think it makes sense, implementing? 👀)

@wyfo
Contributor

wyfo commented Oct 15, 2023

After further reflection on this #[pyclass] issue, I've changed my mind quite a bit. Actually, several libraries define their own exceptions, so why not consider PanicException a per-library-defined exception that should be exported with the rest of the Python objects? I will open a new issue to discuss this particular topic.

Regarding coroutines, what would be the issue with having one coroutine type per extension? (There is this https://github.com/python/cpython/blob/84b7e9e3fa67fb9b92088d17839d8235f1cec62e/Lib/asyncio/coroutines.py#L38, but it's not really an issue.) Coroutines are only meant to be awaited, not used with == or isinstance, and you can still use isinstance(..., collections.abc.Coroutine) or asyncio.iscoroutine (which uses isinstance), or even inspect.isawaitable. So I quite agree with PetrViktorin.
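
For example, such duck-typed checks should keep working whatever the concrete extension type is. A small sketch, reusing the hypothetical example.Sender class from the benchmark above and assuming its coroutine type implements the full coroutine protocol:

import collections.abc
import inspect
import example  # the extension module from the earlier benchmark

c = example.Sender(1).send_async()
assert isinstance(c, collections.abc.Coroutine)
assert inspect.isawaitable(c)
c.close()  # close it, since we never actually await it here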

About async fn support:

  • The Python async ecosystem is not standardized, unlike Rust's with its Future trait. There are multiple async backends with incompatible implementations; asyncio is of course the main one, being in the standard library, but there is also trio, which is quite popular now, especially with the anyio project, which allows it to be supported in trendy projects like Starlette/FastAPI. Should PyO3 support only asyncio, or multiple backends (with asyncio as the default, of course)?
    In pyo3-async, I've chosen to support both asyncio and trio, as well as a lazy specialized implementation using sniffio, following the anyio trend. (I did not find a way to support curio, and didn't look at other backends.)
  • Python coroutines are similar to Rust futures; both are suspendable computations. With asyncio, a Rust Future can "simply" be wrapped into a Python awaitable, which yields asyncio.Future instances created on demand and uses asyncio.Future.set_result as the wake method for a custom std::task::Waker. (That's what is implemented in pyo3-async.)
    However, because Python async backends are single-threaded, the wake operation must use a thread-safe method, e.g. call_soon_threadsafe. (I've added a quick optimization in pyo3-async: I store the thread id at future creation and check the current thread id in the wake method to determine whether call_soon_threadsafe must be called; see the sketch after this list.)
    The direct binding between Python and Rust async execution is necessary, at least to get decent performance, because it allows synchronous execution of futures which are already ready (see async fn tracking issue #1632 (comment)).
  • There is still an important difference between the Rust and Python async worlds: cancellation. In Python, cancellation is an exception that can be caught and even ignored, and there can be asynchronous cleanup, while in Rust you can only drop your future and clean up with synchronous Drop. That's why Python async cancellation is not equivalent to just dropping a Rust future. A good design should give a way to properly handle Python cancellation.
    In pyo3-async, the coroutine wrapper has an optional throw callback, which allows reacting to the cancellation exception (for example using an mpsc::UnboundedSender). This is the best design I've found, because it is simple and not tied to a particular library.
  • A Python awaitable doesn't have to be a coroutine; it just needs to implement __await__ (i.e. to be awaitable). However, in pyo3-async I've chosen to implement the coroutine protocol directly (which implies the awaitable one) because async backends like asyncio or trio don't support raw awaitables (asyncio automatically wraps them in a dummy coroutine), so it seemed to me that this should be more performant, but I've not tested it. It's also possible to implement the generator protocol without implementing __await__ by using the types.coroutine wrapper, but I suspect this call could have a performance impact; I still have to test it.
  • With the direct binding between Rust and Python asynchronous execution (see the second point), the Rust Future is always polled with the GIL acquired. However, to be embedded in a Python object, the future must be Send + 'static, so it's not possible for it to capture pyo3::marker::Python or any &pyo3::types::PyAny (#[pyfunction] parameters are thus also affected), nor to hold them across an await boundary. However, because the GIL is held, Python::with_gil is only one thread-local check, so it seems to be an acceptable safety cost, with the unsafe Python::assume_gil_acquired as the zero-cost alternative.
    By the way, having the GIL held while polling the Future allows it to be used to convert the Future result into a Python type, so the conversion doesn't have to live in the future's code, allowing futures to return a Result<T, E> where T: IntoPy<PyObject>, PyErr: From<E>.
    In pyo3-async, this is materialized by the PyFuture trait and its blanket implementation.
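
As a rough sketch of the waker described in the second bullet above (illustrative only, not pyo3-async's actual code; it assumes the asyncio.Future created for the current poll and its event loop are stored as PyObjects):

use std::sync::Arc;
use std::task::Wake;
use std::thread::ThreadId;

use pyo3::prelude::*;

struct AsyncioWaker {
    future: PyObject,        // the asyncio.Future created for the current poll
    event_loop: PyObject,    // the loop that future belongs to
    origin_thread: ThreadId, // the thread the future was created on
}

impl Wake for AsyncioWaker {
    fn wake(self: Arc<Self>) {
        Python::with_gil(|py| {
            let set_result = self.future.getattr(py, "set_result").unwrap();
            if std::thread::current().id() == self.origin_thread {
                // Already on the event loop thread: no thread-safe call needed.
                set_result.call1(py, (py.None(),)).unwrap();
            } else {
                // Woken from another thread: must go through call_soon_threadsafe.
                self.event_loop
                    .call_method1(py, "call_soon_threadsafe", (set_result, py.None()))
                    .unwrap();
            }
        });
    }
}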

I don't know of any performance limitations of this design for now. It is also completely agnostic of any Rust async runtime, and I don't see a reason to change that. In fact, it may be more efficient to spawn a big future onto a tokio task, have it polled exclusively on the Rust side, and wrap the task handle in a Python coroutine, but this optimization is just a few lines of code and doesn't require first-class support IMO. PyO3 does need to expose the Coroutine wrapper to allow this kind of use case (and cancellation support), though.
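
Those few lines could look something like this (a minimal sketch, assuming the asyncio::Coroutine::from_future constructor and the pyo3_asyncio tokio runtime helper used in the benchmark above; heavy_work is a made-up name):

use pyo3::exceptions::PyRuntimeError;
use pyo3::prelude::*;
use pyo3_async::asyncio;

#[pyfunction]
fn heavy_work() -> asyncio::Coroutine {
    // Spawn the expensive future on the Rust runtime, so tokio polls it...
    let handle = pyo3_asyncio::tokio::get_runtime().spawn(async {
        // ... CPU- or IO-heavy work running off the Python thread ...
        42u32
    });
    // ...and expose only the JoinHandle to Python as a coroutine.
    asyncio::Coroutine::from_future(async move {
        handle
            .await
            .map_err(|e| PyRuntimeError::new_err(e.to_string()))
    })
}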

Regarding pyo3-asyncio, the async fn support/Coroutine wrapper should completely replace future_into_py, while still allowing users to wrap their futures in a tokio/async-std task as mentioned above.

Here is what I can say for now. For the implementation, I would pretty much copy-paste the content of pyo3-async 😇

@davidhewitt
Member Author

Thank you, there's a lot of detail here and it's taken me a little while to chew on it.

I think you're right that the Coroutine type can be an internal part of PyO3, and that it will be what we want async fn in #[pyfunction] and #[pymethods] to return. It's probably ok to have this compiled into each extension as per the Python discourse thread, and if we want to optimize it to be reused across multiple extensions built with the same PyO3 version, that can always come later.

I'm a little nervous around the complexity of all the different Python async backends. I sort of feel like it's worth us supporting asyncio only in PyO3 itself and then leaving other backends for separate crates, but it depends really on how easy it is to define abstractions which let us do that.

As a first step, let's try merging the Coroutine type along with a simple example? I'll learn a lot from reviewing that, I think.

@wyfo
Contributor

wyfo commented Nov 14, 2023

I've written a new POC this weekend based on the current state of #3540, but also #3540 (comment).

Here are the improvements compared to the previous implementation:

  • coroutine __name__/__qualname__
  • warning when a non-awaited coroutine is dropped (same as a Python async def coroutine)
  • generic waker with multiple async backend support (asyncio, trio and anyio (with sniffio))
  • support for &self/&mut self in #[pymethods]
  • #[pyo3(allow_threads)] for async fn (and regular fn for coherence) with a generic parameter in Coroutine::new
  • cancellation support with CancelHandle and #[pyo3(cancel_handle = "param")] (and a generic throw_callback parameter in Coroutine::new if needed)
  • a Rust-awaitable PyFuture as a wrapper around the result of __await__; it can only be awaited in the context of a future polled by a coroutine and doesn't support select! (it panics otherwise); no nightly required
  • like Python coroutines, PyO3 coroutines delegate send/throw/close calls to the currently awaited PyFuture if there is one

I've tested this implementation in a real project, awaiting thousands of Python awaitables per second inside PyO3 coroutines running on asyncio, and it works well. By the way, the performance difference between awaiting a PyFuture and spawning a task as pyo3_asyncio::into_future does is quite large.

My only remaining issue is the default waker implementation for #[pyfunction]. I haven't found a way to make it configurable by the user, so only asyncio is used for now.

I've also added a lot of tests https://github.com/wyfo/pyo3/blob/async_support/tests/test_coroutine.rs, https://github.com/wyfo/pyo3/blob/async_support/tests/test_awaitable.rs and https://github.com/wyfo/pyo3/blob/async_support/src/gil.rs#L930.

@wyfo
Contributor

wyfo commented Nov 23, 2023

@davidhewitt should we keep this issue open until all the planned features are merged?

@alex
Contributor

alex commented Nov 29, 2023

Would it make sense to make this functionality optional/put it behind a cargo feature?

The dependency on futures-util adds quite a few dependencies to the tree, which probably impacts compile time, and I think many users probably don't have a need for it.

@davidhewitt
Member Author

There was discussion in #3540 (comment) where we decided not to feature gate for the moment, but then we did switch from futures-task to futures-util.

I wouldn't be opposed to having the feature if it was strongly wanted.

@wyfo
Contributor

wyfo commented Nov 29, 2023

In fact, futures-util is not really needed.
Here are the uses in the implementation (I'm mixing the current state and my last POC):

  • AtomicWaker and poll_fn for cancellation; poll_fn is trivial to reimplement, and AtomicWaker could simply be replaced by a mutex, as there should be no contention issue (see the sketch below)
  • FutureExt::catch_unwind and CatchUnwind, again trivial to reimplement
  • ArcWake and waker, just a mistake on my part; I didn't remember std::task::Wake, so they can simply be replaced by their std counterparts

I can remove this dependency in the next PR.
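
For instance, a minimal poll_fn replacement could be as small as this (a sketch; std::future::poll_fn has since been stabilized in Rust 1.64, which makes even this unnecessary on newer toolchains):

use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

struct PollFn<F>(F);

impl<T, F> Future for PollFn<F>
where
    F: FnMut(&mut Context<'_>) -> Poll<T> + Unpin,
{
    type Output = T;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<T> {
        // get_mut is allowed because F: Unpin implies PollFn<F>: Unpin.
        (self.get_mut().0)(cx)
    }
}

fn poll_fn<T, F>(f: F) -> PollFn<F>
where
    F: FnMut(&mut Context<'_>) -> Poll<T> + Unpin,
{
    PollFn(f)
}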

@alex
Contributor

alex commented Nov 29, 2023 via email

@awestlake87
Contributor

awestlake87 commented Feb 3, 2024

@wyfo, @davidhewitt Apologies for taking so long to catch up on this thread. I've been taking a look at @wyfo's work in pyo3-async, and it's all positive. I think it's a good approach and clearly has performance benefits over pyo3-asyncio, and I think there's a place for both libraries or a combined approach going forward.

pyo3-async is a great candidate for an async library that can be merged into pyo3, and I think it should be the implementation chosen for async fn in #[pyfunction]. Aside from performance, pyo3-asyncio's implementation is too deeply coupled to the Rust runtimes because of its use of task locals. These serve an important purpose, but are not necessary for all async functions that can be used with Python.

Just a heads up, this'll be a long response! It's taken a while for me to gather my thoughts because this is a complicated subject. I felt like I needed to dig into the code to really understand the differences and the synergies between these libraries, and to refresh my memory on our past design decisions.

Past Discussions

pyo3-async's design reminds me a lot of @ThibaultLemaire's proof-of-concept for Asyncio-driven futures that we discussed at the start of pyo3-asyncio's development. Reading back through that thread, there were other priorities for pyo3-asyncio at the time, and in hindsight I may have been a bit too dismissive. It seems like now we might have some concrete answers to the concerns that I brought up back then:

  • Are there any real-world examples where people do not need a specific runtime for their future?

Yes! If you're writing Rust code that relies purely on Python's awaitables, then there may not be a need to spawn your future onto a runtime like Tokio. pyo3-async can bridge the gaps between the two languages without the need for a Rust-specific runtime.

In addition, if you need a runtime like tokio or async-std for some Rust functionality, you just need a global reference to the runtime, then you can spawn and await the JoinHandle from the pyo3-async function. This is probably good enough in many cases, but not all cases (I'll explain why).

  • Are there any performance wins with this approach? Would they be on the hot-path or cold-path for most use-cases?

Clearly 😅, sounds like pyo3-async has been blowing pyo3-asyncio out of the water performance-wise. The advantages here were not so clear to us at the time, and we were completely focused on compatibility.

As for the hot-path/cold-path, I'm optimistic, but less certain about that. It really depends on the bindings that users need to create. I think we need more users to start using pyo3-async to see how often we can take advantage of these performance wins.

  • Does this provide better interop between Rust Futures and Python Libraries?

Yes, I think so. Converting these awaitables to Futures that can be polled directly inside Rust and vice-versa has some advantages. The main drawback (as I understand it) is that the Rust Future that's polling an awaitable has to be running on the same thread as the Python event loop in order for some Python functionality like get_running_loop to work.

  • Sharing the asyncio executor could have some advantages that I haven't considered yet.

There was some thought at the time that if Rust futures were polled from asyncio, we could potentially support !Send futures. I'm still not so sure this is practical, but it was completely out of the question with pyo3-asyncio's approach. Could be worth looking into again now that we have pyo3-async.

So can pyo3-async replace pyo3-asyncio completely?

There are still some use-cases that users of pyo3-async might struggle with, and in those cases I think the solution will end up looking very similar to what pyo3-asyncio does. The good news is that you don't need to choose one or the other. pyo3-asyncio and pyo3-async can be used to address completely different problems.

After doing some investigation here, I think the main advantage to pyo3-asyncio's approach is in using async Python from tokio/async-std threads.

As a practical example, imagine we are trying to wrap the Tide library for Python. Tide servers listen for connections, then execute some user-defined async operation when it receives a request. Using this library from Python would probably look something like this:

import asyncio
import rust_http

async def handler():
    await asyncio.sleep(1)
    return "hello"

async def main():
    server = rust_http.Server()

    # Setup two routes for this handler to highlight the differences between
    # the libraries
    server.get_with_pyo3_async('/pyo3_async', handler)
    server.get_with_pyo3_asyncio('/pyo3_asyncio', handler)

    await server.run()

asyncio.run(main())

There's some nuance involved when dealing with event loop references that I've highlighted in pyo3-asyncio's docs. Because we're working with Python's threads and async-std's threads, and Python's coroutines often rely on thread-local storage when using asyncio's library, we need to capture the current event loop and context vars in order for our Python handler to behave correctly.

use pyo3::prelude::*;

/// A Python module implemented in Rust.
#[pymodule]
fn rust_http(_py: Python, m: &PyModule) -> PyResult<()> {
    m.add_class::<Server>()?;
    Ok(())
}

#[pyclass]
struct Server {
    app: tide::Server<()>,
}

#[pyo3_async::pymethods]
impl Server {
    #[new]
    fn new() -> Self {
        Self { app: tide::new() }
    }

    async fn run(this: Py<Self>) -> PyResult<()> {
        let app = Python::with_gil(|py| this.as_ref(py).borrow().app.clone());
        app.listen("127.0.0.1:8080").await.unwrap();
        Ok(())
    }

    fn get_with_pyo3_asyncio(&mut self, path: &str, handler: PyObject) -> PyResult<()> {
        // Capture the locals when this function is called.
        let locals =
            Python::with_gil(|py| pyo3_asyncio::async_std::get_current_locals(py)).unwrap();

        self.app.at(path).get(move |_| {
            let handler = handler.clone();
            let locals = locals.clone();
            async move {
                // Pass the locals we captured earlier into a pyo3-asyncio conversion
                //
                // Internally, the coroutine returned by handler is passed to the event loop
                // using call_soon_threadsafe and wrapped in an asyncio Future.
                //
                // That way if it calls something like asyncio.get_running_loop(), it'll be
                // running on a Python thread.
                //
                // It works!
                let rsp = Python::with_gil(|py| {
                    pyo3_asyncio::into_future_with_locals(
                        &locals,
                        handler.as_ref(py).call0().unwrap(),
                    )
                })
                .unwrap()
                .await
                .unwrap();

                let text: String = Python::with_gil(|py| rsp.extract(py)).unwrap();

                Ok(tide::Response::from(text))
            }
        });

        Ok(())
    }

    fn get_with_pyo3_async(&mut self, path: &str, handler: PyObject) -> PyResult<()> {
        self.app.at(path).get(move |_| {
            let handler = handler.clone();
            async move {
                // Wrap our handler's coroutine in an AwaitableWrapper to drive it from Rust
                //
                // The AwaitableWrapper will drive the coroutine directly using __await__ and
                // __next__. This is a bit problematic in this case the polling is happening on one
                // of async-std's threads.
                //
                // When it runs, we'll get the following error:
                //
                // thread 'async-std/runtime' panicked at src\lib.rs:32:18:
                // called `Result::unwrap()` on an `Err` value: RuntimeError('no running event loop')
                let rsp = Python::with_gil(|py| {
                    pyo3_async::asyncio::AwaitableWrapper::new(handler.as_ref(py).call0().unwrap())
                })
                .unwrap()
                .await
                .unwrap();

                let text: String = Python::with_gil(|py| rsp.extract(py)).unwrap();

                Ok(tide::Response::from(text))
            }
        });

        Ok(())
    }
}

I think in this case, moving the coroutine to asyncio's event loop is inevitable. It needs to run on a Python thread, and we need some way of tracking references to the Event Loop and Contextvars. This is usually done behind the scenes with pyo3_asyncio.

Likewise, things like tokio::spawn don't 'just work' in futures driven from Python's threads. You need some way of finding a reference to Tokio's runtime (often through global storage like pyo3-asyncio). pyo3-asyncio does make this more seamless for users, but at the cost of some performance.

So what now?

The main questions I have now are:

  • How often do cases like this come up? Are they edge cases or common?
  • Is pyo3-asyncio trying too hard to hide details like this?
    • Allowing users to encounter these problems quickly may be the best way to teach them how to address them properly
  • How can we help the community solve problems like this when they inevitably come up?
    • Obviously some new docs / guides will be needed
    • Can we provide some suggestions in the error messages?
    • Can our two libraries be tweaked or merged for better interop?
  • How can we help existing users of pyo3-asyncio make the transition? Or do we even need to?
    • I tried creating wrappers for @wyfo's library before coming to the conclusion that they really are a bit too different.
    • Having an official async fn for the proc_macros in pyo3 itself would be a good start I think
    • The best approach is probably just to use both or trim pyo3-asyncio's API down to just the utilities needed to fill the gaps.

Looking forward to hearing your thoughts on this!

@adamreichold
Member

Having an official async fn for the proc_macros in pyo3 itself would be a good start I think

Note that this is underway even if a bit stalled in #3610, #3611, #3612 and #3613.

@XieJiSS

XieJiSS commented Feb 8, 2024

How can we help the community solve problems like this when they inevitably come up

IMO clear docs that can be found easily via the doc site or a search engine are necessary, maybe with some additional examples illustrating common misunderstandings and their recommended solutions?

@wyfo
Contributor

wyfo commented Apr 4, 2024

👋 I'm back! Sorry for the quite long absence; some health issues kept me away from my keyboard. I've recovered now and am ready to finish the feature.
I've seen that 0.21 has been released – I'm quite excited about it – so I have to catch up on the new features, and then I'll be able to get back to #3610. There will be a few conflicts to resolve, I assume 😅

@wyfo
Contributor

wyfo commented Apr 4, 2024

@awestlake87 You're right, pyo3-async and the async support currently implemented in PyO3 are not made to run coroutines outside of the Python thread, and this could indeed be the main job of pyo3-asyncio. Maybe into_future could be renamed to ensure_future, or even spawn, to be more explicit about the fact that it does spawn a task onto the asyncio event loop, and that the returned future doesn't need to be polled (awaited in Rust) for the task to execute.

However, I wonder if future_into_py and the likes of pyo3_asyncio::async_std::main are still relevant with PyO3's async support, especially once pyo3::coroutine::Coroutine is exposed; in fact, I think Coroutine will completely replace future_into_py.

By the way, AwaitableWrapper will not be added to PyO3, for the reason you spotted. Instead, you can take a look at the PyFuture type which is already planned in #3611.

@wyfo
Contributor

wyfo commented Apr 7, 2024

As usual, I like writing POCs on the weekend, and #4057 is my latest one. It comes from my previous comment about the currently planned async support not allowing Python awaitables to be awaited in arbitrary threads. This draft PR provides a PyFuture(PyAny) wrapper around a Python future-like object, either an asyncio or a concurrent.futures one. It allows converting the PyFuture into a Rust std::future::Future (roughly the same way pyo3-asyncio or pyo3-async do). To handle an arbitrary awaitable, it requires calling asyncio.ensure_future first, but I think it's better to have that explicit, as it avoids the pyo3-asyncio issue. With this, I think the async support in PyO3 will be complete.

I know the name PyFuture echoes #3611, but as written in a comment there, that name was not well suited. I think I have to rename it, but I don't have any good ideas. I've written a post on the Python forum to ask for help with the naming.

@wyfo
Contributor

wyfo commented Apr 8, 2024

Maybe the best solution is simply not to expose the type in #3611, but only an async function named await_in_coroutine or something like that. Named types are less and less useful now that we have async traits.

Actually, I'm starting to like this await_in_coroutine function more than a type, because it's more explicit about the fact that the returned future can only be awaited in a coroutine, and because it can be defined in the coroutine module. I will update the PR accordingly. We can still go back to a type if that is preferred.

@davidhewitt
Member Author

Welcome back, sorry to hear you were suffering and glad you are better. I see you opened a few PRs already; with apologies, I'm a bit busy this week, so I may be slow to review them. I intend to be back at the keyboard regularly soon!

@wyfo
Contributor

wyfo commented Apr 17, 2024

Something I didn't think about is accepting IntoFuture instead of Future in the coroutine constructor. ~~into_future would be called in __await__.~~
Of course it would have to wait for the MSRV bump to 1.64, but this is definitely something we can/need to implement.
I was too enthusiastic and didn't realize the cost it would have, so bad idea. Once the Coroutine constructor is exposed, it will be possible to do this by implementing __await__ and instantiating the coroutine there.

@MasterDingo

Hello. I'm not a very experienced Rust developer; I've worked with pyo3 a bit and never with pyo3-asyncio.
I have a compilation error:

error[E0432]: unresolved import `pyo3::coroutine`
 --> src\main.rs:8:11
  |
8 | use pyo3::coroutine::Coroutine;
  |           ^^^^^^^^^ could not find `coroutine` in `pyo3`

These are my dependencies:

[dependencies]
pyo3 = "0.21.2"
tokio = "1.37.0"

Command "cargo add pyo3-asyncio --features tokio" ends up with this:

error: failed to select a version for `pyo3-ffi`.
    ... required by package `pyo3 v0.20.0`
    ... which satisfies dependency `pyo3 = "^0.20"` of package `pyo3-asyncio v0.20.0`
    ... which satisfies dependency `pyo3-asyncio = "^0.20.0"` of package `devita v0.1.0 (F:\work\rust-tests\devita\1\devita)`
versions that meet the requirements `=0.20.0` are: 0.20.0

the package `pyo3-ffi` links to the native library `python`, but it conflicts with a previous package which links to `python` as well:
package `pyo3-ffi v0.21.2`
    ... which satisfies dependency `pyo3-ffi = "=0.21.2"` of package `pyo3 v0.21.2`
    ... which satisfies dependency `pyo3 = "^0.21.2"` of package `devita v0.1.0 (F:\work\rust-tests\devita\1\devita)`
Only one package in the dependency graph may specify the same links value. This helps ensure that only one copy of a native library is linked in the final binary. Try to adjust your dependencies so that only one package uses the `links = "python"` value. For more information, see https://doc.rust-lang.org/cargo/reference/resolver.html#links.

failed to select a version for `pyo3-ffi` which could resolve this conflict

What am I doing wrong?

@wyfo
Contributor

wyfo commented Apr 17, 2024

What am I doing wrong?

You need to add the experimental-async feature to pyo3 in your Cargo.toml.
Replace pyo3 = "0.21.2" with pyo3 = { version = "0.21.2", features = ["experimental-async"] }
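
For example, the dependencies section above would then read (just a sketch of that single change):

[dependencies]
pyo3 = { version = "0.21.2", features = ["experimental-async"] }
tokio = "1.37.0"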

@MasterDingo

MasterDingo commented Apr 18, 2024

Thank you very much! That's really what I missed.
But now I have another question. If I embed a Python interpreter in my application, how do I expose Rust async functions (or Futures) into it? Say, as an argument for py.run_bound.

@avp1598

avp1598 commented Apr 18, 2024

I'm getting this error when I try to use async functions which use some tokio spawn methods:

there is no reactor running, must be called from the context of a Tokio 1.x runtime

I am trying to call tokio::task::spawn(future);

How do I configure tokio with this experimental async feature?

@wyfo
Contributor

wyfo commented Apr 18, 2024

But now I have another question. If I embed a Python interpreter in my application, how do I expose Rust async functions (or Futures) into it? Say, as an argument for py.run_bound.

Async pyfunctions are just normal pyfunctions (only their return type is converted to a coroutine). So you can expose an async pyfunction the same way you expose a sync one, using wrap_pyfunction! for example.
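
A minimal sketch (assuming PyO3 with the experimental-async feature; add_later and my_module are made-up names), showing that the async function is registered exactly like a sync one:

use pyo3::prelude::*;

#[pyfunction]
async fn add_later(a: u32, b: u32) -> PyResult<u32> {
    // Any .await is allowed here; the result is converted as usual.
    Ok(a + b)
}

#[pymodule]
fn my_module(_py: Python<'_>, m: &PyModule) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(add_later, m)?)?;
    Ok(())
}

From Python, my_module.add_later(1, 2) then returns a coroutine that can be awaited, e.g. with asyncio.run.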

@wyfo
Contributor

wyfo commented Apr 18, 2024

how do i configure tokio with this experimental async feature?

See this example from the original POC. You have to manually declare a runtime, like this:

fn tokio() -> &'static tokio::runtime::Runtime {
    use std::sync::OnceLock;
    static RT: OnceLock<tokio::runtime::Runtime> = OnceLock::new();
    RT.get_or_init(|| tokio::runtime::Runtime::new().unwrap())
}

and then you can use it, for example tokio().spawn(...).
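
For example (a sketch; do_work is a made-up name), you can either spawn directly on that runtime, or enter its context so that library code calling tokio::task::spawn finds a running runtime:

async fn do_work() -> pyo3::PyResult<()> {
    // Spawn directly on the global runtime declared above:
    let handle = tokio().spawn(async { 1 + 1 });
    // Or enter the runtime context for code that calls tokio::spawn internally
    // (don't hold the guard across an .await):
    {
        let _guard = tokio().enter();
        // ... synchronous calls into crates that internally use tokio::spawn ...
    }
    let _ = handle.await;
    Ok(())
}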

@avp1598

avp1598 commented Apr 18, 2024

The thing is that I am using a crate which is internally calling tokio::task::spawn(future).

@wyfo
Contributor

wyfo commented Apr 18, 2024

@MasterDingo

MasterDingo commented Apr 18, 2024

Async pyfunctions are just normal pyfunctions (only their return type is converted to a coroutine). So you can expose an async pyfunction the same way you expose a sync one, using wrap_pyfunction! for example.

Is there any way to create a PyCFunction from an async block? PyCFunction::new_closure_bound doesn't seem to be able to do this.

@wyfo
Contributor

wyfo commented Apr 18, 2024

Is there any way to create a PyCFunction from an async block? PyCFunction::new_closure_bound doesn't seem to be able to do this.

See #3613. Once the coroutine constructor is exposed, you will be able to return a coroutine from any function/closure, hence also from PyCFunction::new_closure_bound.
But #3610, #3611 and #3612 need to be merged before that, as well as #4057 to complete the async support.

@Edgeworth

FWIW I experienced deadlocks when combining pyo3 async support with tonic. I was running blocks that required tokio (calls into tonic) using the OnceLock of a tokio Runtime suggested elsewhere in the thread. Unfortunately it's difficult for me to provide the code that ends up in this deadlock, but the deadlocks went away when I switched my pyo3 functions back to non-async and just used block_on for any awaits. I also never experienced deadlocks using pyo3-asyncio's tokio::run function which I was using previously.

Anyway, I know that isn't anything to go on, and it may be that my code is incorrect somehow, but just want to record this here in case someone else hits the same issue.

@wyfo
Contributor

wyfo commented Apr 23, 2024

Actually, I've already experienced a deadlock in one of my experiments, see #3540 (comment).
As written in my next comment there, the solution was simply to release the GIL in the async function – the feature is already implemented, but not yet merged, see #3610.
Could you try building your code with that branch, adding the #[pyo3(allow_threads)] attribute where needed, and see if the deadlock still occurs?

I think the difference from pyo3-asyncio that causes the deadlocks is the fact that the Waker implementation needs to acquire the GIL, but it may be called from an arbitrary thread with arbitrary locks already acquired.
We may need to document this risk better.
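
As a sketch of the suggestion above (using the attribute form described earlier in this thread for the not-yet-merged #3610 branch; call_backend is a made-up name and the final syntax may differ), the attribute releases the GIL while the future is polled:

use pyo3::prelude::*;

#[pyfunction]
#[pyo3(allow_threads)]
async fn call_backend() -> PyResult<()> {
    // tokio/tonic work that must not run while the GIL is held,
    // so that wakers on other threads can acquire it freely.
    Ok(())
}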

@wyfo
Contributor

wyfo commented Apr 23, 2024

Anyway, I've discovered today the existence of https://github.com/mozilla/uniffi-rs, and was quite surprised to see it already supports converting a Rust future into a Python awaitable.
I've looked a bit into the code; their way of doing it is a little different from the one we chose in PyO3, so it could be interesting to do some comparisons.

@wyfo
Contributor

wyfo commented Apr 24, 2024

Quite interestingly, at the beginning uniffi was not mapping a Rust future to a Python coroutine but directly to a Python future; their waker implementation scheduled the Rust future's poll on the Python event loop, and set the Python future's result when the Rust future was ready. They then changed their approach to better handle cancellation, and now it is quite similar to what we do, as they return a coroutine yielding a Python future for each Rust future poll. The main difference is that they use a generated Python coroutine, while we use a pyclass (not taking into account PyO3's additional features like #[pyo3(allow_threads)], await_in_coroutine, manual coroutine instantiation, etc.).

Anyway, I find their first approach quite interesting, and think it should perform better than creating a new future for each poll. I will maybe try to implement it and run some benchmarks to see – I still don't think it will be worth adding this much complexity to the code, but I like making POCs.

@davidhewitt
Member Author

Interesting indeed. I think uniffi also uses PyO3 so if we are close to their design they might be pleased to be able to reuse code.

@wyfo I'm keen to review the async PRs again ASAP, I hope to be more available from tomorrow. Which one should I be starting with? I got a bit lost given the time gap; it would indeed be nice to move this further along for 0.22

Also have you seen #4064? I might take a look at that, I guess our protocol methods may need some handling...

@wyfo
Contributor

wyfo commented May 6, 2024

There is no mention of PyO3 in uniffi code, so I don't think design convergence matters. By the way, as I said, our design is now quite similar. I'm still interested in their first (incomplete) design, but that's pure curiosity.

The first PR to review is #3610, then #3611, #3612 and #3613. There is also the #4057 draft, which could be interesting to complete the support. There are conflicts with recent PRs, so I need to resolve them tonight.

I hadn't seen #4064. I admit I don't know this part of the code at all (for example, I wasn't able to solve #4113 myself; fortunately @Icxolu was there to help), so I don't know if I will be able to solve the issue myself. I don't really understand why the code is so different between pyfunction/pymethods and magic methods like __anext__, but I assume we should port all the stuff we implemented for async pyfunctions (cancellation, allow_threads, the receiver special case) there.
I think we should turn #4064 into an issue, and ask an experienced maintainer to take it.
