Support for io_uring #1591

Open
Noah-Kennedy opened this issue Jul 8, 2022 · 13 comments

Comments

@Noah-Kennedy

Adding io_uring support here would make it significantly easier to get io_uring support in tokio.

io_uring supports both readiness-based and completion-based APIs. Readiness-based APIs should be relatively simple. Completion-based support is more complicated.

@Noah-Kennedy
Author

@Thomasdezeeuw I think that this is potentially something we need to figure out before we lock in our APIs for 1.0.

@Thomasdezeeuw
Collaborator

@Noah-Kennedy I was thinking Mio v1 should remain poll-based, i.e. the current implementation. Mio v2 would be completion-based, seeing how that is now supported by both Linux and Windows. Hopefully macOS and the BSDs will support something similar (or, even better, the same API).

@notgull

notgull commented Jul 9, 2022

I have to wonder if tokio could use alternate runtimes on a feature flag. As in, by default it uses mio, but if the io_uring feature flag is enabled it replaces it with an io_uring-based runtime?

@Noah-Kennedy
Author

One of the key things to bring up around io_uring is that it can do both readiness-based and completion-based IO, and there are benefits to how io_uring does readiness-based IO over how epoll does it. For this reason I don't think we need to drop support for poll-based IO in order to support completion-based IO. We can instead implement completion-based IO as a sort of extension API, and I'm trying to work out how to do that.

@Noah-Kennedy
Author

@notgull I've thought about that and discussed it with others, and it is an option, but I would much rather bake io_uring support into mio to make this whole process somewhat easier to manage.

@Thomasdezeeuw
Collaborator

> One of the key things to bring up around io_uring is that it can actually do both readiness-based and completion-based IO, and there are actually benefits to how io_uring does readiness-based IO over how epoll does it. For this reason I don't think we really need to drop support for poll-based IO in order to support completion-based IO, we can merely find a way to implement completion-based IO as a sort of extension API, which is what I'm trying to think about how to do.

I'm a little hesitant to support io_uring in v1 because I don't really fancy supporting both epoll and io_uring implementations at the same time, but dropping epoll is not an option (due to backwards compatibility). Furthermore, I don't think io_uring supports all fd types that epoll supports, at least not in earlier versions (maybe it has caught up now, I don't know). This means we need to either make io_uring optional (a feature) in v1, or detect errors and fall back to epoll.
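To make the "detect errors and fall back" option concrete, here is a minimal sketch. Everything in it is hypothetical: `Backend`, `select_backend`, and the probe closure are illustration names, not Mio APIs, and treating `ErrorKind::Unsupported` as the "kernel lacks io_uring" signal is an assumption.

```rust
use std::io;

/// Hypothetical backend choice; not part of Mio's API.
#[derive(Debug, PartialEq, Eq)]
pub enum Backend {
    IoUring,
    Epoll,
}

/// Picks io_uring when the probe succeeds and falls back to epoll when the
/// probe reports that io_uring is unsupported. Any other error is passed up
/// to the caller unchanged.
///
/// `probe_io_uring` is a stand-in for whatever setup call a real
/// implementation would attempt first.
pub fn select_backend<F>(probe_io_uring: F) -> io::Result<Backend>
where
    F: FnOnce() -> io::Result<()>,
{
    match probe_io_uring() {
        Ok(()) => Ok(Backend::IoUring),
        // Assumption: the probe maps "kernel has no io_uring" to Unsupported.
        Err(err) if err.kind() == io::ErrorKind::Unsupported => Ok(Backend::Epoll),
        Err(err) => Err(err),
    }
}
```

The same shape would also work behind a feature flag: the probe would simply never run when the feature is disabled.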

@Darksonn
Contributor

Darksonn commented Jul 11, 2022

One concern that I have been wondering about is that completion-based APIs behave noticeably differently when it comes to errors. Any AsyncRead/AsyncWrite implementation would need to immediately report that the write has succeeded, and then return the error on a future write if it failed. For this reason it seems to me that we would need to continue using a readiness-based API indefinitely for Tokio net types such as TcpStream.

I would like to add here that we also expose explicitly readiness-based APIs such as AsyncFd or the TcpStream::{readable,writable,try_read,try_write,try_io} methods. Unless it becomes possible to submit "readiness operations" to io_uring, it appears to me that these APIs must necessarily continue to use epoll.

None of this is a concern for Tokio file types as they already have completion-based behavior today.

@Thomasdezeeuw
Collaborator

> One concern that I have been wondering about is that completion-based APIs behave noticeably differently when it comes to errors. Any AsyncRead/AsyncWrite implementation would need to immediately report that the write has succeeded, and then return the error on a future write if it failed. For this reason it seems to me that we would need to continue using a readiness-based API indefinitely for Tokio net types such as TcpStream.

I don't think this is really a concern, mainly because AsyncRead/AsyncWrite won't work at all for a completion-based design. For example, in a read call, how do we ensure that the buffer (&mut [u8]) stays alive long enough for the OS to write into it? How do we deal with early drops of the Future? I think we'll need a completely new set of traits for completion-based I/O.
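As a rough illustration of what such a trait could look like, the sketch below passes the buffer by value so the operation owns it until completion, then hands it back with the result. `OwnedRead` and `MemorySource` are hypothetical names, not existing Mio or Tokio APIs, and the call is synchronous only to keep the example self-contained; a real design would presumably return a future.

```rust
use std::io;

/// Hypothetical completion-style read trait. The buffer moves into the
/// operation, so it cannot be freed or reused while the read is in progress,
/// and it is returned together with the result.
pub trait OwnedRead {
    fn read_owned(&mut self, buf: Vec<u8>) -> (io::Result<usize>, Vec<u8>);
}

/// In-memory stand-in for a kernel-backed source, used only for illustration.
pub struct MemorySource {
    data: Vec<u8>,
    pos: usize,
}

impl MemorySource {
    pub fn new(data: Vec<u8>) -> Self {
        Self { data, pos: 0 }
    }
}

impl OwnedRead for MemorySource {
    fn read_owned(&mut self, mut buf: Vec<u8>) -> (io::Result<usize>, Vec<u8>) {
        // Copy as many bytes as fit into the buffer, then advance the cursor.
        let remaining = &self.data[self.pos..];
        let n = remaining.len().min(buf.len());
        buf[..n].copy_from_slice(&remaining[..n]);
        self.pos += n;
        (Ok(n), buf)
    }
}
```

Because the caller gets the buffer back on every call, it can be reused for the next read without a fresh allocation.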

> I would like to add here that we also expose explicitly readiness-based APIs such as AsyncFd or the TcpStream::{readable,writable,try_read,try_write,try_io} methods. Unless it becomes possible to submit "readiness operations" to io_uring, it appears to me that these APIs must necessarily continue to use epoll.
>
> None of this is a concern for Tokio file types as they already have completion-based behavior today.

@Noah-Kennedy
Author

I'm in agreement with @Thomasdezeeuw regarding the traits.

@Noah-Kennedy
Author

Noah-Kennedy commented Jul 11, 2022

@Thomasdezeeuw uring, like epoll, supports any pollable file descriptors for polling with IORING_OP_POLL_ADD.

@Darksonn
Contributor

> uring, like epoll, supports any pollable file descriptors for polling with IORING_OP_POLL_ADD.

I was not aware of this. In that case, I imagine that you could implement AsyncRead/AsyncWrite by using that to wait for readiness, then perform the actual read with the same non-blocking syscall as we do today. However, this does not seem like it would be an improvement over just continuing to use epoll.
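The wait-then-read pattern described here can be sketched as follows. `read_when_ready` and `FlakySource` are hypothetical names, and the `wait_readable` closure is only a stand-in for submitting IORING_OP_POLL_ADD (or calling epoll) and waiting for the event; no real io_uring calls are made.

```rust
use std::io::{self, Read};

/// Tries an ordinary non-blocking read; when the source reports WouldBlock,
/// waits for readiness through the caller-supplied hook and retries.
pub fn read_when_ready<R, W>(
    source: &mut R,
    buf: &mut [u8],
    mut wait_readable: W,
) -> io::Result<usize>
where
    R: Read,
    W: FnMut() -> io::Result<()>,
{
    loop {
        match source.read(buf) {
            // Not ready yet: block in the readiness hook, then try again.
            Err(err) if err.kind() == io::ErrorKind::WouldBlock => wait_readable()?,
            // Success or any other error is returned as-is.
            other => return other,
        }
    }
}

/// Illustration-only source that reports WouldBlock once, then yields data.
pub struct FlakySource {
    pub blocked_once: bool,
    pub data: &'static [u8],
}

impl Read for FlakySource {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if !self.blocked_once {
            self.blocked_once = true;
            return Err(io::ErrorKind::WouldBlock.into());
        }
        self.data.read(buf)
    }
}
```

The read itself is unchanged from the epoll case; only the readiness hook would differ between backends.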

> I don't think this is really a concern, mainly because AsyncRead/AsyncWrite won't work at all for a completion based design.

I mean, Tokio uses mio to implement types that implement those traits, so regardless of what mio uses, there needs to be some way to implement the traits using it.

I note that you can use the traits with io_uring if you copy the data into a buffer owned by the IO resource. This wouldn't be good for Tokio's TcpStream, but it would be an improvement to implement Tokio files in that manner.
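A minimal sketch of that copying approach, which also shows the deferred error reporting mentioned earlier in the thread. `OwnedBufWriter` and its `complete` method are hypothetical; `complete` stands in for the kernel reporting the outcome of the submitted write, and returning WouldBlock while a write is in flight is an assumption made for the example.

```rust
use std::io;

/// Hypothetical writer that copies caller data into a buffer it owns, so the
/// caller's slice does not need to outlive the operation.
pub struct OwnedBufWriter {
    in_flight: Option<Vec<u8>>,
    deferred_error: Option<io::Error>,
}

impl OwnedBufWriter {
    pub fn new() -> Self {
        Self { in_flight: None, deferred_error: None }
    }

    /// Copies `data` and reports success immediately. An error recorded by an
    /// earlier completion is returned here instead, on the next write.
    pub fn write(&mut self, data: &[u8]) -> io::Result<usize> {
        if let Some(err) = self.deferred_error.take() {
            return Err(err);
        }
        if self.in_flight.is_some() {
            // Assumption: only one write may be outstanding at a time.
            return Err(io::ErrorKind::WouldBlock.into());
        }
        self.in_flight = Some(data.to_vec());
        Ok(data.len())
    }

    /// Stand-in for the completion event of the in-flight write.
    pub fn complete(&mut self, result: io::Result<usize>) {
        self.in_flight = None;
        if let Err(err) = result {
            self.deferred_error = Some(err);
        }
    }
}
```

The extra copy is the cost that makes this unattractive for TcpStream, while the late error is the behavioral difference raised above.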

@Noah-Kennedy
Author

@Darksonn my thought with the polling support is that it could be used to have uring replace epoll when a feature or runtime flag is set. For the readiness-based APIs, we could use the polling APIs within uring. For completion-based APIs, we would use the normal, completion-based features of io_uring.

@Thomasdezeeuw
Collaborator

I've been working on io_uring in a different repo: https://github.com/Thomasdezeeuw/a10. Maybe it can become Mio v2, or maybe it should stay separate, as it doesn't support anything other than Linux at the moment.
