Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Vectored Writes #3135

Closed
seanmonstar opened this issue Nov 12, 2020 · 10 comments
Closed

RFC: Vectored Writes #3135

seanmonstar opened this issue Nov 12, 2020 · 10 comments
Labels
A-tokio Area: The main tokio crate C-proposal Category: a proposal and request for comments M-io Module: tokio/io M-net Module: tokio/net

Comments

@seanmonstar
Copy link
Member

This provides a proposal for providing generic vectored writes in Tokio.

Background

writev

Vectored writes are the ability to write multiple, non-contiguous buffers as a single "atomic" write. This helps reduce syscalls and memory copies, since data to be written can frequently be in different buffers.

A simple example is writing a length-delimited protocol where the length goes before the message. Once the message contents are put into a buffer, an encoder needs to write the length to the transport first. You could do that with 2 write calls, one for the length, and then one for the message. Or you could copy both the length and the message into a single buffer, to only have 1 write call, but that gets more expensive the bigger the message is. Vectored writes solves both issues:

let len_buf = encode_len(msg_buf.len());

transport.write_vectored(&[len_buf, msg_buf]).await?;

Previous Design

trait AsyncWrite {
    fn poll_write_buf<B: Buf>(
        self: Pin<&mut Self>,
        cx: &mut task::Context<'_>,
        buf: &mut B,
    ) -> Poll<io::Result<usize>> {
        // default to `poll_write`
    }
}

Problems

  • Implementors may forget to provide poll_write_buf even if they could
  • The default poll_write_buf is much worse than just merging buffers and using poll_write
  • This results in the need for a "dynamic write buf queue", which tries to detect if writev is used, and if it isn't, converts the buf queue into a contiguous buffer. Requiring protocol implementors to remember to do this is a big gotcha.
  • poll_write_buf took a generic, B: Buf, which made it force dyn AsyncWrite trait objects to always use the default "slow" method, instead of being able to forward on.

Options

There seem to be two main options for providing a generic vectored write solution. This proposal explores both of them, and then provides a recommendation afterwards.

1. As Part of AsyncWrite

The way the standard library supports this for blocking writes is with additional methods on Write. This option follows that example:

trait AsyncWrite {
    // existing methods ...
    // fn poll_write()

    /// Do a non-blocking vectored write.
    ///
    /// A default implementation exists so that implementors that do not
    /// have vectored write support do not need to repeat this boilerplate.
    fn poll_write_vectored(
        self: Pin<&mut Self>,
        cx: &mut task::Context<'_>,
        bufs: &[IoSlice<'_>],
    ) -> Poll<io::Result<usize>> {
        self.poll_write(cx, &bufs[0])
    }

    /// Detect if this type generally supports vectored writes.
    ///
    /// By default, this returns false, and implementors can override to return
    /// true if they also override `poll_write_vectored`.
    fn is_write_vectored(&self) -> bool {
        false
    }
}

Transports that do no support vectored writes do not need to do anything, as there is a default implementation that forwards to poll_write. To deal with the problem of the default forwarding being very slow, an additional is_write_vectored method can be checked. A user can ask an AsyncWrite if it supports vectored writes, and if not, consider using a single flat buffer and only calling poll_write.

This doesn't solve the issue of implementors forgetting to override the default method when creating wrappers. The suggested fix here is just to pay better attention 🤷.

2. A Separate AsyncVectoredWrite trait

It has been suggested that since not all transports can support vectored writes, only those that do should implement a separate trait.

trait AsyncVectoredWrite {
    fn poll_write_vectored(
        self: Pin<&mut Self>,
        cx: &mut task::Context<'_>,
        bufs: &[IoSlice<'_>],
    ) -> Poll<io::Result<usize>>;
}

At first glance, this feels really nice. We can get static verification of the capabilities of a transport. There are some parallels to Iterator, and how ExactSizeIterator can be indicate a type that knows its exact size, or how DoubleEndedIterator can also iterate from the back.

However, to be really elegant, this would option would require trait specialization. This is similar to how the standard library can be generic over Iterator, and optimize/specialize when the type also implements ExactSizeIterator.

Specialization

Transports that could implement AsyncVectoredWrite likely could also implement AsyncWrite. Libraries that want to be generic over a transport would currently need to decide either to only support AsyncWrite, or require all transports to implement AsyncVectoredWrite.

In some (distant) future, we'd ideally be able to write something like this:

impl Http<T: AsyncWrite> {
    pub fn new(transport: T) -> Self;
}

// internally
fn write_spec<T: SpecAsyncWrite, B: Buf>(transport: T, buf: B) {}

impl<T: AsyncWrite> SpecAsyncWrite for T {
    // normal writes
}

impl<T: AsyncVectoredWrite> SpecAsyncWrite for T {
    // writev, vroom!
}

Alas, that is not currently stable, and won't be for some time.

Dynamic Support

Besides the lack of specialization, there are also occassions when a transport is a composite type that doesn't know if it supports vectored writes at compilation time. Consider the following example:

pub enum MaybeTlsStream {
    /// TcpStreams have writev support.
    PlainText(TcpStream),
    /// Many TLS implementations can't make good use of vectored writes,
    /// since they need wrap bytes in a TLS record and encrypt them.
    Encrypted(TlsStream<TcpStream>),
}

If the library author wanted to use this type in a place that required AsyncVectoredWrite, they could decide to implement AsyncVectoredWrite in a fashion that forwards to writev on the plaintext variant, and does the "slow" default similar to the libstd option. But that has the same problem as the original design.

BufWriter

For now, a library author that would want to try to use vectored writes would need to require T: AsyncVectoredWrite, and users that have transports that don't implement it could pass it into a simple BufWriter wrapper. This would likely be a better option than forwarding just 1 buffer at a time.

pub struct BufWriter<T: AsyncWrite> {
    inner: T,
    buf: Cursor<Vec<u8>>,
}

impl<T: AsyncWrite> AsyncVectoredWrite for BufWriter<T> {
    // In `poll_write_vectored`, all the buffers would be copied into
    // `self.buf`, and then it would be flushed in `AsyncWrite::poll_flush`.
}

Applying this to the MaybeTlsStream above, a library author could extend it like so:

pub enum MaybeTlsStream {
    PlainText(TcpStream),
    Encrypted(BufWriter<TlsStream<TcpStream>>),
}

A downside to this approach is that combining more and more composite types can easily end up using multiple layers of BufWriter, especially when some layers are also trying to be generic over either AsyncWrite or AsyncVectoredWrite.

Recommendation

This proposal recommends option 1, adding methods to AsyncWrite.

  • This option is easier for library authors to be generic over transports, versus supporting both AsyncWrite and AsyncVectoredWrite.
  • With is_vectored_write, composite types can forward onto their inner types, and thus intermediate buffering layers are not required.
  • Trait specialization would make the separate trait very enticing, but with its stabilization likely years away, we cannot rely on it now.
    • When specialization is stable, we could consider creating the new trait anyways, to allow for static detection.

Appendix

Argument Type

There is a question of what the exact argument type for poll_write_vectored should be: &[IoSlice], &mut impl Buf, or &mut dyn Buf.

  • &[IoSlice]: This matches what is used in std::io::Write.
  • &mut impl Buf: Passing a generic Buf is more convenient, but one of the problems with the original solution is that generic methods cannot be implemented on trait objects (so Box<dyn AsyncWrite>).
  • &mut dyn Buf: This fixes the problem with trait objects, and would be more convenient than &[IoSlice], but the cost of the dynamic dispatch on Buf methods concerns some people. Whether the cost is actually noteworthy is not known (simple experiments in hyper did not notice anything).

To deviate the least from the standard library, we propose using &[IoSlice]. It would be easy for us to add an additional poll_write_buf(&mut dyn Buf) method that fills up a stack [IoSlice], reducing boilerplate.

@seanmonstar seanmonstar added C-proposal Category: a proposal and request for comments A-tokio Area: The main tokio crate labels Nov 12, 2020
@Matthias247
Copy link
Contributor

Matthias247 commented Nov 12, 2020

+1 for adding just a vectored write method to AsyncWrite.

Regarding the argument type, I would heavily lean towards either &[IoSlice] or a more boring &[&[u8]].
The latter is easier to deal with in pure Rust code, since initiating IoSlice is a bit painful. One can translate between both with a fairly minor cost (having a stack array of size N and copying slice information over) - which however limits the write size at the translation layer.

impl Buf is out for me because it prevents having trait objects. dyn Buf seems over-engineered for what I would typically do with it. It also requires a dependency on bytes, and makes it less likely to have something common with a future standard library trait.

I'm not sold too much on the benefit of is_vectored_write. If it's reporting wrong, it still might not be more efficient to aggregate first. The backend might still not be doing a syscall, and getting separate calls of smaller slices might be OK.
I don't think there is a great solution for fixing the "I forgot about vectored IO issue" besides maybe just dropping the non vectored version fully. Best bet seems currently to be to audit the most crucial code paths (e.g. HTTP library -> TLS library -> IO driver) to make sure everything supports vectored IO.

But I'm also not really opposed for having the is_vectored_write method. I guess there could be a question for implementors of composed streams on what to return? Should they return their own support, or also take the support of layers they wrap into account?

@carllerche
Copy link
Member

Given that a poll_write_vectored() default implementation is strictly worse than calling poll_write(), should the default implementation panic?

@seanmonstar
Copy link
Member Author

That'd make it much easier for a composite type that may have forgotten to override the default in an infrequent path could crash a server, instead of occasionally just being a little slower. And the strategy that hyper uses can still be used to try to detect the best solution dynamically.

@carllerche
Copy link
Member

"A little slower" is going to be very hard to debug. The default impl should never be used. This is why I wanted to explore the separate trait option.

@seanmonstar
Copy link
Member Author

I still think grumpy debugging is better than crashing a server and killing thousands of connections. I could see the argument for making it a debug-assert level panic, to help when testing, but once shipped to production, don't blow up.

@hawkw
Copy link
Member

hawkw commented Nov 12, 2020

It could also log a warning in release mode?

@carllerche
Copy link
Member

I would be very scared if someone shipped code to production that did not test the write path...

@carllerche
Copy link
Member

A debug panic is probably fine though.

seanmonstar added a commit that referenced this issue Nov 17, 2020
This adds `AsyncWrite::poll_write_vectored`, and implements it for
`TcpStream` and `UnixStream`. Based on the proposal in #3135.
@seanmonstar
Copy link
Member Author

PR is up: #3149

carllerche pushed a commit that referenced this issue Nov 18, 2020
This adds `AsyncWrite::poll_write_vectored`, and implements it for
`TcpStream` and `UnixStream`.

Refs: #3135.
@hawkw
Copy link
Member

hawkw commented Nov 21, 2020

I believe that merging #3149 should have closed this issue?

@hawkw hawkw closed this as completed Nov 21, 2020
@Darksonn Darksonn added M-net Module: tokio/net M-io Module: tokio/io labels Nov 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tokio Area: The main tokio crate C-proposal Category: a proposal and request for comments M-io Module: tokio/io M-net Module: tokio/net
Projects
None yet
Development

No branches or pull requests

5 participants