Efficient implementation of vectored writes for BufWriter #3163
Conversation
Thanks for the submission. Could we add some tests for the new functions?
I had some tests written for #3092, but these relied on an async-friendly
Force-pushed from d34f468 to c2aee65
I have rebased the branch and added some tests.
Force-pushed from c2aee65 to f0bf96c
I'm sorry for taking so long to review this. The PR got lost among all the other PRs we get.
tokio/src/io/util/buf_writer.rs (Outdated)
```rust
fn poll_write_vectored(
    mut self: Pin<&mut Self>,
    cx: &mut Context<'_>,
    bufs: &[IoSlice<'_>],
```
In general, I think that this could be simplified quite a lot if you make use of the fact that it is possible to reassign `bufs` to a sub-slice of itself to remove slices from the beginning of the buffer. E.g., to remove the first slice after having used it:

```rust
bufs = &bufs[1..];
```
Thank you for the suggestion. I could not simplify by much, but hopefully the code is more readable now without an explicit iterator.
I gave it a try myself and came up with this. What do you think?
```rust
} else {
    // Remove empty buffers at the beginning of bufs.
    while bufs.first().map(|buf| buf.len()) == Some(0) {
        bufs = &bufs[1..];
    }
    if bufs.is_empty() {
        return Poll::Ready(Ok(0));
    }
    // Flush if the first buffer doesn't fit.
    let first_len = bufs[0].len();
    if first_len >= self.buf.capacity() - self.buf.len() {
        ready!(self.as_mut().flush_buf(cx))?;
        debug_assert!(self.buf.is_empty());
    }
    let me = self.project();
    if first_len >= me.buf.capacity() {
        // The slice is at least as large as the buffering capacity,
        // so it's better to write it directly, bypassing the buffer.
        debug_assert!(me.buf.is_empty());
        return me.inner.poll_write(cx, &*bufs[0]);
    }
    // Append the buffers that fit in the internal buffer.
    // The first buffer is guaranteed to fit due to the flush above.
    let mut total_written = 0;
    for buf in bufs {
        if buf.len() > me.buf.capacity() - me.buf.len() {
            break;
        } else {
            me.buf.extend_from_slice(&*buf);
            total_written += buf.len();
        }
    }
    debug_assert!(total_written != 0);
    Poll::Ready(Ok(total_written))
}
```
@Darksonn Thanks! I have unrolled the first iteration of appending, though: since we're aiming to improve performance, the check in the loop body would otherwise be done redundantly, and there is a suitable `else` branch for it.
Implement AsyncWrite::poll_write_vectored for BufWriter, making use of the buffer to coalesce data from the passed slices. Make exceptions for cases when writing directly to the underlying object is more efficient, which differ depending on whether the underlying writer itself supports vectored output. Change the implementation of AsyncWrite::is_write_vectored to return true.
Forward poll_write_vectored/is_write_vectored to the underlying object for BufReader. Do the same in BufStream, which has the effect of is_write_vectored returning true because BufWriter's implementation now does.
Simplify branching and do what poll_write does: do not buffer slices partially, optimizing for the most likely case where slices are much smaller than the buffer, while retaining special treatment for oversized slices.
In the implementation of AsyncWrite::poll_write_vectored for BufWriter, the total length of the data can technically overflow usize even with safely obtained buffer slices, since slices may overlap.
Improve readability by updating the `IoSlice` slice reference as the iteration consumes its constituent slices, rather than using an explicit iterator.
Force-pushed from f0bf96c to f0d443d
Partially incorporated code suggested by @Darksonn.
Force-pushed from 18b991e to 2dff0ad
Co-authored-by: Alice Ryhl <alice@ryhl.io>
Force-pushed from eddf256 to 598560b
This seems good now, thanks.
Motivation

With vectored writes reintroduced in #3149, `BufWriter` can provide an efficient implementation both with and without vectored write support in the underlying writer, and advertise it with `is_write_vectored`.

Solution

Provide a custom implementation of `AsyncWrite::poll_write_vectored` by coalescing data from the slices in the buffer, unless it is more efficient to pass one or all of them directly to the underlying writer. This needs to be optimized differently depending on whether the underlying writer supports efficient vectored output.