Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inspect io wrappers #5033

Merged
merged 10 commits into from Sep 28, 2022
Merged

Inspect io wrappers #5033

merged 10 commits into from Sep 28, 2022

Conversation

farnz
Copy link
Contributor

@farnz farnz commented Sep 20, 2022

Motivation

When writing or reading data, it can be useful to inspect the bytes passing through, so that you can (e.g.) hash as you write, or verify hashes of read data concurrently with deserialization.

Wrapping AsyncRead and AsyncWrite successfully so that you can inspect the bytes being handled has a few edge cases to get right, so let's bundle this all up into one place.

Solution

A pair of wrappers that allow you to inspect either the bytes being read, or the bytes that have been written, plus test cases to show that they work correctly even when the underlying implementation does partial writes, or is only able to read in small chunks.

There are use cases like checking hashes of files that benefit from
being able to inspect bytes read as they come in, while still letting
the main code process the bytes as normal (e.g. deserializing into
objects, knowing that if there's a hash failure, you'll discard the
result).

As this is non-trivial to get right (e.g. handling a `buf` that's not
empty when passed to `poll_read`, add a wrapper `InspectReader`
that gets this right, passing all newly read bytes to a supplied `FnMut`
closure.

Fixes: tokio-rs#4584
When writing things out, it's useful to be able to inspect the bytes
that are being written and do things like hash them as they go past.
This isn't trivial to get right, due to partial writes and efficiently
handling vectored writes (if used).

Provide an `InspectWriter` wrapper that gets this right, giving a
supplied `FnMut` closure a chance to inspect the buffers that have been
successfully written out.

Fixes: tokio-rs#4584
@Darksonn Darksonn added A-tokio-util Area: The tokio-util crate M-io Module: tokio/io labels Sep 20, 2022
@farnz farnz mentioned this pull request Sep 20, 2022
Copy link
Contributor

@Darksonn Darksonn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These comments on where bounds also apply to the writer.

tokio-util/src/io/inspect.rs Outdated Show resolved Hide resolved
tokio-util/src/io/inspect.rs Outdated Show resolved Hide resolved
tokio-util/src/io/inspect.rs Outdated Show resolved Hide resolved
@Darksonn
Copy link
Contributor

Sorry about the delay in reviewing this. We had some trouble with our CI setup that took a while to fix.

Comment on lines 83 to 85
if let Poll::Ready(Ok(count)) = res {
(me.f)(&buf[..count]);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if count is zero?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In current code, f gets passed an empty slice, and is expected to handle this - which is fine for hashing and teeing uses,

I would prefer to document this, but can put a guard in place to stop you getting an empty slice if you'd prefer that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that we should avoid empty slices.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do so for both writes and reads - the user should know about EOF on read, or ENOSPC on write from the "main" write or read process, and thus be able to handle that without needing the inspection callback to be called.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing an empty slice on EOF read seems ok, but otherwise I would not expect the closure to ever get passed an empty slice.

Comment on lines 107 to 110
for buf in bufs {
let size = count.min(buf.len());
(me.f)(&buf[..size]);
count -= size;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if buf is empty?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then f gets called with an empty slice, since count.min(buf.len()) should be zero, and &buf[..0] is an empty slice.

I only see two cases in which this can happen, and in both cases I think f should be able to handle it (reasoning as above):

  1. The wrapped AsyncWrite genuinely returned a zero byte write. In this case, it is claiming to successfully write zero bytes, which is allowed in the contract (implying that the underlying object can't accept more data ever again).
  2. The caller supplied one or more empty buffers to poll_write_vectored. In this case, you're being told that the supplied empty buffer was indeed written, which is accurate (if not particularly helpful).

I will add a documentation note to warn users that f must cope with empty buffers to both wrappers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoiding empty slices instead.

Copy link
Contributor

@Darksonn Darksonn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Contributor

@Darksonn Darksonn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some doc suggestions.

Comment on lines 24 to 25
/// If no new data is supplied by a successful `poll_read`, then `f` will
/// be called with an empty slice.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// If no new data is supplied by a successful `poll_read`, then `f` will
/// be called with an empty slice.
/// The closure is called with an empty slice only if the inner reader has reached EOF, or if `poll_read` is called with an empty buffer.

Comment on lines 69 to 71
/// `f` will never be called with an empty slice; a vectored write will
/// result in multiple calls to `f`, one for each buffer that was used by
/// the write.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally avoid sentences that start with code since we can't make f a capital letter.

Suggested change
/// `f` will never be called with an empty slice; a vectored write will
/// result in multiple calls to `f`, one for each buffer that was used by
/// the write.
/// The closure `f` will never be called with an empty slice. A vectored write can result in multiple calls to `f`.

@Darksonn Darksonn enabled auto-merge (squash) September 28, 2022 17:59
@Darksonn Darksonn merged commit 96fab05 into tokio-rs:master Sep 28, 2022
@farnz farnz deleted the inspect_io_wrappers branch September 28, 2022 19:20
dbischof90 pushed a commit to dbischof90/tokio that referenced this pull request Oct 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tokio-util Area: The tokio-util crate M-io Module: tokio/io
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants