
RingBuffer exhausts when reading non-aligned chunks of data in bulk #797

Open
JaciBrunning opened this issue Jun 11, 2023 · 1 comment

JaciBrunning commented Jun 11, 2023

Consider the following socket reader, analogous to an asynchronous codec:

let data = socket.recv(|buffer| {
  let len = buffer.len();
  if len < 13 {
    return (0, None)
  }

  (13, Some(buffer[0..13].to_vec()))
});

With the current implementation of RingBuffer, the buffer exhausts without wrapping back around to index 0. Inside RingBuffer::dequeue_many_with, we can see that the closure is only ever handed the contiguous region up to the end of the storage, so read_at wraps around only when the consumed size reaches the end of the buffer exactly:

pub fn dequeue_many_with<'b, R, F>(&'b mut self, f: F) -> (usize, R)
where
    F: FnOnce(&'b mut [T]) -> (usize, R),
{
    let capacity = self.capacity();
    let max_size = cmp::min(self.len(), capacity - self.read_at);
    let (size, result) = f(&mut self.storage[self.read_at..self.read_at + max_size]);
    assert!(size <= max_size);
    self.read_at = if capacity > 0 {
        (self.read_at + size) % capacity
    } else {
        0
    };
    self.length -= size;
    (size, result)
}

With a capacity of 256, after the 19th iteration we reach read_at = 247, but max_size is only 9. The read closure is therefore handed just 9 bytes, returns 0 for its consumed size, read_at never advances, and the buffer exhausts itself even though a full frame of usable data sits in the buffer, split across the wrap point.
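To make the stuck state above concrete, here is a minimal standalone model of dequeue_many_with (a sketch mirroring the quoted code, not smoltcp's actual generic implementation) reproducing the read_at = 247 scenario:

```rust
use std::cmp;

/// Simplified stand-in for smoltcp's RingBuffer (u8-only, illustrative).
struct Ring {
    storage: Vec<u8>,
    read_at: usize,
    length: usize,
}

impl Ring {
    fn dequeue_many_with<R, F>(&mut self, f: F) -> (usize, R)
    where
        F: FnOnce(&mut [u8]) -> (usize, R),
    {
        let capacity = self.storage.len();
        // Only the contiguous window up to the end of storage is exposed,
        // so `f` can see at most `capacity - read_at` bytes here.
        let max_size = cmp::min(self.length, capacity - self.read_at);
        let (size, result) =
            f(&mut self.storage[self.read_at..self.read_at + max_size]);
        assert!(size <= max_size);
        self.read_at = if capacity > 0 {
            (self.read_at + size) % capacity
        } else {
            0
        };
        self.length -= size;
        (size, result)
    }
}

fn main() {
    // After nineteen 13-byte frames, read_at = 247; 13 more payload bytes
    // are buffered, 4 of which have wrapped around to the start.
    let mut ring = Ring { storage: vec![0u8; 256], read_at: 247, length: 13 };

    let (consumed, frame) = ring.dequeue_many_with(|buf| {
        // The framing closure needs 13 contiguous bytes but only sees 9.
        if buf.len() < 13 {
            return (0, None);
        }
        (13, Some(buf[..13].to_vec()))
    });

    // No progress is possible: the buffer is stuck despite holding a frame.
    assert_eq!(consumed, 0);
    assert!(frame.is_none());
    assert_eq!(ring.read_at, 247);
}
```

Every subsequent call sees the same 9-byte window, so the socket can never drain.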

Unfortunately we can't immediately solve this without losing the contiguous property of the buffer passed to f, which puts the onus of tracking full data frames on the user; that quickly becomes unwieldy and inefficient when multiple sockets are in use. I can think of two reasonable solutions:

  • Implement a bip buffer
  • Pass the data to the user in at least one and up to two halves, each of which is contiguous. If the current data is wholly contiguous within the circular buffer, give the user (&buffer[start..end], None). If the data is split across the end of the buffer, give the user (&buffer[start..capacity], Some(&buffer[0..end])). This doesn't fully solve the problem, but it avoids the user having to reserve a per-socket buffer for reading data; it can instead be done per call if required.
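The second proposal could be sketched as follows (a hypothetical API, not anything smoltcp currently provides; dequeue_slices is an invented name):

```rust
/// Expose the buffered bytes as one or two contiguous slices.
/// Returns (head, None) when the data does not cross the wrap point,
/// or (head, Some(tail)) when it does.
fn dequeue_slices(
    storage: &[u8],
    read_at: usize,
    length: usize,
) -> (&[u8], Option<&[u8]>) {
    let capacity = storage.len();
    if read_at + length <= capacity {
        // Data is wholly contiguous within the ring.
        (&storage[read_at..read_at + length], None)
    } else {
        // Data wraps: head runs to the end of storage, tail restarts at 0.
        let head = &storage[read_at..capacity];
        let tail = &storage[..(read_at + length) % capacity];
        (head, Some(tail))
    }
}

fn main() {
    let storage = vec![0u8; 256];
    // The scenario from the report: read_at = 247, 13 bytes buffered.
    let (head, tail) = dequeue_slices(&storage, 247, 13);
    assert_eq!(head.len(), 9);
    assert_eq!(tail.map(|t| t.len()), Some(4));
}
```

With this shape the framing closure can check head.len() + tail.len() against its frame size and copy at most once per call, rather than keeping a persistent per-socket staging buffer.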
Dirbaio (Member) commented Jun 16, 2023

This is by design. The only recv guarantee is "if there are bytes in the buffer, it returns some" — not necessarily all. This is how I/O generally works; std::io::Read behaves the same way, FWIW.

You're supposed to pop the 9 bytes, keep them around, request more bytes, append them to the 9 you got, and then, once you have a full frame, process it.
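The pattern described above can be sketched like this (a minimal illustration; FRAME_LEN and push_and_extract are made-up names, and the accumulator stands in for whatever per-connection state the application keeps):

```rust
// Assumed fixed frame size from the original example.
const FRAME_LEN: usize = 13;

/// Append whatever `recv` handed us to a per-connection accumulator and
/// split off one complete frame once enough bytes have arrived.
fn push_and_extract(acc: &mut Vec<u8>, received: &[u8]) -> Option<Vec<u8>> {
    acc.extend_from_slice(received);
    if acc.len() >= FRAME_LEN {
        // Drain exactly one frame; any remainder stays for the next call.
        let frame: Vec<u8> = acc.drain(..FRAME_LEN).collect();
        Some(frame)
    } else {
        None
    }
}

fn main() {
    let mut acc = Vec::new();
    // First recv returns only 9 bytes (the pre-wrap window): no frame yet.
    assert!(push_and_extract(&mut acc, &[1u8; 9]).is_none());
    // Next recv returns the wrapped-around bytes: a full frame assembles.
    let frame = push_and_extract(&mut acc, &[2u8; 4]).unwrap();
    assert_eq!(frame.len(), FRAME_LEN);
}
```

The same accumulator also absorbs partial frames caused by TCP segmentation, which is the point of the next paragraph.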

Even if we changed smoltcp to actually guarantee "it can return all buffered data in a single recv", you would still need to handle half-frames in your code, because a half frame can genuinely arrive on the wire due to TCP packet segmentation.

If you use a higher-level wrapper such as embassy-net, its TcpSockets implement the embedded-io Read trait, whose read_exact does the "assemble received bytes until you have a full frame" work for you.
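embedded-io's Read::read_exact mirrors std::io::Read::read_exact, so the idea can be shown with std alone (a sketch — in embassy-net the socket would take the place of the Cursor used here for demonstration):

```rust
use std::io::{Cursor, Read};

/// Read one fixed-size 13-byte frame, looping internally until all bytes
/// arrive regardless of how the transport segments them.
fn read_frame<R: Read>(reader: &mut R) -> std::io::Result<[u8; 13]> {
    let mut frame = [0u8; 13];
    // `read_exact` keeps calling `read` until the buffer is full,
    // or fails with `UnexpectedEof` if the stream ends early.
    reader.read_exact(&mut frame)?;
    Ok(frame)
}

fn main() -> std::io::Result<()> {
    // Simulated byte stream delivering 20 bytes total.
    let mut stream = Cursor::new(vec![7u8; 20]);
    let frame = read_frame(&mut stream)?;
    assert_eq!(frame, [7u8; 13]);
    Ok(())
}
```

Reading frame-by-frame this way keeps the framing logic out of the recv closure entirely.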
