Support reading into user provided buffers #209

Open
gurry opened this issue Jun 9, 2021 · 6 comments

@gurry

gurry commented Jun 9, 2021

Currently read_message() and write_message() work with a Message, which carries its data in a Vec<u8> or a String. This means an allocation happens every time a message is read or written. That is a problem for use cases that require high performance, such as tunnelling a TCP stream over a WebSocket connection. To address such needs, do you think it would be a good idea to have additional methods for reading and writing that take a caller-owned buffer, like this:

pub fn write_message(&mut self, buf: &[u8], .... <other arguments like message type etc.>) -> Result<()>
pub fn read_message(&mut self, buf: &mut [u8]) -> Result<(<message type etc.>)>

Then these methods can be used to copy to and from other sources and destinations like a TcpStream without any allocations, roughly like this:

let mut buf = [0;1024];
loop {
    tcp_stream.read(&mut buf, ....);
    web_socket.write_message(buf, ...);
}
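
To make the intent more concrete, here is a minimal sketch of one direction of such a tunnel, assuming a hypothetical BufferWrite trait that stands in for the proposed write method (none of these names exist in tungstenite today):

use std::io::{self, Read};
use std::net::TcpStream;

// Hypothetical stand-in for the proposed buffer-based write API;
// nothing like this exists in tungstenite today.
trait BufferWrite {
    fn write_binary(&mut self, buf: &[u8], end_of_message: bool) -> io::Result<()>;
}

// TCP -> WebSocket direction of the tunnel: forward whatever arrives on the
// TCP stream, reusing a single stack buffer and never allocating per message.
fn pump_up<W: BufferWrite>(tcp: &mut TcpStream, ws: &mut W) -> io::Result<()> {
    let mut buf = [0u8; 1024];
    loop {
        let n = tcp.read(&mut buf)?;
        if n == 0 {
            return Ok(()); // TCP peer closed the connection
        }
        ws.write_binary(&buf[..n], true)?;
    }
}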
@agalakhov
Member

This would be fine, but I doubt it would be much faster because of the fragmentation of WebSocket messages: fragments are copied while a message is assembled. Yes, one memory allocation can be spared, but I don't see how zero-copy could be implemented here. I like zero-copy, but I wasn't able to implement it in the very first version of Tungstenite. Any ideas on how to write this internally?

@gurry
Author

gurry commented Jun 10, 2021

Thanks @agalakhov

One way to do zero-copy might be for tungstenite not to attempt assembling partial messages into a single Message object while receiving, and not to require a full Message object while sending. The receive API could simply copy whatever it receives off the wire into the user-provided buf, and the send API could send on the wire whatever is in the user-provided buf. The receive API could also return the type of the message (text/binary/close) and a flag indicating whether the FIN bit was set, while the send API would ask the user to provide those details as arguments and set the WebSocket headers on the outgoing message accordingly.

The .NET folks do something similar, as you can see in their send and receive APIs.

This design means that WebSocket message assembly becomes the user's job instead of the library's. In some cases, such as my tunnel example above, assembly is neither needed nor desirable because you want to minimize latency (for the tunnel you simply want to shovel whatever you received off the WebSocket immediately to the TCP socket and vice versa). So it would be nice to have that flexibility.

Will it be faster? I do think so, as it avoids an extra copy and an allocation. In tunnel-like use cases not much else is going on in the program other than the data-copying operation, hence its cost dominates; such use cases should therefore benefit significantly. That said, I could be completely wrong and we'd have to measure it to know for sure.

@agalakhov
Member

I see. This is in fact how Tungstenite works internally, and this could even be achieved by using tungstenite::protocol::frame directly. One still has to assemble messages for ping and close if they are to be handled automatically.

Maybe the best way would be to provide a pool of bytes and return a slice of the data actually read. What do you think?
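
Roughly something like this, purely as a sketch (none of these names exist in tungstenite): the caller lends the socket a reusable buffer, and each read hands back a slice of the bytes actually filled plus the frame metadata:

// Illustrative only; these types do not exist in tungstenite.
struct ReadSlice<'a> {
    data: &'a [u8],        // the bytes actually read on this call
    end_of_message: bool,  // whether this slice completes a message (FIN set)
}

trait PooledRead {
    // The caller lends `pool` to the socket and can reuse it across calls,
    // so the library never allocates on the read path.
    fn read_into<'a>(&mut self, pool: &'a mut [u8]) -> std::io::Result<ReadSlice<'a>>;
}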

@gurry
Author

gurry commented Jun 10, 2021

You're right about ping and close. We may not be able to avoid assembly for them, but they are generally very small, so they'd rarely fragment and therefore it should be okay.

In the second paragraph I presume you're referring to the read API. I don't think we'd need to return a slice out of a pool of bytes provided by the user. The read API can simply take a buffer of type &mut [u8] as an argument, write the incoming data into it, and return the following:

  1. The number of bytes read from the wire and written to the buffer,
  2. The type of the message,
  3. Whether the end of the message has been reached or not.

For example:

enum MessageType {
    Text,
    Binary,
    // No need for Close and Ping since they are handled automatically
}

struct ReadResult {
    bytes_read: usize,
    message_type: MessageType,
    end_of_message: bool,
}

// This is the signature of the read API
fn read(buf: &mut [u8]) -> Result<ReadResult, Error> {
    // Read data from the underlying socket by passing the user-provided buf
    // directly to it (which basically means zero-copy), then create the
    // ReadResult object and return it.
}

Inside the read API body we don't have to maintain our own buffers. We read from the underlying socket directly into the user-provided buffer and return a ReadResult with the fields appropriately set.
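
For instance, the WebSocket-to-TCP direction of my tunnel example could then be written like this, using the ReadResult struct from the sketch above (the BufferRead trait and pump_down are hypothetical names, only meant to show how a caller would consume the result):

use std::io::Write;
use std::net::TcpStream;

// Hypothetical trait wrapping the read signature sketched above; illustrative only.
trait BufferRead {
    fn read(&mut self, buf: &mut [u8]) -> Result<ReadResult, std::io::Error>;
}

// Caller's side of the tunnel: drain the WebSocket into a TCP stream one
// fragment at a time, reusing a single stack buffer and never allocating.
fn pump_down<S: BufferRead>(ws: &mut S, tcp: &mut TcpStream) -> Result<(), std::io::Error> {
    let mut buf = [0u8; 1024];
    loop {
        let r = ws.read(&mut buf)?;
        tcp.write_all(&buf[..r.bytes_read])?;
        // r.message_type and r.end_of_message are available if the caller
        // does want to reassemble messages; a byte tunnel can ignore them.
    }
}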

I may be oversimplifying all of this since I don't know the internals of tungstenite as well as you do, but maybe something like this will work.

@daniel-abramov
Member

Just as a note: there was recently a very similar discussion here.

@BlinkyStitt

BlinkyStitt commented Jul 28, 2022

I see. This is in fact how Tungstenite works internally, and this could even be achieved by using tungstenite::protocol::frame directly. One still has to assemble messages for ping and close if they are to be handled automatically.

Is there any example of what this would look like? I need to handle multi-gigabyte messages in my app, and it sounds like I need a solution or workaround for this issue.
