
backpressure #7

Open
dominictarr opened this issue Jun 1, 2016 · 20 comments

Comments

@dominictarr (Member)

we can have back pressure on the sink side, at least via the bufferedAmount property:

The number of bytes of data that have been queued using calls to send() but not yet transmitted to the network. This value does not reset to zero when the connection is closed; if you keep calling send(), this will continue to climb. Read only

https://developer.mozilla.org/en-US/docs/Web/API/WebSocket

there is no way to signal back pressure on the read side though ): surely the implementation has it underneath, but it would be good to expose that to the application... but we should get the sink side right at least.
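
something like this is what I mean for the sink side - just a sketch, not what pull-ws currently does, and HIGH_WATER / POLL_MS are numbers I made up:

```js
// sketch: a pull-stream sink that respects WebSocket backpressure by
// checking bufferedAmount before pulling the next chunk from the source
function backpressuredSink (socket, HIGH_WATER, POLL_MS) {
  HIGH_WATER = HIGH_WATER || 64 * 1024
  POLL_MS = POLL_MS || 100
  return function (read) {
    read(null, function next (end, data) {
      if (end) return // source ended (or errored); nothing more to send
      socket.send(data)
      if (socket.bufferedAmount > HIGH_WATER) {
        // too many bytes queued: let the network drain before pulling again
        setTimeout(function () { read(null, next) }, POLL_MS)
      } else {
        read(null, next)
      }
    })
  }
}
```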

@DamonOehlman (Member)

Excellent idea. WebRTC data channels had a similar property too, but implementations varied as to whether it actually reported anything useful, so I never ended up implementing it. I really should dust off this module and fix it up.

Might have a go at doing that this weekend at CampJS :)

@dominictarr (Member, Author)

oh cool! make sure you say hi to @ahdinosaur who is gonna be there and has been encouraging me to improve docs etc to make pull-streams more accessible.

@DamonOehlman (Member)

I will absolutely be saying hi to Mikey :)

@reqshark commented Jun 1, 2016

there is no way to send back pressure on the read side though ); surely the implementation has it underneath, but it would be good to provide that to the application

@dominictarr exactly, a good stream is characterized by flow moderation at the destination--that's your read side back pressure

@dominictarr (Member, Author)

@reqshark yes but "destination" is relative. writing to a network socket (or websocket) seems like the destination, but really it's just the "not my job any more" point. somewhere beyond that the stream continues, and it has no way to communicate that it doesn't need more right now.

@dominictarr (Member, Author)

although... you could wrap websockets in a thing that adds back pressure...
and while we are at it, pings, to keep the connection alive.
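
the ping part could be as simple as this - a sketch assuming a Node `ws` socket (browser WebSockets can't send pings), and the interval is arbitrary:

```js
// sketch: keep the connection alive by pinging on an interval
function keepAlive (socket, intervalMs) {
  var timer = setInterval(function () {
    if (socket.readyState === 1) socket.ping() // 1 === OPEN
  }, intervalMs || 30000)
  socket.on('close', function () { clearInterval(timer) })
}
```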

@reqshark commented Jun 2, 2016

trickling bytes over the wire is a great way to prioritize a user space connection with the kernel scheduler

somewhere beyond that stream continues, and it has no way to communicate that it doesn't need more right now

you can add an additional flag while multiplexing recv() operations... but you need <sys/socket.h>

@elavoie commented Nov 29, 2016

What do you guys think of using the 'ping' event for synchronization between the receiver (the stream reading from a source) and the emitter (the stream writing into a sink)? This way, rather than having the emitter send all its data as soon as it is ready, it would send a single piece of data for each ping received. When both sides are emitters and receivers, each sends a ping for each read happening on its side.

That way the behavior of a pull-stream over the network stays closer to its single-process behavior. You also get back-pressure at the application level, because each piece of data must be explicitly requested before being sent.

The drawback I see is that you get one ping per element sent over the wire. On some servers you may also trigger spurious pong responses. I don't know how expensive that is, or whether it would be acceptable in the contexts you were considering for the library. For these reasons we may want to make this behavior optional.
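
For illustration, the sending side could look roughly like this (just a sketch with names of my own, and it assumes a Node `ws` socket because browsers cannot observe ping frames):

```js
// sketch: send exactly one element per ping received from the other end
function pingPacedSink (socket) {
  return function (read) {
    var credit = 1      // allow the first element without waiting
    var reading = false // guard against overlapping reads
    socket.on('ping', function () { credit += 1; pump() })
    function pump () {
      if (reading || credit <= 0) return
      reading = true
      credit -= 1
      read(null, function (end, data) {
        reading = false
        if (end) return socket.close()
        socket.send(data)
        pump()
      })
    }
    pump()
  }
}
```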

I can do a pull-request with the change if you think it is interesting.

@dominictarr (Member, Author)

interesting! didn't know websockets had that. I think it's definitely worth exploring, but keep in mind that this would require special server behavior - whereas you might want to use pull-ws with a server you do not control.

If we had a way that you could optionally select backpressure strategies, that would be great, though.

but the big question for me: how do you evaluate back pressure models? how do we know it's working better than using nothing?

@elavoie commented Nov 29, 2016

After a little bit of experimentation, using 'ping' for flow control requires a bit more complexity on the server to be handled correctly. The 'ping' event on the server is triggered by any websocket connection that sends one, so using it for flow control requires identifying which connection the ping originated from. The 'ping' event can carry a payload that can be used for that, so it is not impossible, but it makes the server more complicated. I will figure out whether it is absolutely required for my application before going down that road.

My personal motivation for the previous design was to have a uniform behavior for pull-stream regardless of whether the entire stream happens in a single process or if it is split between multiple processes with an underlying websocket for connection. In other words, I wanted to be able to transparently put a part of the stream processing on the server without any code changes other than establishing a websocket connection. The communication overhead in my case is irrelevant.

However, because a websocket uses a push-model rather than a pull-model, that makes things a lot more complicated because the receiver on the other end cannot apply back-pressure to the sender. I am still trying to figure out a minimal abstraction that can abstract the network connection while maintaining the regular behavior of pull-streams.

To evaluate back pressure models I see at least these metrics:

  • Latency: How long (upper bound on time or number of messages) does it take for the sender to be notified that the receiver wants to stop the flow?
  • Memory Overhead: What is the upper bound on memory for storing unreceived data?
  • Bandwidth Overhead: What percentage of network traffic is used for synchronization?
  • Processing Overhead: How many computation steps are required, as a function of the number of concurrent connections?

Using these metrics the current pull-ws strategy has:

  • Latency: Infinite, because the sender is never notified that the receiver is overwhelmed
  • Memory Overhead: Proportional to the total number of elements to be sent (therefore it does not work for infinite streams)
  • Bandwidth Overhead: None, if we exclude the synchronization happening in the WebSocket (which I think is itself based on TCP)
  • Processing Overhead: None

The strategy based on 'ping' would have:

  • Latency: Zero, because the sender only sends an element after an explicit request from the receiver
  • Memory Overhead: Zero, because no element needs to be buffered
  • Bandwidth Overhead: One ping per element per half-duplex connection
  • Processing Overhead: Proportional to the number of concurrent connections (each connection needs a listener for the ping event to test whether a ping happened on its own socket)

In practice the current strategy might be ok for most use cases, because most streams are finite and fit in memory, and there is still flow control happening at the TCP layer. However, the behavior surprised me because it is quite different from a pull-stream in a single process: with a websocket in the middle, the source is drained as fast as it can produce, regardless of the speed at which the receiver on the other end is processing.

@dominictarr (Member, Author)

@elavoie you say, of this scheme

Latency: Zero, because the sender only sends an element after an explicit request from the receiver

that is from the perspective of signalling the backpressure - but on the other hand, since an item must travel back for every item that travels forward, the throughput becomes bound by the round trip time - which is gonna be slower overall. Ideally, we want a back pressure system that can find the sweet spot of maximum throughput
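
to put rough numbers on it: with one item in flight and a 100ms round trip, you can never do better than 10 items per second, no matter how fast the link is. allowing W items in flight before requiring an ack raises that ceiling to W/RTT - which is the usual argument for credit-based schemes.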

@dominictarr (Member, Author)

I didn't really do much with it, but this may be interesting, @elavoie https://github.com/dominictarr/pull-credit

@elavoie commented Dec 1, 2016

@dominictarr

Ideally, we want a back pressure system that can find the sweet-spot of the maximum throughput

I agree but I have no idea how to do that.

I am in the process of implementing a variation of the 'ping' scheme purely at the application level using pull-streams, so it should be independent of the underlying transport. The gist of the approach is for the receiver to send an explicit 'ask' for each item it reads. As you pointed out, this effectively makes the delay between each item sent proportional to the round-trip time. So in a sense I guess it is the other extreme of the design space, i.e. synchronizing on each element rather than not synchronizing at all. Maybe a middle ground would be to synchronize every X elements, as in the sketch below?
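
For instance, something like this strawman (the names and the {credit: n} control message are mine, not pull-credit's actual protocol): the receiver grants credit in batches, and the sender keeps at most `window` unacknowledged elements in flight.

```js
// sketch: credit-based sender; the receiver occasionally sends a
// {credit: n} control message granting permission for n more elements
function creditedSink (socket, window) {
  return function (read) {
    var credit = window || 16
    var reading = false
    socket.on('message', function (msg) {
      var m = JSON.parse(msg)
      if (m.credit) { credit += m.credit; pump() }
    })
    function pump () {
      if (reading || credit <= 0) return
      reading = true
      credit -= 1
      read(null, function (end, data) {
        reading = false
        if (end) return socket.close()
        socket.send(JSON.stringify({ data: data }))
        pump()
      })
    }
    pump()
  }
}
```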

Thanks for the reference, I will take a look at pull-credit.

@elavoie commented Dec 5, 2016

I published a first version of pull-sync that achieves synchronization using the aforementioned scheme. However, it will probably be terribly slow for any bandwidth-intensive application, so I referred to pull-credit at the end of the README.

I skimmed through the pull-credit repo and read http://250bpm.com/blog:22 and that approach seems much better performance-wise. I will revisit the problem once performance becomes an issue in my application.

@dominictarr Thanks for the link and the awesome work on pull-streams, as a framework for building distributed reusable components they are quite nice!

@dominictarr (Member, Author)

great! I really want to get some sensible back pressure for muxrpc at some point

@elavoie commented Dec 9, 2016

pull-limit is another through pull-stream that might be useful for synchronizing on a WebSocket when the other end is used as a loop to delegate processing of values.

It limits the number of elements that can be processed at the same time by waiting for answers before allowing newer elements in. Works well with pull-ws with the default value of n==1. Might require pull-goodbye for n>1 to ensure the socket is closed at the end.

It is similar to pull-sync but does not send messages to the other end; it only keeps track of the number of elements going in and out.

@dominictarr (Member, Author)

pull-limit shouldn't actually be necessary - a correctly implemented pull-stream sink should not call read again until after the previous callback was called (except to abort the stream).

@elavoie commented Dec 9, 2016

My explanation was not clear enough.

The pull-limit implementation provides flow control around a through pull-stream (e.g. limit(pull.through())), such that at most N elements are processed concurrently inside it.

It allows the read triggered by the sink to complete only if fewer than N elements are currently being processed between its sink and its source. Therefore, if N values are already in flight, the read will complete only after the wrapped stream's source has produced enough values that fewer than N remain in flight, or after the stream is closed.

The current pull-ws implementation works correctly and only reads again after the callback was called. However, the socket drains the source regardless of how fast or slow the rest of the pipeline processes elements. This is problematic when the stream starts on side A, travels through a WebSocket to side B, is processed on B, and is then brought back to A through the same WebSocket to finish processing. In that case the middle part of the stream, the part processed on B, is unsynchronized and does not respect flow control from A's point of view.

pull-limit bounds the number of elements that go into B at the same time.

I will use this capability to build a fault-tolerant processing platform: if B goes down during processing, at most N elements are lost and need to be resent elsewhere. And contrary to pull-sync, it requires no special protocol between A and B; B simply processes and returns elements as soon as they come in.
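
For concreteness, a minimal local example (using the limit(pull.through(), n)-style signature from above; pull.asyncMap stands in here for the round trip to B):

```js
var pull = require('pull-stream')
var limit = require('pull-limit')

// doWork stands in for sending an element to B and getting the result back
function doWork (x, cb) {
  setTimeout(function () { cb(null, x * 2) }, 50)
}

pull(
  pull.values([1, 2, 3, 4]),
  limit(pull.asyncMap(doWork), 1), // at most 1 element in flight at a time
  pull.drain(console.log)          // logs 2, 4, 6, 8
)
```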

@dominictarr (Member, Author)

@elavoie got it! that is quite an interesting idea. can you make a PR to add it to https://github.com/pull-stream/pull-stream-docs ?

@elavoie commented Dec 15, 2016

done here pull-stream/pull-stream-docs#22
I first used the different modules in a working prototype :-).
