RPC TTFB On RPC Response #5676

AgeManning · 2024-05-01T04:22:35Z

Description

The specs say that when waiting for a response to an RPC request, we should initially wait 5 seconds for the first byte to arrive, (Time to First Byte, TTFB). This will speed up RPC failures for absent peers.

Although this is implemented in establishing streams, we currently only wait the total RPC request timeout when awaiting for a response.

This issue is to terminate requests (with a timeout error) if the first byte does not arrive in FFTB seconds.

Implementation

There are a few things I'd imagine that need to change in order to handle this.

Firstly, the section of code that handles the streams and more specifically a request waiting for a response is here:
https://github.com/sigp/lighthouse/blob/stable/beacon_node/lighthouse_network/src/rpc/handler.rs#L665

When this future is ready, it indicates we have received a new chunk from the RPC request. When we do, if there are more RPC chunks, we increase the RPC timeout window and await a new chunk:
https://github.com/sigp/lighthouse/blob/stable/beacon_node/lighthouse_network/src/rpc/handler.rs#L686

We would have to change this logic to incorporate the first byte for each RPC chunk. A response chunk consists of the following:

response_chunk  ::= <result> | <encoding-dependent-header> | <encoded-payload>

So, we should update our encoder to respond with the first byte, which is the result byte. We should only check this on the first chunk, all remaining chunks can have the usual RPC response timeout.

One way to implement this would be to add a new variant to the RpcCodedResponse:
https://github.com/sigp/lighthouse/blob/stable/beacon_node/lighthouse_network/src/rpc/methods.rs#L422

something like, ResultByte(u8). Then update the encoder to return this variant when the first byte has been read, see here:
https://github.com/sigp/lighthouse/blob/stable/beacon_node/lighthouse_network/src/rpc/codec/base.rs#L147

We could respond with the first byte so that the handler knows the time it has taken. If it takes more than TTFB to get this first byte, then close the stream.

The text was updated successfully, but these errors were encountered:

1010adigupta · 2024-05-10T09:17:41Z

hey @ackintosh can I work with you on this issue, want to delve deeper in the networking stack

AgeManning added the Networking label May 1, 2024

AgeManning assigned ackintosh May 1, 2024

ackintosh mentioned this issue May 13, 2024

RPC TTFB On RPC Response #5769

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RPC TTFB On RPC Response #5676

RPC TTFB On RPC Response #5676

AgeManning commented May 1, 2024

1010adigupta commented May 10, 2024

RPC TTFB On RPC Response #5676

RPC TTFB On RPC Response #5676

Comments

AgeManning commented May 1, 2024

Description

Implementation

1010adigupta commented May 10, 2024