Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Discovery
This managed to avoid being hit because the test cases were running acks=0 last and never sending any more messages down those connections.
But while investigating #1588 this started showing up in the acks0 test case presumably because the java driver was reusing the connections it sent the acks=0 produce requests down.
Background info
Codecs are an optional helper mechanism provided by the
tokio_util
crate.They make it easy to attach an implementation of a protocol to a TCP socket.
In shotover every database that we support has a unique codec implemented for its protocol.
There is no single
Codec
type, instead there are separate read and write codecs, called the Encoder and the Decoder.The kafka protocol does not specify the message type of the responses sent by the kafka broker.
Instead it is up to the client to remember the message type of each request it sends out and then match up each response that comes in with its corresponding request so that the response can be correctly decoded.
In order to decode kafka responses, shotover needs the kafka outgoing encoder to instruct the outgoing decoder on how to decode each response.
This is done via an mpsc channel, specifically we use the mpsc channel from the standard library.
shotover-proxy/shotover/src/codec/kafka.rs
Lines 42 to 48 in c1331f8
The problem
However there is one exception to this that we were not handling!
The kafka produce request can set a field named
acks
to determine how many acks the leader broker must receive before responding to the request.As a special case if
acks = 0
then the leader broker will never respond to that request.This is used if latency is the highest concern and lost messages are acceptable.
Shotover will never receive a response for a produce with acks=0.
So if the encoder tells the decoder to expect a produce response it will expect the next message to be a produce even though that is not the case. This results in attempting to decode responses as the wrong type and therefore failure to decode.
The fix
The correct fix here is to have the encoder skip sending message type information if
acks = 0
.Specifically that is done by checking the result of the
Message::response_is_dummy()
method which checks if the request will not receive a response, including a check foracks = 0
on kafka messages.Now that the issue is fixed we can enable the failing tests in
kafka_int_tests/test_cases.rs
.