This repository has been archived by the owner on Jun 4, 2021. It is now read-only.

Messages dropped when receiver adapter is temporarily down #1494

Closed
itsmurugappan opened this issue Aug 20, 2020 · 13 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.
Milestone

Comments

@itsmurugappan
Contributor

Describe the bug
Messages are dropped when the receiver adapter is temporarily down.

Expected behavior
Messages should not be dropped while the receiver adapter is offline.

To Reproduce
Create a KafkaSource and produce messages in a loop. Simulate a temporary failure by bouncing the receiver adapter pod; you will notice that some messages are dropped (see the sketch below).
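
For illustration, a minimal repro along these lines (broker address, topic, namespace, and pod selector are illustrative placeholders, not the exact ones used here):

$ for i in $(seq 1 50); do echo $i; sleep 1; done | bin/kafka-console-producer.sh --broker-list kafka:9092 --topic track-offset-topic
$ kubectl delete pod -n <namespace> -l <kafka-source-adapter-label>   # bounce the receiver adapter while producing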

Knative release version
0.16

@itsmurugappan itsmurugappan added the kind/bug Categorizes issue or PR as related to a bug. label Aug 20, 2020
@aliok
Member

aliok commented Aug 24, 2020

@slinkydeveloper may have an idea already

@slinkydeveloper
Contributor

When you recreate the source, do you always set up the same consumer group?

@itsmurugappan
Contributor Author

Yes, same consumer group. In fact, the KafkaSource is not deleted; just the adapter pods are bounced.

@slinkydeveloper
Contributor

slinkydeveloper commented Sep 1, 2020

Ok, can you try to reproduce the issue and, while you're reproducing it, consume the topic where the consumer group offsets are written? I wonder if, for some reason, we start from the wrong offset (the newest rather than the one stored on the topic), or if something wrong is being written to the offsets topic.
See this for how to read the consumer group offsets topic: https://stackoverflow.com/questions/33925866/kafka-how-to-read-from-consumer-offsets-topic
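
For reference, the offsets topic can be decoded with Kafka's built-in formatter (a sketch: the formatter class name below applies to Kafka 0.11+, and the broker address is illustrative):

$ bin/kafka-console-consumer.sh --bootstrap-server kafka:9092 --topic __consumer_offsets \
    --formatter "kafka.coordinator.group.GroupMetadataManager\$OffsetsMessageFormatter"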

@itsmurugappan
Contributor Author

itsmurugappan commented Sep 2, 2020

Not sure if I did this correctly.

Messages from the actual topic: I posted numbers 1 to 50, and you can see numbers missing between 25-38 and 40-44; that's when I bounced the adapter pod.

"Got an Event: Validation: valid\nContext Attributes,\n specversion: 1.0\n type: dev.knative.kafka.event\n source: /apis/v1/namespaces/test-alpha/kafkasources/track-offset-kafka-source#track-offset-topic\n subject: partition:0#23\n id: partition:0/offset:23\n time: 2020-09-02T04:57:21.275Z\nExtensions,\n traceparent: 00-5581f578895584f996f95ebe2a02af0a-ab69a0ba8f4dbc1f-00\nData,\n 24\n"
"Got an Event: Validation: valid\nContext Attributes,\n specversion: 1.0\n type: dev.knative.kafka.event\n source: /apis/v1/namespaces/test-alpha/kafkasources/track-offset-kafka-source#track-offset-topic\n subject: partition:0#24\n id: partition:0/offset:24\n time: 2020-09-02T04:58:22.937Z\nExtensions,\n traceparent: 00-a0763d69e0b41976e710fbb1cbdfee39-55ab7e4dc97f8da3-00\nData,\n 25\n"
"Got an Event: Validation: valid\nContext Attributes,\n specversion: 1.0\n type: dev.knative.kafka.event\n source: /apis/v1/namespaces/test-alpha/kafkasources/track-offset-kafka-source#track-offset-topic\n subject: partition:0#37\n id: partition:0/offset:37\n time: 2020-09-02T04:58:42.985Z\nExtensions,\n traceparent: 00-2356485b44c7e77311e755c9b3635568-50126189846808c7-00\nData,\n 38\n"
"Got an Event: Validation: valid\nContext Attributes,\n specversion: 1.0\n type: dev.knative.kafka.event\n source: /apis/v1/namespaces/test-alpha/kafkasources/track-offset-kafka-source#track-offset-topic\n subject: partition:0#38\n id: partition:0/offset:38\n time: 2020-09-02T04:58:44.077Z\nExtensions,\n traceparent: 00-73bbbc46f7ef5778a76cba6290dfb2ca-723befa586107b48-00\nData,\n 39\n"
"Got an Event: Validation: valid\nContext Attributes,\n specversion: 1.0\n type: dev.knative.kafka.event\n source: /apis/v1/namespaces/test-alpha/kafkasources/track-offset-kafka-source#track-offset-topic\n subject: partition:0#39\n id: partition:0/offset:39\n time: 2020-09-02T04:58:46.506Z\nExtensions,\n traceparent: 00-3a39f24cbdf88f811d1a237534de748b-94647dc288b8edc9-00\nData,\n 40\n"
"Got an Event: Validation: valid\nContext Attributes,\n specversion: 1.0\n type: dev.knative.kafka.event\n source: /apis/v1/namespaces/test-alpha/kafkasources/track-offset-kafka-source#track-offset-topic\n subject: partition:0#43\n id: partition:0/offset:43\n time: 2020-09-02T05:01:13.713Z\nExtensions,\n traceparent: 00-c20cc6e17e7a90729c6fd3d1e7aa597f-e735b86db00e2646-00\nData,\n 44\n"
"Got an Event: Validation: valid\nContext Attributes,\n specversion: 1.0\n type: dev.knative.kafka.event\n source: /apis/v1/namespaces/test-alpha/kafkasources/track-offset-kafka-source#track-offset-topic\n subject: partition:0#44\n id: partition:0/offset:44\n time: 2020-09-02T05:01:15.278Z\nExtensions,\n traceparent: 00-51bdb9e09bcafb886ba7a300c4ee8a06-5969606607984953-00\nData,\n 45\n"
"Got an Event: Validation: valid\nContext Attributes,\n specversion: 1.0\n type: dev.knative.kafka.event\n source: /apis/v1/namespaces/test-alpha/kafkasources/track-offset-kafka-source#track-offset-topic\n subject: partition:0#45\n id: partition:0/offset:45\n time: 2020-09-02T05:01:18.114Z\nExtensions,\n traceparent: 00-d6f9d186c766f3d729c314302f5744ff-cb9c085f5e216d60-00\nData,\n 46\n"
"Got an Event: Validation: valid\nContext Attributes,\n specversion: 1.0\n type: dev.knative.kafka.event\n source: /apis/v1/namespaces/test-alpha/kafkasources/track-offset-kafka-source#track-offset-topic\n subject: partition:0#46\n id: partition:0/offset:46\n time: 2020-09-02T05:01:23.21Z\nExtensions,\n traceparent: 00-94ff66cbd43450a2f320b49c584e6298-3dd0b057b5aa906d-00\nData,\n 47\n"
"Got an Event: Validation: valid\nContext Attributes,\n specversion: 1.0\n type: dev.knative.kafka.event\n source: /apis/v1/namespaces/test-alpha/kafkasources/track-offset-kafka-source#track-offset-topic\n subject: partition:0#47\n id: partition:0/offset:47\n time: 2020-09-02T05:01:25.561Z\nExtensions,\n traceparent: 00-b0181fa778e42fb095c725631a0538ee-af0359500c34b47a-00\nData,\n 48\n"
"Got an Event: Validation: valid\nContext Attributes,\n specversion: 1.0\n type: dev.knative.kafka.event\n source: /apis/v1/namespaces/test-alpha/kafkasources/track-offset-kafka-source#track-offset-topic\n subject: partition:0#48\n id: partition:0/offset:48\n time: 2020-09-02T05:01:26.843Z\nExtensions,\n traceparent: 00-c4e6f46c03ac798c2def24b04fb5ea97-2137014963bdd787-00\nData,\n 49\n"
"Got an Event: Validation: valid\nContext Attributes,\n specversion: 1.0\n type: dev.knative.kafka.event\n source: /apis/v1/namespaces/test-alpha/kafkasources/track-offset-kafka-source#track-offset-topic\n subject: partition:0#49\n id: partition:0/offset:49\n time: 2020-09-02T05:01:27.608Z\nExtensions,\n traceparent: 00-85e017ae200f023edb9907841c682230-936aa941ba46fb94-00\nData,\n 50\n"

Consumer offsets topic details (apparently consumed without an offsets formatter, so the record payloads are binary; only fragments are readable):

[binary record] consumer ...
% Reached end of topic __consumer_offsets [29] at offset 4
[binary record] consumer range sarama-2d410a75-e5c5-4c57-aad0-270674152aff ... /10.244.0.0 ... track-offset-topic
% Reached end of topic __consumer_offsets [29] at offset 5
[binary record] consumer ...
% Reached end of topic __consumer_offsets [29] at offset 6
[binary record] consumer range sarama-7d3531dc-b4d4-4abb-959f-fadccc335d95 ... /10.244.0.0 ... track-offset-topic
% Reached end of topic __consumer_offsets [29] at offset 7
$ bin/kafka-run-class.sh kafka.admin.ConsumerGroupCommand --bootstrap-server  --group muru-track-offset  --describe

GROUP             TOPIC              PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG             CONSUMER-ID                                 HOST            CLIENT-ID
muru-track-offset track-offset-topic 0          -               50              -               sarama-7d3531dc-b4d4-4abb-959f-fadccc335d95 /10.244.0.0     sarama

@itsmurugappan
Contributor Author

itsmurugappan commented Sep 2, 2020

@slinkydeveloper as you can see above, CURRENT-OFFSET is always empty.

According to this issue, we have to upgrade to Sarama 1.27.0 and call session.Commit() after marking the message.

Kindly review and let me know your thoughts so I can send a PR for this.

I tested the above changes and they seem to work fine (see the sketch below).
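
For illustration, a minimal Sarama consumer-group handler with an explicit commit after marking each message might look like this (a sketch assuming Sarama v1.27.0+, where ConsumerGroupSession gained Commit(); the broker address, group, topic, and handler names are illustrative, taken loosely from this thread):

package main

import (
	"context"

	"github.com/Shopify/sarama"
)

// handler implements sarama.ConsumerGroupHandler.
type handler struct{}

func (handler) Setup(sarama.ConsumerGroupSession) error   { return nil }
func (handler) Cleanup(sarama.ConsumerGroupSession) error { return nil }

// ConsumeClaim processes each message and flushes its offset immediately.
func (handler) ConsumeClaim(session sarama.ConsumerGroupSession, claim sarama.ConsumerGroupClaim) error {
	for msg := range claim.Messages() {
		// ... dispatch the event to the sink here ...
		session.MarkMessage(msg, "") // record the offset in the in-memory offset manager
		session.Commit()             // flush the marked offset to Kafka right away (v1.27.0+)
	}
	return nil
}

func main() {
	config := sarama.NewConfig()
	config.Version = sarama.V2_0_0_0 // consumer groups require >= V0_10_2_0
	group, err := sarama.NewConsumerGroup([]string{"kafka:9092"}, "muru-track-offset", config)
	if err != nil {
		panic(err)
	}
	defer group.Close()
	for {
		// Consume returns on rebalance; loop to rejoin the group.
		if err := group.Consume(context.Background(), []string{"track-offset-topic"}, handler{}); err != nil {
			panic(err)
		}
	}
}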

@slinkydeveloper
Contributor

slinkydeveloper commented Sep 7, 2020

Let me try to bump the sarama version @itsmurugappan

@slinkydeveloper
Contributor

It seems like we're already on v1.27: https://github.com/knative/eventing-contrib/blob/master/go.mod#L6 — do you still see this problem with the master of this repo?

@slinkydeveloper
Contributor

This was updated by #1510

@itsmurugappan
Contributor Author

Yay! With the update, messages are not dropped when the receiver adapter crashes. Thank you.
Just to confirm: offsets are committed only when the adapter crashes. Is this the expected behaviour? For example, in the scenario below, 5 messages were not marked immediately, but after a crash they get synced up. Adding a manual commit after marking the message updates the offset immediately; do we need to add that?

$ bin/kafka-consumer-groups.sh --describe --group muru-track-offset-0906-1 --bootstrap-server kafka:9092

GROUP                    TOPIC                     PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG             CONSUMER-ID                                 HOST            CLIENT-ID
muru-track-offset-0906-1 track-offset-topic-0906-1 0          -               5               -               sarama-b7f69f57-71d1-4b6a-888f-4430aca0e812 /10.244.0.0     sarama

$ bin/kafka-consumer-groups.sh --describe --group muru-track-offset-0906-1 --bootstrap-server kafka:9092

GROUP                    TOPIC                     PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG             CONSUMER-ID     HOST            CLIENT-ID
muru-track-offset-0906-1 track-offset-topic-0906-1 0          6               6               0               -               -               -

@slinkydeveloper
Contributor

TBH I prefer to keep the current Sarama configuration; committing on every single message is expensive. I believe Sarama commits offsets at a predefined interval or batch size.
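
For reference, Sarama's offset auto-commit is interval-based and configurable (a sketch; as of Sarama v1.27 auto-commit is enabled by default with a 1s interval, and the 5s value below is only an example):

package main

import (
	"time"

	"github.com/Shopify/sarama"
)

func main() {
	config := sarama.NewConfig()
	// Auto-commit is enabled by default; marked offsets are flushed to
	// Kafka on a timer rather than per message.
	config.Consumer.Offsets.AutoCommit.Enable = true
	// Raising the interval means fewer commit requests, at the cost of a
	// larger window of redelivered messages after a crash.
	config.Consumer.Offsets.AutoCommit.Interval = 5 * time.Second
	_ = config // pass this config to sarama.NewConsumerGroup(...)
}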

@slinkydeveloper slinkydeveloper added this to the v0.18.0 milestone Sep 7, 2020
@slinkydeveloper
Contributor

Since this seems to be fixed on master, I'm closing this issue as solved. @itsmurugappan please let me know if you need anything else.

@itsmurugappan
Contributor Author

Makes sense, thank you @slinkydeveloper.
