Consider ILLEGAL_GENERATION error as rebalancing error #1474

Merged (1 commit) on Feb 27, 2023

@jakewins (Contributor) commented Nov 2, 2022

This error indicates that the consumer is trying to commit offsets, but the consumer group has moved to a new generation.

Retrying within the existing session will indeed not work, but rejoining the group and then retrying should succeed.

Fixes #1466
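
The idea described above can be sketched as follows. This is a minimal illustration, not the actual KafkaJS source or the diff in this PR: `REJOIN_ERROR_TYPES`, `isRebalancing` and `runWithRejoin` are hypothetical names, and the two error types listed alongside `ILLEGAL_GENERATION` are assumed examples of other errors that call for a rejoin.

```javascript
// Sketch only: hypothetical helpers, not the actual KafkaJS implementation.
// Error types after which the member should rejoin the group instead of
// retrying within the current session. The first two entries are assumptions
// added for illustration; ILLEGAL_GENERATION is the case discussed here.
const REJOIN_ERROR_TYPES = new Set([
  'REBALANCE_IN_PROGRESS',
  'NOT_COORDINATOR_FOR_GROUP',
  'ILLEGAL_GENERATION',
])

const isRebalancing = (error) =>
  Boolean(error) && REJOIN_ERROR_TYPES.has(error.type)

// Runs `work` (for example: fetch, process, commit offsets). On a rebalancing
// error it calls `rejoin` and tries again, at most `maxRejoins` times.
// Any other error is rethrown unchanged.
async function runWithRejoin(work, rejoin, maxRejoins = 3) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await work()
    } catch (error) {
      if (!isRebalancing(error) || attempt >= maxRejoins) throw error
      await rejoin()
    }
  }
}
```

The point of the classification is that a commit rejected with a stale generation id is treated like any other rebalance signal: the member rejoins, picks up the new generation, and carries on, rather than surfacing a non-retriable crash.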

@h0od commented Nov 11, 2022

I just encountered this error, which causes my consumer to end up in a crashed state (which sends a shutdown signal to the process).
So as the PR suggests, a simple rejoin should be sufficient in this case.

@rpastore-wolt commented Nov 28, 2022

We are also facing this error. @jakewins do you know if your PR is going to be merged, or should we handle the error a different way?

thanks 🙏

@kevinhering

We're also watching this one closely. We have quite a number of Kafka consumers & we recently had to create custom crash listeners to deal with just this case.

@rpastore-wolt

> We're also watching this one closely. We have quite a number of Kafka consumers & we recently had to create custom crash listeners to deal with just this case.

Hey @kevinhering :) thanks!
Could you share your solution with me?

@ErlendFax

Any reason this is not merged?

@Nevon, sorry to tag you. If you have time, please consider taking a look at this and #1466.

@kevinhering commented Dec 12, 2022

@rpastore-wolt I apologize for the delay. We've implemented a crash listener in our consuming code to listen for this particular event (see the docs on Instrumenting Events for some info about event listeners & the crash event). Here's the basic idea:

const { CRASH } = consumer.events
consumer.on(CRASH, (event) => handleCrashEvent(event)) // listen for crash event

In the crash event handler, look for an event with a payload similar to this (pay particular attention to the values of error.retriable and error.cause.type):

{
  "topic": "topic-name",
  "error": {
    "name": "KafkaJSNonRetriableError",
    "retriable": false,
    "cause": {
      "name": "KafkaJSProtocolError",
      "retriable": false,
      "type": "ILLEGAL_GENERATION",
      "code": 22
    }
  }
}

You can add whatever code you need in order to alert/restart the consumer from here.

Hope that helps. (and hope we won't need this soon!)
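
The crash-listener approach described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the commenter's actual code: `isIllegalGenerationCrash` and `watchForIllegalGeneration` are hypothetical helpers, the listener is assumed to receive either the object shown above or an event that wraps it in a `payload` field, and what happens on a match (alerting, restarting the consumer) is left to the caller.

```javascript
// Sketch only: hypothetical helpers, not part of KafkaJS.

// True when the crash error, or the error it wraps in `cause`, is
// ILLEGAL_GENERATION. Accepts either the full event (data under `payload`)
// or the payload-shaped object itself.
function isIllegalGenerationCrash(event) {
  const data = (event && event.payload) || event || {}
  const error = data.error || {}
  const cause = error.cause || {}
  return error.type === 'ILLEGAL_GENERATION' || cause.type === 'ILLEGAL_GENERATION'
}

// Subscribes to the consumer's crash event and calls `onIllegalGeneration`
// (caller-supplied alert or restart logic) for matching crashes only.
// Returns whatever `consumer.on` returns, so the caller can keep it to
// unsubscribe if the client provides that.
function watchForIllegalGeneration(consumer, onIllegalGeneration) {
  const { CRASH } = consumer.events
  return consumer.on(CRASH, (event) => {
    if (isIllegalGenerationCrash(event)) onIllegalGeneration(event)
  })
}
```

A caller would pass its consumer instance and a callback, for example `watchForIllegalGeneration(consumer, () => restartConsumer())`, where `restartConsumer` stands in for whatever recovery the application needs.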

@patrykwegrzyn (Contributor) commented Jan 6, 2023

ILLEGAL_GENERATION has become a pain for us recently. Any idea whether this will be merged soon, or whether it is going to be merged at all?

@robinbijlani

Also been watching this PR for months, very eager to see it hopefully merged in. 🤞 🤞

@jgoldsmith613

We are waiting on this too. Any timeline on a fix?
This is straight from Confluent support:
"To reduce impact for both cases above, the client will need to be designed to retry its connection to Kafka if the ILLEGAL_GENERATION exception is encountered."

thedustinsmith added a commit to freightview/kafkajs that referenced this pull request Jan 18, 2023

@josh-cain

We're hitting this too, and it's causing a bit of toil. Would really appreciate a merge 🙏 !

@ecalvert

We're hitting this as well. It causes major issues when Confluent Cloud performs its cluster rolls: we see consumer crashes every 5-10 minutes over a 2-hour span.

@srMarquinho

Just checking this issue is active and being looked after. This is exactly our problem and it is absolutely major to us. Thank you all. 🙌

This error is indicating that the consumer is trying to commit
offsets, but the consumer group has changed to a new generation.

Retrying within the existing session will indeed not work, but
rejoining the group and re-trying should be successful.

Fixes tulios#1009

@Nevon merged commit ae87309 into tulios:master Feb 27, 2023

@Nevon (Collaborator) commented Mar 1, 2023

This bug has been resolved in v2.2.4. Thanks @jakewins for the fix!

Successfully merging this pull request may close these issues.

"Specified group generation id is not valid" after broker maintenance, consumer stops receiving events