Long running consumer will not give up partition ownership when new consumer instance comes online #22666
Labels: Client, customer-reported, Event Hubs, needs-author-feedback, no-recent-activity, question
Bug Report
I'm using azeventhubs to build a Benthos (benthos.dev) consumer plugin with checkpointing, following the example in example_consuming_with_checkpoints_test.go.
In testing with a simple event hub containing 2 partitions, running 2 instances fails to load balance properly when the first Benthos consumer input processor has already been running for a moderate amount of time.
For example, start consumer 1 and wait ~60 seconds. It should now be processing both partitions. Then start consumer 2. Consumer 1 will continue processing both partitions while consumer 2 will process a single partition, resulting in duplicate events.
However, if both consumers are started around the same time then load balancing seems to operate as normal. I can start/stop either one and they will drop partitions or assume them normally. The problem only occurs once a single consumer has been working on both partitions for some longer time.
I've tried both the balanced and the greedy load-balancing strategies, and the behavior is the same with each.
No matter how long a consumer runs, I expect to be able to add or remove additional benthos consumers and see consistent load balancing.
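To make that expectation concrete, here is a minimal sketch in plain Go. It is not the azeventhubs load-balancing code: `fairShare` and `contested` are hypothetical helpers written only for illustration, and the consumer names and partition IDs are assumptions (the report does not say which partition consumer 2 picked up). It computes how many partitions each consumer would own under an even split, and lists the partitions currently claimed by more than one consumer, which is where the duplicate events come from.

```go
package main

import (
	"fmt"
	"sort"
)

// fairShare returns the smallest and largest number of partitions a single
// consumer would own if partitions were spread as evenly as possible.
// Hypothetical helper, not part of azeventhubs.
func fairShare(partitions, consumers int) (lo, hi int) {
	if consumers <= 0 {
		return 0, 0
	}
	lo = partitions / consumers
	hi = lo
	if partitions%consumers != 0 {
		hi++
	}
	return lo, hi
}

// contested returns, in sorted order, the partition IDs that appear in the
// ownership list of more than one consumer. Hypothetical helper.
func contested(owned map[string][]string) []string {
	counts := map[string]int{}
	for _, parts := range owned {
		for _, p := range parts {
			counts[p]++
		}
	}
	var out []string
	for p, n := range counts {
		if n > 1 {
			out = append(out, p)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Assumed snapshot of the state described above: consumer 1 still
	// processes both partitions, consumer 2 processes one of them.
	owned := map[string][]string{
		"consumer-1": {"0", "1"},
		"consumer-2": {"1"},
	}

	lo, hi := fairShare(2, len(owned))
	fmt.Printf("each consumer should own between %d and %d partition(s)\n", lo, hi)
	fmt.Println("partitions processed by more than one consumer:", contested(owned))
}
```

With 2 partitions and 2 consumers the even split is one partition each, so any partition reported by `contested` indicates that the long-running consumer has not released ownership.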
Try running a normal consumer outside of Benthos and see whether the results are as expected.
I'm not sure whether it's because it's running as a Benthos plugin, but everything else behaves fine except for the balancing.