Long running consumer will not give up partition ownership when new consumer instance comes online #22666

Closed
dhasek00 opened this issue Apr 2, 2024 · 4 comments
Labels: Client, customer-reported, Event Hubs, needs-author-feedback, no-recent-activity, question

dhasek00 commented Apr 2, 2024

Bug Report

  • Import Path: github.com/Azure/azure-sdk-for-go/sdk/messaging/azeventhubs
  • SDK version: v1.0.3
  • Go version: 1.20.2 and 1.22.1
  • Benthos version: 4.25.1
  • What happened?

I'm using azeventhubs to build a Benthos (benthos.dev) consumer plugin with checkpointing, following the example in example_consuming_with_checkpoints_test.go.
In testing with a simple event hub containing 2 partitions, running 2 instances fails to properly load balance when the first benthos consumer input processor has been running for a moderate amount of time.

For example, start consumer 1 and wait ~60 seconds. It should now be processing both partitions. Then start consumer 2. Consumer 1 will continue processing both partitions while consumer 2 will process a single partition, resulting in duplicate events.

However, if both consumers are started around the same time then load balancing seems to operate as normal. I can start/stop either one and they will drop partitions or assume them normally. The problem only occurs once a single consumer has been working on both partitions for some longer time.

I've tried both the balanced and the greedy load-balancing strategies.

  • What did you expect or want to happen?

No matter how long a consumer runs, I expect to be able to add or remove additional benthos consumers and see consistent load balancing.
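
To make the expected and observed states concrete, here is a minimal stdlib-only sketch. It does not call azeventhubs; `fairShare` and `duplicatedPartitions` are hypothetical helper names, and the partition ID assigned to consumer 2 is an assumption, since the report only says it picks up a single partition.

```go
package main

import (
	"fmt"
	"sort"
)

// fairShare returns the smallest and largest number of partitions a single
// consumer would hold if partitions were spread as evenly as possible.
func fairShare(partitions, consumers int) (lo, hi int) {
	if consumers <= 0 {
		return 0, 0
	}
	lo = partitions / consumers
	hi = lo
	if partitions%consumers != 0 {
		hi++
	}
	return lo, hi
}

// duplicatedPartitions returns, in sorted order, the partition IDs that more
// than one consumer reports it is processing. It assumes each consumer lists
// a given partition at most once.
func duplicatedPartitions(active map[string][]string) []string {
	counts := map[string]int{}
	for _, partitions := range active {
		for _, p := range partitions {
			counts[p]++
		}
	}
	var dups []string
	for p, n := range counts {
		if n > 1 {
			dups = append(dups, p)
		}
	}
	sort.Strings(dups)
	return dups
}

func main() {
	// Observed state from the report: consumer 1 keeps both partitions and
	// consumer 2 also processes one (which one is an assumption).
	active := map[string][]string{
		"consumer-1": {"0", "1"},
		"consumer-2": {"1"},
	}
	lo, hi := fairShare(2, len(active))
	fmt.Printf("expected partitions per consumer: %d to %d\n", lo, hi)
	fmt.Println("partitions processed by more than one consumer:", duplicatedPartitions(active))
}
```

With 2 partitions and 2 consumers the fair share is exactly one partition each, so any partition reported by both consumers indicates the duplicate processing described above.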

  • How can we reproduce it?

Try to run a normal consumer outside of benthos and see if results are as expected.
I'm not sure if it's because it's running as Benthos plugin, but all else behaves fine except for the balancing.

  • Anything we should know about your environment.
github-actions bot added the Client, customer-reported, Event Hubs, needs-team-attention, question, and Service Attention labels Apr 2, 2024

github-actions bot commented Apr 2, 2024

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @jfggdl.

jhendrixMSFT removed the Service Attention label Apr 8, 2024
github-actions bot added the needs-team-triage label Apr 8, 2024
jhendrixMSFT removed the needs-team-triage label Apr 8, 2024
@richardpark-msft
Member

Hi @dhasek00, I've tried running this same test a few different ways and everything appears to be working correctly.

One thing I'm curious about is if it's possible that the two Benthos instances are NOT using the same Azure Storage container. If they were not, it's possible to see some of the behavior you describe where the two partition processors act as if they are unaware of each other.
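
One way to rule this out is to print the settings each instance uses and compare them. The sketch below is only illustrative and uses the standard library alone: `storeScope` and `scopeMismatches` are hypothetical names, not part of azeventhubs, the URLs and names in `main` are placeholders, and comparing the values case-insensitively is an assumption. Besides the container, it also compares namespace, event hub, and consumer group, since two processors need to agree on all of these to see each other's ownership records.

```go
package main

import (
	"fmt"
	"strings"
)

// storeScope holds the settings that two processor instances must share in
// order to coordinate through the same checkpoint store. Hypothetical type.
type storeScope struct {
	ContainerURL  string // blob container used for checkpoints and ownership
	Namespace     string // fully qualified Event Hubs namespace
	EventHub      string
	ConsumerGroup string
}

// norm lowercases a value and trims surrounding whitespace and a trailing
// slash so that cosmetic differences are not reported as mismatches.
func norm(s string) string {
	return strings.TrimSuffix(strings.ToLower(strings.TrimSpace(s)), "/")
}

// scopeMismatches returns the names of the fields that differ between the
// two instances. An empty result means both point at the same scope.
func scopeMismatches(a, b storeScope) []string {
	var diff []string
	if norm(a.ContainerURL) != norm(b.ContainerURL) {
		diff = append(diff, "ContainerURL")
	}
	if norm(a.Namespace) != norm(b.Namespace) {
		diff = append(diff, "Namespace")
	}
	if norm(a.EventHub) != norm(b.EventHub) {
		diff = append(diff, "EventHub")
	}
	if norm(a.ConsumerGroup) != norm(b.ConsumerGroup) {
		diff = append(diff, "ConsumerGroup")
	}
	return diff
}

func main() {
	// Placeholder values; substitute what each Benthos instance is configured with.
	first := storeScope{
		ContainerURL:  "https://example.blob.core.windows.net/checkpoints-a",
		Namespace:     "example.servicebus.windows.net",
		EventHub:      "events",
		ConsumerGroup: "$Default",
	}
	second := first
	second.ContainerURL = "https://example.blob.core.windows.net/checkpoints-b"

	if diff := scopeMismatches(first, second); len(diff) > 0 {
		fmt.Println("instances will not see each other's ownership; differing fields:", diff)
	} else {
		fmt.Println("both instances share the same checkpoint scope")
	}
}
```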

richardpark-msft added the needs-author-feedback label Apr 16, 2024

Hi @dhasek00. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.

github-actions bot removed the needs-team-attention label Apr 16, 2024

Hi @dhasek00, we're sending this friendly reminder because we haven't heard back from you in 7 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!

github-actions bot added the no-recent-activity label Apr 23, 2024
github-actions bot closed this as not planned May 8, 2024

3 participants