retry producer creation upon error after successful topic lookup #1138

Open
zzzming opened this issue Nov 24, 2023 · 0 comments · May be fixed by #1139

zzzming commented Nov 24, 2023

Expected behavior

newPartitionProducer() should retry grabCnx() when it fails, similar to the grabCnx() retry logic already in reconnectToBroker().

The Java producer already has this retry logic.

Actual behavior

During producer creation, grabCnx() in producer_partition.go first performs the topic lookup. If the lookup succeeds but a network issue occurs before the command to create the producer is sent, grabCnx() exits with an error and is not retried.

We have seen frequent failures during initial producer creation.

Steps to reproduce

This is tricky to reproduce, but we observe the problem more frequently during the initialization stage of pods on Azure. After we implemented the grabCnx() retry in newPartitionProducer(), the problem went away. (A PR will follow.)

System configuration

Pulsar version: 2.10
