[BUG] When reconnecting to the server, the old connection is not closed #676
Comments
The client does not provide the producer name; the Pulsar broker generates it and returns it to the client. Old connection address: ab.cd.22.174:44352, new connection address: ab.cd.22.174:49750, same producerName: pulsar_s1-8-346. When the new connection is established and the client asks the broker to add the producer, the old connection is still held open against the server, so the broker does not clear the producer cache associated with the old connection.
@wolfstudy @cckellogg @zymap PTAL, thanks. |
I'm trying to understand better what you are seeing. Is below what is happening? 1 - producer-1 with name |
@cckellogg The second point is a bit wrong. Because the client does not provide Because of this, at the first point, the client address is I don't understand the go language, but I roughly read the implementation of pulsar-client-go about connection. Although there is no evidence that the Because you are more familiar with this, can you do some investigation on it? Thanks. |
Are you able to reproduce this consistently? What are those steps? Is the goal of the application to create two producers ( |
@cckellogg We have encountered it twice. Each time the client ran normally for 4 to 5 days, and then network problems or late ping/pong responses between the client and the server forced a disconnect. During reconnection, the error saying that the producer is already connected to the topic often occurs.
Our program always runs normally for 4 to 5 days, but this error always occurs when the network drops and the client reconnects during that period. Therefore, pulsar-client-go must ensure that the old connection is closed first, and only then establish the new connection.
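To make the "close first, then connect" ordering concrete, here is a minimal sketch in Go. It is not the pulsar-client-go implementation: `reconnector`, `newPipeReconnector`, and the `dial` field are hypothetical names, and `net.Pipe` stands in for a real broker socket.

```go
package main

import (
	"fmt"
	"net"
	"sync"
)

// reconnector is a hypothetical holder for a single broker connection.
// It is NOT a pulsar-client-go type; it only illustrates the ordering.
type reconnector struct {
	mu   sync.Mutex
	conn net.Conn
	dial func() (net.Conn, error)
}

// reconnect closes the current connection (if any) before dialing a new one,
// so this holder never keeps two live connections at the same time.
func (r *reconnector) reconnect() error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.conn != nil {
		_ = r.conn.Close() // best effort: the socket may already be dead
		r.conn = nil
	}
	c, err := r.dial()
	if err != nil {
		return fmt.Errorf("dial after close: %w", err)
	}
	r.conn = c
	return nil
}

// newPipeReconnector builds a reconnector whose dial function returns one end
// of an in-memory pipe (an assumption made so the sketch runs without a broker).
func newPipeReconnector() *reconnector {
	return &reconnector{dial: func() (net.Conn, error) {
		client, _ := net.Pipe()
		return client, nil
	}}
}

func main() {
	r := newPipeReconnector()
	_ = r.reconnect()
	old := r.conn
	_ = r.reconnect()
	// The old connection was closed before the new one was dialed,
	// so writing to it must fail.
	_, err := old.Write([]byte("x"))
	fmt.Println("write on old connection fails:", err != nil)
}
```

Closing under the same lock that guards the dial is one simple way to rule out a window in which both connections exist; whether the broker has already noticed the close is a separate question.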
This problem is occurring irregularly in pulsar-client-go.
@thinker0 - which version of Pulsar are you using? This issue references using brokers running 2.8.1. A fix for this issue (or a similar one) was released in 2.8.2 and above. |
@thinker0 - thanks, also which version of go client are you using? |
We are still seeing this issue with broker 2.10.5 and Go client library 0.11.0. The problem is more evident when there are network issues. I wonder if reconnectToBroker() could wait longer when a ProducerBusy error is detected, since the broker needs time to clear the stale producer entry.
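The suggestion above could look roughly like the following sketch. It does not reproduce reconnectToBroker(); `errProducerBusy`, `retryOnBusy`, `nextDelay`, and `noSleep` are hypothetical names, and the delays are arbitrary values chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errProducerBusy is a hypothetical sentinel standing in for the broker's
// ProducerBusy response; it is not the client library's real error value.
var errProducerBusy = errors.New("producer busy")

// nextDelay doubles the current delay, capped at max.
func nextDelay(cur, max time.Duration) time.Duration {
	next := cur * 2
	if next > max {
		return max
	}
	return next
}

// noSleep is a stand-in for time.Sleep that returns immediately.
func noSleep(time.Duration) {}

// retryOnBusy calls attempt until it succeeds, fails with an error other than
// errProducerBusy, or maxAttempts is reached. Between busy failures it waits
// with exponential backoff, giving the broker time to drop the stale producer.
// It returns the number of attempts made and the last error.
func retryOnBusy(attempt func() error, initial, max time.Duration,
	maxAttempts int, sleep func(time.Duration)) (int, error) {
	delay := initial
	var err error
	for i := 1; i <= maxAttempts; i++ {
		err = attempt()
		if err == nil {
			return i, nil
		}
		if !errors.Is(err, errProducerBusy) {
			return i, err // other errors are left to the caller's policy
		}
		if i < maxAttempts {
			sleep(delay)
			delay = nextDelay(delay, max)
		}
	}
	return maxAttempts, err
}

func main() {
	calls := 0
	attempt := func() error {
		calls++
		if calls < 3 {
			return errProducerBusy // broker still holds the stale entry
		}
		return nil
	}
	n, err := retryOnBusy(attempt, 100*time.Millisecond, time.Second, 5, time.Sleep)
	fmt.Println(n, err) // prints: 3 <nil>
}
```

Backoff only masks the symptom: if the old connection is never closed, the stale entry may never be cleared and every retry will fail the same way.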
Expected behavior
When reconnecting to the server, the old connection should be closed.
Actual behavior
When reconnecting to the server, the old connection is not closed.
The following log progresses according to the timeline:
First, there is an active channel and producer that is writing data to the broker
In the case that the previous connection has not been disconnected, the client sends an add producer operation for the same producerName
This addition operation comes from the same ip of the client, a different port connection, and the producerName is equal to the previous one
The broker then tries to remove the producer in ServerCnx. However, the PR for "Producer getting producer busy is removing existing producer from list" (pulsar#11804) modified the equals method of Producer, with the result that the producer cannot be removed from the topic's producers map.
Then the old connection starts to disconnect, and the broker clears the producer state associated with that connection.
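The timeline above can be replayed with a toy model. This is only a sketch of the described behavior, not broker code: `registry`, `add`, and `closeConn` are hypothetical, and the model assumes the broker keeps one owning connection per producer name and rejects the same name from a different connection.

```go
package main

import "fmt"

// registry is a toy stand-in for a broker-side producer table:
// producer name -> address of the connection that owns it.
type registry struct {
	owners map[string]string
}

func newRegistry() *registry {
	return &registry{owners: make(map[string]string)}
}

// add registers a producer name for a connection. It fails if the name is
// still owned by a different connection (the "already connected" case).
func (r *registry) add(name, conn string) error {
	if owner, ok := r.owners[name]; ok && owner != conn {
		return fmt.Errorf("producer %q is already connected via %s", name, owner)
	}
	r.owners[name] = conn
	return nil
}

// closeConn drops every producer owned by the closed connection.
func (r *registry) closeConn(conn string) {
	for name, owner := range r.owners {
		if owner == conn {
			delete(r.owners, name)
		}
	}
}

func main() {
	const name = "pulsar_s1-8-346"
	const oldConn, newConn = "ab.cd.22.174:44352", "ab.cd.22.174:49750"

	r := newRegistry()
	fmt.Println(r.add(name, oldConn)) // old connection registers the producer
	fmt.Println(r.add(name, newConn)) // rejected: old connection still open
	r.closeConn(oldConn)              // old connection finally disconnects
	fmt.Println(r.add(name, newConn)) // now the same name is accepted
}
```

In this model the second `add` can only succeed after `closeConn` has run for the old address, which is why the order of close and reconnect on the client side matters.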
Steps to reproduce
System configuration
Pulsar version: x.y
pulsar-client-go: 0.7.0
pulsar broker: 2.8.1