This repository has been archived by the owner on Apr 1, 2024. It is now read-only.

ISSUE-13342: [BUG] Producer with name xxx is already connect to topic #3438

Open
sijie opened this issue Dec 15, 2021 · 1 comment

Comments

@sijie
Member

sijie commented Dec 15, 2021

Original Issue: apache#13342


Describe the bug
Master issue: apache#13061 apache/pulsar-client-go#676
Pulsar version: 2.8.1

Our investigation shows that this problem occurs when the ping/pong exchange between the client and the broker gradually drifts until the client decides that the connection is closed. The client's connection close operation then fails for network reasons, so the underlying network connection is never torn down and the Pulsar broker keeps waiting for its own ping/pong timeout. Meanwhile, the client has already reconnected with the same PartitionProducer over a new connection (from a different port) and sent AddProducer to the Pulsar broker.

apache#11804 rewrote the equals method of Producer. As a result, when a pulsar-client-go client reconnects from a different port, the old producer cannot be removed, because equals also compares the remoteAddress:

if (producers.remove(producer.getProducerName(), producer)) {

apache#12846 removes that equals method and uses hashCode for the comparison instead. In this case, too, the old producer cannot be removed.
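To illustrate why the remove call above can fail, here is a minimal sketch of how ConcurrentHashMap.remove(key, value) depends on equals. ProducerHandle and tryRemove are hypothetical stand-ins written for this example, not the real broker classes; the only assumption taken from the issue is that equals compares the remoteAddress.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

class EqualsRemoveSketch {
    // Hypothetical stand-in for the broker-side Producer; not the real Pulsar class.
    static final class ProducerHandle {
        final String producerName;
        final String remoteAddress; // host:port of the client connection

        ProducerHandle(String producerName, String remoteAddress) {
            this.producerName = producerName;
            this.remoteAddress = remoteAddress;
        }

        // Assumption from the issue: equals also checks the remote address.
        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof ProducerHandle)) {
                return false;
            }
            ProducerHandle other = (ProducerHandle) o;
            return producerName.equals(other.producerName)
                    && remoteAddress.equals(other.remoteAddress);
        }

        @Override
        public int hashCode() {
            return Objects.hash(producerName, remoteAddress);
        }
    }

    // Registers `registered`, then tries to remove it using `candidate` as the expected value.
    static boolean tryRemove(ProducerHandle registered, ProducerHandle candidate) {
        ConcurrentHashMap<String, ProducerHandle> producers = new ConcurrentHashMap<>();
        producers.put(registered.producerName, registered);
        // remove(key, value) only removes the entry if the mapped value equals `candidate`.
        return producers.remove(candidate.producerName, candidate);
    }

    public static void main(String[] args) {
        ProducerHandle old = new ProducerHandle("producer-a", "10.0.0.5:40001");
        ProducerHandle samePort = new ProducerHandle("producer-a", "10.0.0.5:40001");
        ProducerHandle newPort = new ProducerHandle("producer-a", "10.0.0.5:40002");

        System.out.println(tryRemove(old, samePort)); // true: same name and address
        System.out.println(tryRemove(old, newPort));  // false: the port differs, so equals fails
    }
}
```

In this sketch the entry registered from the old port stays in the map whenever the expected value carries a different port, which matches the behaviour described above.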

The problem resolves itself once the Pulsar broker detects the ping/pong timeout or a channel error: it then closes the connection and cleans up the producer state, and the next AddProducer from the client succeeds. Until that happens, however, whenever the client reconnects and sends AddProducer, the broker reports the error: Producer with name is already connect to topic.
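The window described above can be modelled with a small sketch. It assumes a simplified registry keyed by producer name; the class, its methods, the connection IDs, and the exact error text are hypothetical and only mimic the behaviour reported in this issue.

```java
import java.util.concurrent.ConcurrentHashMap;

class StaleProducerWindowSketch {
    // producerName -> id of the connection that registered it (hypothetical model).
    private final ConcurrentHashMap<String, Long> producers = new ConcurrentHashMap<>();

    // Returns null on success, or the error text the broker would report.
    String addProducer(String producerName, long connectionId) {
        Long existing = producers.putIfAbsent(producerName, connectionId);
        if (existing != null) {
            // The name is still held by an earlier registration.
            return "Producer with name '" + producerName + "' is already connected to topic";
        }
        return null;
    }

    // Called once the broker notices the connection is dead
    // (ping/pong timeout or channel error) and cleans up its producers.
    void connectionClosed(long connectionId) {
        producers.values().removeIf(id -> id == connectionId);
    }

    public static void main(String[] args) {
        StaleProducerWindowSketch topic = new StaleProducerWindowSketch();

        // Original connection registers the producer.
        System.out.println(topic.addProducer("producer-a", 1L)); // null (success)

        // The client reconnects on a new connection before the broker
        // has detected that connection 1 is dead.
        System.out.println(topic.addProducer("producer-a", 2L)); // error text

        // The broker detects the timeout and cleans up connection 1.
        topic.connectionClosed(1L);

        // The next attempt from the new connection succeeds.
        System.out.println(topic.addProducer("producer-a", 2L)); // null (success)
    }
}
```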

Therefore, I think the current protocol gives the broker no reliable way to tell whether a reconnecting producer is the same client and may overwrite its own earlier registration. It may be necessary to add some fields that let the client prove "I am me".
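One possible shape of such a field, purely as an illustration: the client sends a token that stays stable across reconnects, and the broker lets a registration be overwritten only when the token matches. The instanceId field and every name below are hypothetical and are not part of the current Pulsar protocol.

```java
import java.util.concurrent.ConcurrentHashMap;

class ProducerIdentitySketch {
    // Hypothetical registration record held by the broker.
    static final class Registration {
        final String instanceId;  // hypothetical client-generated token, stable across reconnects
        final long connectionId;  // changes on every reconnect

        Registration(String instanceId, long connectionId) {
            this.instanceId = instanceId;
            this.connectionId = connectionId;
        }
    }

    private final ConcurrentHashMap<String, Registration> producers = new ConcurrentHashMap<>();

    // Returns true if the incoming producer is registered after the call.
    boolean addProducer(String producerName, String instanceId, long connectionId) {
        Registration incoming = new Registration(instanceId, connectionId);
        // Overwrite only if the name is free or held by the same client instance.
        Registration result = producers.compute(producerName, (name, current) ->
                (current == null || current.instanceId.equals(instanceId)) ? incoming : current);
        return result == incoming;
    }

    // Connection currently holding the name; assumes the name is registered.
    long connectionOf(String producerName) {
        return producers.get(producerName).connectionId;
    }

    public static void main(String[] args) {
        ProducerIdentitySketch topic = new ProducerIdentitySketch();
        System.out.println(topic.addProducer("producer-a", "client-1", 1L)); // true: name was free
        System.out.println(topic.addProducer("producer-a", "client-1", 2L)); // true: same client, overwrites
        System.out.println(topic.addProducer("producer-a", "client-2", 3L)); // false: different client
        System.out.println(topic.connectionOf("producer-a"));                // 2
    }
}
```

With a token like this, the reconnecting client would not have to wait for the broker to time out the stale connection, while a different client using the same producer name would still be rejected.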

To Reproduce
Steps to reproduce the behavior:

  1. Set keepAliveIntervalSeconds=100 on the broker.
  2. Choose a Pulsar client in any language, such as pulsar-client-go or the Java client.
  3. Use the client to send data to the Pulsar broker.
  4. Use a firewall to block the network between the client and the broker for 60 seconds, then remove the firewall rule.
  5. Check the broker log. It now shows the error: Producer with name is already connect to topic

Expected behavior
A clear and concise description of what you expected to happen.

@sijie sijie added the type/bug label Dec 15, 2021
@github-actions

The issue had no activity for 30 days, mark with Stale label.
