This repository has been archived by the owner on Apr 1, 2024. It is now read-only.

ISSUE-15885: [Discuss] Pulsar client : Connect Command add keep_alive_interval #4312

Open
sijie opened this issue Jun 2, 2022 · 0 comments

sijie commented Jun 2, 2022

Original Issue: apache#15885


Motivation

While investigating apache#13342, we found that both the client and the broker have a keepAliveIntervalSeconds configuration, which defaults to 30s. Each side sends a ping command at this interval to check that the connection is still alive. If the corresponding pong is not received within the interval, the channel is closed. On the client side, closing the channel triggers the reconnect logic. On the broker side, the broker clears the producer information once the channel becomes inactive.
In apache#13342, the user had changed the broker-side value to 100s. When the client decides that the connection has timed out and closes the channel, the close may never reach the broker because the traffic passes through a firewall. The client then reconnects to the broker, and the reconnection succeeds. Because the broker's timeout is larger, the broker has not yet closed the previous channel, so the producer information has not been cleared. When the producer reconnects, the broker throws an exception saying that the producer already exists.
This is the cause of apache#13342, and the issue can be reproduced by tweaking the code.
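
To make the timing concrete, here is a minimal toy model of the scenario above. It is not Pulsar code: the class StaleProducerSketch and all of its methods are hypothetical, time is a plain counter in seconds, and the keep-alive check is simplified to "a connection expires once it has been silent for one full broker-side interval".

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Toy broker that only forgets a producer once it considers the owning connection dead. */
final class StaleProducerSketch {
    private final long brokerKeepAliveSeconds;
    // connection id -> time (seconds) the broker last heard from that connection
    private final Map<Integer, Long> lastSeen = new HashMap<>();
    // producer name -> id of the connection that owns it
    private final Map<String, Integer> producers = new HashMap<>();

    StaleProducerSketch(long brokerKeepAliveSeconds) {
        this.brokerKeepAliveSeconds = brokerKeepAliveSeconds;
    }

    void connect(int connectionId, long now) {
        lastSeen.put(connectionId, now);
    }

    /** Drop connections silent for a full broker-side interval, together with their producers. */
    void expireIdleConnections(long now) {
        Iterator<Map.Entry<Integer, Long>> it = lastSeen.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Integer, Long> e = it.next();
            if (now - e.getValue() >= brokerKeepAliveSeconds) {
                producers.values().removeIf(owner -> owner.equals(e.getKey()));
                it.remove();
            }
        }
    }

    /** Returns false when the name is still held by a different connection. */
    boolean registerProducer(String name, int connectionId, long now) {
        expireIdleConnections(now);
        Integer owner = producers.putIfAbsent(name, connectionId);
        return owner == null || owner == connectionId;
    }

    /** The client gives up on connection 1 after its own interval and reconnects as connection 2. */
    static boolean reconnectSucceeds(long clientKeepAliveSeconds, long brokerKeepAliveSeconds) {
        StaleProducerSketch broker = new StaleProducerSketch(brokerKeepAliveSeconds);
        broker.connect(1, 0);
        broker.registerProducer("producer-a", 1, 0);
        // Assumption: the close of connection 1 is lost in the firewall, so the broker never sees it.
        long reconnectAt = clientKeepAliveSeconds;
        broker.connect(2, reconnectAt);
        return broker.registerProducer("producer-a", 2, reconnectAt);
    }
}
```

In this model, reconnectSucceeds(30, 100) returns false, which corresponds to the "producer already exists" error, while reconnectSucceeds(30, 30) returns true because the broker has already expired the old connection by the time the client comes back.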

What I want to discuss is whether we can improve this by configuring the value only on the client side and passing it to the broker in the connect command. The advantage is that the broker-side configuration can be removed, with the broker using the client's value instead, and different clients can configure different values.

API Changes

Add keep_alive_interval in CommandConnect:

message CommandConnect {
     ...
     optional int32 keep_alive_interval = 11 [default = 30];
}

The existing keep-alive check lives in PulsarHandler#handleKeepAliveTimeout and is started when the channel becomes active. With this change, it would be started later, once the connect handshake has carried the interval.

For broker-side:

  • The check is started in ServerCnx#completeConnect

For client-side:

  • The check is started in ClientCnx#handleConnected
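
As a rough illustration of the broker-side half of this change, the sketch below defers the keep-alive start until the connect command has been processed and falls back to 30s when the client did not send the field. The class ConnectKeepAliveSketch is hypothetical and only mirrors the method name from the proposal; it does not use Pulsar's real ServerCnx or generated protobuf classes, and it does not decide how non-positive values should be handled.

```java
import java.util.OptionalInt;

/** Hypothetical stand-in for a broker connection handler; not Pulsar's ServerCnx. */
final class ConnectKeepAliveSketch {
    // Mirrors [default = 30] on the proposed keep_alive_interval field.
    static final int DEFAULT_KEEP_ALIVE_SECONDS = 30;

    // -1 means the keep-alive task has not been started yet.
    private int keepAliveSeconds = -1;

    /**
     * Called once the connect command has been accepted. The interval comes from
     * the client; an absent field (for example, an older client) yields the default.
     */
    void completeConnect(OptionalInt keepAliveFromConnect) {
        keepAliveSeconds = keepAliveFromConnect.orElse(DEFAULT_KEEP_ALIVE_SECONDS);
        // A real handler would schedule its periodic ping/pong check here
        // instead of doing so when the channel becomes active.
    }

    boolean keepAliveStarted() {
        return keepAliveSeconds > 0;
    }

    int keepAliveSeconds() {
        return keepAliveSeconds;
    }
}
```

The point of the design is that both ends of one connection run the check with the same interval, so the broker cannot hold on to a channel for longer than the client that opened it.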

Compatibility

No compatibility issues.

@sijie sijie added the PIP label Jun 2, 2022
@sijie sijie changed the title ISSUE-15885: PIP-163: Pulsar client : Connect Command add keep_alive_interval ISSUE-15885: [Discuss] Pulsar client : Connect Command add keep_alive_interval Jul 5, 2022
@sijie sijie added the Stale label Jul 5, 2022