
[improve][client] Add maxConnectionsPerHost and connectionMaxIdleSeconds to PulsarAdminBuilder #22541

Open · wants to merge 4 commits into master
Conversation

@lhotari (Member) commented Apr 19, 2024

Fixes #22041

Motivation

See #22041. Currently, when using the asynchronous interfaces of the Pulsar Admin client, there's no backpressure in the client itself, and the client will keep opening new connections to the broker to fulfill in-progress requests.
Eventually, the broker will hit the maxHttpServerConnections limit, which is 2048.

It's better to limit the number of connections from a single client. This PR sets the limit to 16 connections per host.
The limit isn't called connectionsPerBroker since admin operations usually target a cluster address.

Modification

  • add maxConnectionsPerHost and connectionMaxIdleSeconds to PulsarAdminBuilder (see the usage sketch after this list)
  • also change the default connectionMaxIdleSeconds from 60 seconds to 25 seconds
    • some firewall/NAT idle timeouts are as low as 30 seconds, and since the idle check isn't exact, 25 seconds is a safer default that keeps connections from being dropped by firewall/NAT timeouts
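
For illustration, a minimal usage sketch of the new options, assuming the PulsarAdminBuilder method names mirror the setting names in the PR title; the service URL and values are illustrative only:

import org.apache.pulsar.client.admin.PulsarAdmin;
import org.apache.pulsar.client.api.PulsarClientException;

public class AdminClientExample {
    public static PulsarAdmin buildAdmin() throws PulsarClientException {
        // Sketch only: values are illustrative, not prescriptive.
        return PulsarAdmin.builder()
                .serviceHttpUrl("http://broker.example.com:8080")
                .maxConnectionsPerHost(16)       // limit concurrent connections to a single host
                .connectionMaxIdleSeconds(25)    // close pooled connections idle longer than this
                .build();
    }
}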

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

@lhotari lhotari added this to the 3.3.0 milestone Apr 19, 2024
@lhotari lhotari self-assigned this Apr 19, 2024
@lhotari lhotari marked this pull request as draft April 19, 2024 13:17
@github-actions bot added the doc label (Your PR contains doc changes, no matter whether the changes are in markdown or code files) Apr 19, 2024
@merlimat (Contributor) left a comment:

LGTM, just a couple of minor suggestions

@@ -47,6 +47,7 @@ public PulsarAdmin build() throws PulsarClientException {

     public PulsarAdminBuilderImpl() {
         this.conf = new ClientConfigurationData();
+        this.conf.setConnectionsPerBroker(16);
A Contributor commented:

Couldn't the default be part of ClientConfigurationData constructor?

@lhotari (Member, Author) replied:

> Couldn't the default be part of ClientConfigurationData constructor?

Not really. ClientConfigurationData is designed for PulsarClient, but it's also used in the PulsarAdmin client. The current PulsarAdminBuilderImpl is a bit of a hack around ClientConfigurationData.

@lhotari lhotari changed the title [improve][client] Add connectionsPerHost and connectionMaxIdleSeconds to PulsarAdminBuilder [improve][client] Add maxConnectionsPerHost and connectionMaxIdleSeconds to PulsarAdminBuilder Apr 22, 2024
@merlimat merlimat marked this pull request as ready for review April 22, 2024 15:09
@lhotari (Member, Author) commented Apr 22, 2024

The setMaxConnectionsPerHost setting in the async HTTP client doesn't seem to behave as expected. Will check the errors:

  Caused by: java.util.concurrent.CompletionException: org.apache.pulsar.client.admin.internal.http.AsyncHttpConnector$RetryException: Could not complete the operation. Number of retries has been exhausted. Failed reason: Too many connections: 16
  	at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:332)
  	at java.base/java.util.concurrent.CompletableFuture.uniApplyNow(CompletableFuture.java:674)
  	at java.base/java.util.concurrent.CompletableFuture.orApplyStage(CompletableFuture.java:1601)
  	at java.base/java.util.concurrent.CompletableFuture.applyToEither(CompletableFuture.java:2261)
  	at org.apache.pulsar.client.admin.internal.http.AsyncHttpConnector.retryOrTimeOut(AsyncHttpConnector.java:275)
  	at org.apache.pulsar.client.admin.internal.http.AsyncHttpConnector.apply(AsyncHttpConnector.java:234)

UPDATE: it's necessary to set the acquireFreeChannelTimeout setting in AHC (AsyncHttpClient). Will find a way to set a proper default and make it configurable.
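
For context, a minimal sketch of the AsyncHttpClient configuration knobs involved here; the values are illustrative only and not the defaults this PR settles on, and the setter names are assumed to follow AHC's config builder naming:

import org.asynchttpclient.AsyncHttpClient;
import org.asynchttpclient.DefaultAsyncHttpClientConfig;
import org.asynchttpclient.Dsl;

public class AhcConfigExample {
    public static AsyncHttpClient buildClient() {
        DefaultAsyncHttpClientConfig config = new DefaultAsyncHttpClientConfig.Builder()
                .setMaxConnectionsPerHost(16)            // cap concurrent connections to a single host
                .setPooledConnectionIdleTimeout(25_000)  // ms; close pooled connections idle longer than this
                .setAcquireFreeChannelTimeout(60_000)    // ms; wait for a free pooled channel instead of failing fast
                .build();
        return Dsl.asyncHttpClient(config);
    }
}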

@lhotari (Member, Author) commented Apr 23, 2024

There's a problem with backpressure handling for async requests in the Pulsar code base. Since this PR limits the Pulsar Admin client to 16 connections per host, it now surfaces these problems.

Namespace unloading is a good example:

final List<CompletableFuture<Void>> futures = new ArrayList<>();
List<String> boundaries = policies.bundles.getBoundaries();
for (int i = 0; i < boundaries.size() - 1; i++) {
    String bundle = String.format("%s_%s", boundaries.get(i), boundaries.get(i + 1));
    try {
        futures.add(pulsar().getAdminClient().namespaces().unloadNamespaceBundleAsync(
                namespaceName.toString(), bundle));
    } catch (PulsarServerException e) {
        log.error("[{}] Failed to unload namespace {}", clientAppId(), namespaceName, e);
        throw new RestException(e);
    }
}
return FutureUtil.waitForAll(futures);

All bundles in the namespace are unloaded at once without limiting concurrency.
There was a dev mailing list discussion about backpressure and the Pulsar Admin API implementation in https://lists.apache.org/thread/03w6x9zsgx11mqcp5m4k4n27cyqmp271, but it didn't end up resolving the problem.

a snippet from my email in that thread:

Touching upon the Pulsar Admin REST API: the context for backpressure is a lot different there. Before the PIP-149 async changes, there was explicit backpressure in the REST API implementation. The system could handle a limited amount of work, and it would process downstream work items one-by-one.

With "PIP-149 Making the REST Admin API fully async" (#14365), there are different challenges related to backpressure. It is usually about how to limit the in-progress work in the system. An async system will accept a lot of work compared to the previous solution, and this accepted work will eventually get processed in the async REST API backend even when the clients have already closed the connection and sent a new retry. One possible solution to this issue is to limit incoming requests at the HTTP server level with the features that Jetty provides for limiting concurrency. PRs #14353 and #15637 added this support to Pulsar, though the values might have to be tuned to much lower values to prevent issues in practice.

This is not a complete solution for the REST API backend. It would also be useful to have a solution that cancels downstream requests belonging to incoming HTTP requests that no longer exist because the client stopped waiting for the response. The main downstream requests are towards the metadata store. It might also be necessary to limit the number of outstanding downstream requests, although with batching in the metadata store that might not be an issue.

The solution for the namespace unloading issue is to have a way to limit the outstanding CompletableFutures that are in progress and use that as a way to "backpressure" the sending of new requests. The current approach of sending out all requests and then waiting for the results is problematic since it doesn't use any feedback from the system to adjust the rate. In other words, there's currently no proper backpressure solution for async Pulsar Admin calls within the Pulsar broker.

I'll experiment with some ways to add backpressure to cases where a large number of async calls are triggered and then the results are waited on.
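
As a minimal sketch (not what this PR implements), a Semaphore can bound the number of in-flight futures; submitBounded and maxInFlight are illustrative names, and the task suppliers stand in for any async admin calls:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

public class BoundedSubmitter {
    // Submits tasks while keeping at most 'maxInFlight' futures outstanding.
    // Blocks the submitting thread when the limit is reached, which provides the feedback loop.
    public static <T> List<CompletableFuture<T>> submitBounded(
            List<Supplier<CompletableFuture<T>>> tasks, int maxInFlight) throws InterruptedException {
        Semaphore permits = new Semaphore(maxInFlight);
        List<CompletableFuture<T>> results = new ArrayList<>(tasks.size());
        for (Supplier<CompletableFuture<T>> task : tasks) {
            permits.acquire();                                  // wait for a free slot
            CompletableFuture<T> future = task.get();           // start the async call
            future.whenComplete((result, error) -> permits.release());
            results.add(future);
        }
        return results;
    }
}

Blocking the submitting thread isn't ideal inside the broker's async code paths, which is one reason a fully non-blocking approach (such as the ConcurrencyReducer mentioned later in this thread) may be preferable.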

@lhotari (Member, Author) commented Apr 23, 2024

Another location without proper backpressure is namespace deletion:

return markDeleteFuture.thenCompose(__ ->
                internalDeleteTopicsAsync(allUserCreatedTopics))
        .thenCompose(ignore ->
                internalDeletePartitionedTopicsAsync(allUserCreatedPartitionTopics))
        .thenCompose(ignore ->
                internalDeleteTopicsAsync(allSystemTopics))
        .thenCompose(ignore ->
                internalDeletePartitionedTopicsAsync(allPartitionedSystemTopics))
        .thenCompose(ignore ->
                internalDeleteTopicsAsync(topicPolicy))
        .thenCompose(ignore ->
                internalDeletePartitionedTopicsAsync(partitionedTopicPolicy));

@lhotari (Member, Author) commented Apr 23, 2024

An example of creating partitions:

protected CompletableFuture<Void> tryCreatePartitionsAsync(int numPartitions) {
    if (!topicName.isPersistent()) {
        return CompletableFuture.completedFuture(null);
    }
    List<CompletableFuture<Void>> futures = new ArrayList<>(numPartitions);
    for (int i = 0; i < numPartitions; i++) {
        futures.add(tryCreatePartitionAsync(i));
    }
    return FutureUtil.waitForAll(futures);
}

This would need backpressure too. Say you create a topic with 100 partitions: the broker might open 100 HTTP connections to create the topic partitions concurrently. This is problematic when the brokers are under heavy load.
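
One hypothetical way to bound this (a sketch only, not what this PR implements) is to create the partitions in fixed-size batches, waiting for each batch before submitting the next; createPartitionsBatched and batchSize are illustrative names, while tryCreatePartitionAsync and FutureUtil.waitForAll are the existing calls from the snippet above:

// Sketch: caps concurrent partition-creation requests at 'batchSize'.
protected CompletableFuture<Void> createPartitionsBatched(int numPartitions, int batchSize) {
    CompletableFuture<Void> chain = CompletableFuture.completedFuture(null);
    for (int start = 0; start < numPartitions; start += batchSize) {
        final int from = start;
        final int to = Math.min(start + batchSize, numPartitions);
        // Each batch only starts after the previous batch has completed.
        chain = chain.thenCompose(ignore -> {
            List<CompletableFuture<Void>> batch = new ArrayList<>();
            for (int i = from; i < to; i++) {
                batch.add(tryCreatePartitionAsync(i));
            }
            return FutureUtil.waitForAll(batch);
        });
    }
    return chain;
}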

@lhotari (Member, Author) commented Apr 23, 2024

Noticed that there's an existing solution for running calls one-by-one, using the following Sequencer:

@ThreadSafe
public static class Sequencer<T> {
    private CompletableFuture<T> sequencerFuture = CompletableFuture.completedFuture(null);
    private final boolean allowExceptionBreakChain;

    public Sequencer(boolean allowExceptionBreakChain) {
        this.allowExceptionBreakChain = allowExceptionBreakChain;
    }

    public static <T> Sequencer<T> create(boolean allowExceptionBreakChain) {
        return new Sequencer<>(allowExceptionBreakChain);
    }

    public static <T> Sequencer<T> create() {
        return new Sequencer<>(false);
    }

    /**
     * @throws NullPointerException NPE when param is null
     */
    public synchronized CompletableFuture<T> sequential(Supplier<CompletableFuture<T>> newTask) {
        Objects.requireNonNull(newTask);
        if (sequencerFuture.isDone()) {
            if (sequencerFuture.isCompletedExceptionally() && allowExceptionBreakChain) {
                return sequencerFuture;
            }
            return sequencerFuture = newTask.get();
        }
        return sequencerFuture = allowExceptionBreakChain
                ? sequencerFuture.thenCompose(__ -> newTask.get())
                : sequencerFuture.exceptionally(ex -> null).thenCompose(__ -> newTask.get());
    }
}
However, I think that ConcurrencyReducer would be a better solution for most use cases.
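
For illustration, a sketch of the ConcurrencyReducer idea, assuming an API along the lines of Spotify's completable-futures ConcurrencyReducer (a create(maxConcurrency, maxQueueSize) factory and an add() method); admin, namespace, and bundle are placeholders, and the details may differ from whatever Pulsar ends up using:

import com.spotify.futures.ConcurrencyReducer;
import java.util.concurrent.CompletableFuture;
import org.apache.pulsar.client.admin.PulsarAdmin;

public class ConcurrencyReducerExample {
    // At most 16 admin calls run concurrently; up to 1000 more can wait in the queue.
    // The values are illustrative only.
    private final ConcurrencyReducer<Void> reducer = ConcurrencyReducer.create(16, 1000);

    public CompletableFuture<Void> unloadBundleBounded(PulsarAdmin admin, String namespace, String bundle) {
        // The call is only started once a concurrency slot is available.
        return reducer.add(() -> admin.namespaces().unloadNamespaceBundleAsync(namespace, bundle));
    }
}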

@lhotari (Member, Author) commented Apr 23, 2024

Another challenge is cancelling work that is queued in the system but no longer awaited by any client.
Newer Jersey clients have support for this; I noticed commit eclipse-ee4j/jersey@9602806 in Jersey.
When the system is overloaded, request processing might be so slow that clients time out and retry their requests.
This adds more work to the system unless there's a solution that cancels the timed-out tasks, which is why addressing this is also an important part of the solution.

Labels: doc (Your PR contains doc changes, no matter whether the changes are in markdown or code files), ready-to-test
Projects: None yet
Development: Successfully merging this pull request may close these issues:
  • Set limits for number of opened HTTP connections for Pulsar Admin client
3 participants