fix #3569: removing all api references to customResource #4148

shawkins · 2022-05-15T15:33:34Z

Description

Addresses #3569 by removing all api references to CustomResource

It also further minimizing SharedInformerFactoryImpl concerns and removes the use of the executor - before I add the migration guide for this, we should have consensus on the approach:

Options include:

Leave it up to the user to user to handle their own threading concerns wrt Watchers and ResourceEventHandlers - the main pain point with this change is for users of the SharedInformerFactory who were relying on the async nature of the associated executor, but that is not consistent with the other apis (creating a watch or Informable informer). There has also been at least one issue where it seems that there is confusion over the role of the factory executor.
Provide wrappers or convenience methods to create async versions of the Watchers and ResourceEventHandlers - e.g. Watcher.async(Watcher watcher, Executor executor). This means we don't have to introduce additional dsl methods and it's very flexible as to what executor is used. We do have the common executor via Utils.getCommonExecutor - but we really shouldn't be promoting static usage. Another goal of the static refactoring to scope the common executor per client, if that change were worked, you could have have client methods available as well - client.configMaps().watch(client.async(watcher)).
Make it annotation driven - if the watcher or handler is annotated with Async (spring has a similar annotation) then use the common pool (which as mentioned above will eventually be configurable per client), otherwise call it directly. This tries to encapsulate things more than 2.

Any other thoughts?

Type of change

Bug fix (non-breaking change which fixes an issue)
Feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change
Chore (non-breaking change which doesn't affect codebase;
test, version modification, documentation, etc.)

Checklist

Code contributed by me aligns with current project license: Apache 2.0
I Added CHANGELOG entry regarding this change
I have implemented unit tests to cover my changes
I have added/updated the javadocs and other documentation accordingly
No new bugs, code smells, etc. in SonarCloud report
I tested my code in Kubernetes
I tested my code in OpenShift

doc/FAQ.md

metacosm

LGTM apart from minor wording issues.

metacosm · 2022-05-16T08:05:40Z

Regarding dealing with async, I haven't thought too deep about it but right now, I'd favor approach 2 which seems to provide the most flexibility.

shawkins · 2022-05-17T17:30:00Z

I apologize that there are really two things being merged together here - trying to clarify the Executor usage turned out to be more involved than I would have hoped.

@metacosm @manusa the next set of changes remove the usage of the static Utils.getCommonExecutor and allow for an executor or executorservice to be scoped to the client via the KubernetesClientBuilder. This can be a different executor than the one used for IO operations because this is for potentially blocking calls (although the default logic for creating the IO thread pools for OkHttp or the JDK client default to a cached / unbounded thread pool, which could be used for both).

This client scoped executor will be used rather than allowing for an ExecutorService per SharedInformerFactory - which we didn't document well nor provide any examples of. It is also no longer needed to start the informers, since we have a non-blocking way of doing that - it's real purpose is to decouple the ResourceEventHandler method calls from the IO threads. For consistency we'll use the same executor under informers created by the Informable dsl. When using the OkHttp client other than a few corner cases it was ok to use the IO thread to process events (that's why the DSL was using Runnable::run) because the IO threads are managed per websocket. However with the JDK client that can be a problem - there is just a small dedicated IO thread pool, which you can quickly exhaust and stall / deadlock if you block. The informers used under the covers for Waitable or informOnCondition I left as just using the IO thread as we already had it documented and/or it's unlikely that the user would use blocking logic there - we can of course change that to use the task executor if needed.

In general I wanted to consolidate / hide / document the threading concerns as much as possible - anything that we do api-wise like Watcher.async(Watcher watcher, Executor executor) is just a workaround. The real solution is for the call back methods to return a CompletionStage or CompletableFuture - CompletionStage eventReceived(Action action, T resource) - or pass in a parameter that allows for triggering when to receive more events.

I'll refine the above into the migration doc if this all seems reasonable.

manusa · 2022-05-18T09:32:17Z

I think it might be better if we have a sync review on this one, especially to think about the executor removal options.

further minimizing SharedInformerFactoryImpl concerns and removing the use of the executor

doc/FAQ.md

manusa · 2022-05-19T10:12:40Z

kubernetes-client/src/main/java/io/fabric8/kubernetes/client/BaseClient.java

+
+    @Override
+    public Executor get() {
+      return Executors.newCachedThreadPool(Utils.daemonThreadFactory(this));


I'm concerned about this one.
When is the DEFAULT_EXECUTOR_SUPPLIER.get() method going to be invoked during the lifecycle of the application?
Should we just keep a single instance? Is there a use-case where we want to maintain more than one Executor?

It is called once per client. With the way things are currently every call to build() creates a fresh client. If the builder is only allowed to create a single client, then it could just be a single instance.

Shall we switch this to create lazily loaded singleton instance?

KubernetesClientBuilder is new, so whatever we do there won't affect existing users. However that is functionally different than many of the other Builders, including the HttpClient.Builder. Because you can still call the with methods after build it would probably be clearest if subsequent calls to build through an illegalstateexception.

it would probably be clearest if subsequent calls to build through an illegalstateexception.

Yup, I think that's even better.

Let's take that up as a follow-on task.

Co-authored-by: Marc Nuri <marc@marcnuri.com>

shawkins · 2022-05-19T12:25:06Z

@manusa assuming #4159 is going in first, I'll wait to rebase this until after that.

manusa · 2022-05-19T12:40:46Z

@manusa assuming #4159 is going in first, I'll wait to rebase this until after that.

OK, I'll merge #4159 because it's ready to merge and CI takes forever (If that wasn't the case, this one should go in first)

customresource # Conflicts: # kubernetes-client/src/main/java/io/fabric8/kubernetes/client/dsl/internal/OperationSupport.java # kubernetes-client/src/main/java/io/fabric8/kubernetes/client/dsl/internal/PortForwarderWebsocket.java # kubernetes-client/src/main/java/io/fabric8/kubernetes/client/dsl/internal/apps/v1/RollingUpdater.java

into customresource

customresource # Conflicts: # kubernetes-client/src/main/java/io/fabric8/kubernetes/client/dsl/internal/AbstractWatchManager.java # kubernetes-client/src/main/java/io/fabric8/kubernetes/client/dsl/internal/WatcherWebSocketListener.java # kubernetes-client/src/test/java/io/fabric8/kubernetes/client/dsl/internal/AbstractWatchManagerTest.java

manusa · 2022-06-01T04:24:35Z

The ExecWebSocketListenerTest never ends in the Java8 tests. Please take a look.

…customresource

shawkins · 2022-06-01T13:03:03Z

@manusa this seems to be happening on the BaseOperationTest. The only thing that really changed there is the usage of the ForkJoinPool.commonPool rather than the Utils common pool. In several places I have the tests using the ForkJoinPool.commonPool rather than inline execution to preserve the multi-threaded nature. Using the common pool helps the overall test execution by not creating / destroying so many thread pools. It could be possible that something is holding the common threadpool thread and the default on github for parallelism is small enough that we have a problem. We wouldn't have seen this before unless as the Utils common thread pool was unlimited in size.

manusa · 2022-06-01T13:06:07Z

I think this pool might be backed by the number of available cores, and I guess that in a CI env this must be probably just one. We should probably replace this by a pool with a fixed number of threads.

shawkins · 2022-06-01T13:19:57Z

I think this pool might be backed by the number of available cores, and I guess that in a CI env this must be probably just one.

It will default to Max(1, cores - 1).

We should probably replace this by a pool with a fixed number of threads.

There are several things. The first is to figure out what task is hanging around - it's not expected that something should indefinitely block.

The next step is that we can:

explicitly set java.util.concurrent.ForkJoinPool.common.parallelism (possibly to 1 in the maven properties) - I can reproduce the hang if I run locally with java.util.concurrent.ForkJoinPool.common.parallelism=1
or stop using the ForkJoinPool common pool because it's not really intended for arbitrary tasks. Fully managing a threadpool per test is possible and would help localize lingering tasks.

manusa · 2022-06-01T13:32:21Z

OK, I get your point now.

My assumption is that the commonPool might be used elsewhere (OkHttp) and we might be running into some lock. ---> If we replace the ForkJoinPool with a singlethreaded pool, does it still hang?

shawkins · 2022-06-01T20:28:09Z

My assumption is that the commonPool might be used elsewhere (OkHttp) and we might be running into some lock.

Turns out this is only occasionally reproducible on java 8 when the parallelism is 1. No task is hanging. The common pool simply won't execute the task.

If we replace the ForkJoinPool with a singlethreaded pool, does it still hang?

It appears fine. I've added another commit that removes the usage of the fork join common pool.

sonarcloud · 2022-06-01T20:56:09Z

SonarCloud Quality Gate failed.

0 Bugs
0 Vulnerabilities
0 Security Hotspots
3 Code Smells

53.5% Coverage
0.0% Duplication

shawkins requested review from manusa, rohanKanojia and metacosm May 15, 2022 15:33

shawkins force-pushed the customresource branch 2 times, most recently from 40cc132 to a7ab523 Compare May 15, 2022 18:55

metacosm reviewed May 16, 2022

View reviewed changes

doc/FAQ.md Outdated Show resolved Hide resolved

metacosm approved these changes May 16, 2022

View reviewed changes

shawkins force-pushed the customresource branch from a7ab523 to 356efce Compare May 17, 2022 17:06

shawkins force-pushed the customresource branch 3 times, most recently from 202253a to dabc2b9 Compare May 17, 2022 19:54

shawkins added 2 commits May 18, 2022 07:40

fix fabric8io#3569: removing all api references to customResource

0a59d5b

further minimizing SharedInformerFactoryImpl concerns and removing the use of the executor

rebasing and accounting for the portforwarder usage of the common pool

c43628a

shawkins force-pushed the customresource branch from dabc2b9 to c43628a Compare May 18, 2022 12:08

manusa mentioned this pull request May 18, 2022

refactor: extracted PortForwarderWebsocketListener and added tests #4159

Merged

11 tasks

manusa reviewed May 19, 2022

View reviewed changes

doc/FAQ.md Outdated Show resolved Hide resolved

manusa reviewed May 19, 2022

View reviewed changes

doc/FAQ.md Show resolved Hide resolved

manusa reviewed May 19, 2022

View reviewed changes

Update doc/FAQ.md

efeb932

Co-authored-by: Marc Nuri <marc@marcnuri.com>

manusa approved these changes May 19, 2022

View reviewed changes

shawkins added 2 commits May 19, 2022 15:09

Merge branch 'customresource' of github.com:shawkins/kubernetes-client

d20b263

into customresource

shawkins force-pushed the customresource branch from 151d301 to d20b263 Compare May 19, 2022 19:39

Merge branch 'master' of github.com:fabric8io/kubernetes-client into …

700d03a

…customresource

switching tests to use a shared single thread executor

2657998

manusa merged commit c4a3291 into fabric8io:master Jun 2, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix #3569: removing all api references to customResource #4148

fix #3569: removing all api references to customResource #4148

shawkins commented May 15, 2022 •

edited

metacosm left a comment

metacosm commented May 16, 2022

shawkins commented May 17, 2022

manusa commented May 18, 2022

manusa May 19, 2022

shawkins May 19, 2022

manusa May 19, 2022

shawkins May 19, 2022

manusa May 19, 2022

shawkins May 19, 2022

shawkins commented May 19, 2022

manusa commented May 19, 2022

manusa commented Jun 1, 2022

shawkins commented Jun 1, 2022

manusa commented Jun 1, 2022

shawkins commented Jun 1, 2022

manusa commented Jun 1, 2022

shawkins commented Jun 1, 2022

sonarcloud bot commented Jun 1, 2022

fix #3569: removing all api references to customResource #4148

fix #3569: removing all api references to customResource #4148

Conversation

shawkins commented May 15, 2022 • edited

Description

Type of change

Checklist

metacosm left a comment

Choose a reason for hiding this comment

metacosm commented May 16, 2022

shawkins commented May 17, 2022

manusa commented May 18, 2022

manusa May 19, 2022

Choose a reason for hiding this comment

shawkins May 19, 2022

Choose a reason for hiding this comment

manusa May 19, 2022

Choose a reason for hiding this comment

shawkins May 19, 2022

Choose a reason for hiding this comment

manusa May 19, 2022

Choose a reason for hiding this comment

shawkins May 19, 2022

Choose a reason for hiding this comment

shawkins commented May 19, 2022

manusa commented May 19, 2022

manusa commented Jun 1, 2022

shawkins commented Jun 1, 2022

manusa commented Jun 1, 2022

shawkins commented Jun 1, 2022

manusa commented Jun 1, 2022

shawkins commented Jun 1, 2022

sonarcloud bot commented Jun 1, 2022

shawkins commented May 15, 2022 •

edited