Operator cannot start because indexer blocks thread for more than 2secs #2353

Open
Javatar81 opened this issue Apr 18, 2024 · 2 comments
@Javatar81

Bug Report

What did you do?

I've created a custom resource that is watched by my reconciler and started the operator.

What did you expect to see?

The operator should run without problems.

What did you see instead? Under which circumstances?

When there is at least one custom resource the operator is watching, the operator fails with the following exception:

WARN  [io.ver.cor.imp.BlockedThreadChecker] (vertx-blocked-thread-checker) Thread Thread[vert.x-eventloop-thread-11,5,main] has been blocked for 3633 ms, time limit is 2000 ms: io.vertx.core.VertxException: Thread blocked
	at java.base/jdk.internal.misc.Unsafe.park(Native Method)
	at java.base/java.util.concurrent.locks.LockSupport.park(LockSupport.java:221)
	at java.base/java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1864)
	at java.base/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3780)
	at java.base/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3725)
	at java.base/java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1898)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2072)
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.waitForResult(OperationSupport.java:491)
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleResponse(OperationSupport.java:524)
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleGet(OperationSupport.java:467)
	at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handleGet(BaseOperation.java:792)
	at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.requireFromServer(BaseOperation.java:193)
	at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.get(BaseOperation.java:149)
	at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.get(BaseOperation.java:98)
	at io.devjoy.gitea.organization.k8s.model.GiteaOrganization.associatedGitea(GiteaOrganization.java:38)
	at io.devjoy.gitea.organization.k8s.OrganizationReconciler.lambda$prepareEventSources$8(OrganizationReconciler.java:120)
	at io.fabric8.kubernetes.client.informers.impl.cache.CacheImpl.updateIndex(CacheImpl.java:278)
	at io.fabric8.kubernetes.client.informers.impl.cache.CacheImpl.updateIndices(CacheImpl.java:273)
	at io.fabric8.kubernetes.client.informers.impl.cache.CacheImpl.put(CacheImpl.java:103)
	at io.fabric8.kubernetes.client.informers.impl.cache.ProcessorStore.updateInternal(ProcessorStore.java:54)
	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
	at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1625)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
	at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
	at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
	at io.fabric8.kubernetes.client.informers.impl.cache.ProcessorStore.update(ProcessorStore.java:50)
	at io.fabric8.kubernetes.client.informers.impl.cache.Reflector.lambda$processList$7(Reflector.java:188)
	at java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1150)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2179)
	at io.fabric8.kubernetes.client.http.StandardHttpClient.lambda$completeOrCancel$10(StandardHttpClient.java:142)
	at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
	at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2179)
	at io.fabric8.kubernetes.client.http.ByteArrayBodyHandler.onBodyDone(ByteArrayBodyHandler.java:51)
	at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
	at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2179)
	at io.fabric8.kubernetes.client.vertx.VertxHttpRequest.lambda$null$1(VertxHttpRequest.java:121)
	at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:276)
	at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:258)
	at io.vertx.core.http.impl.HttpEventHandler.handleEnd(HttpEventHandler.java:76)
	at io.vertx.core.http.impl.HttpClientResponseImpl.handleEnd(HttpClientResponseImpl.java:250)
	at io.vertx.core.http.impl.Http1xClientConnection$StreamImpl.lambda$new$0(Http1xClientConnection.java:418)
	at io.vertx.core.streams.impl.InboundBuffer.handleEvent(InboundBuffer.java:255)
	at io.vertx.core.streams.impl.InboundBuffer.write(InboundBuffer.java:134)
	at io.vertx.core.http.impl.Http1xClientConnection$StreamImpl.handleEnd(Http1xClientConnection.java:705)
	at io.vertx.core.impl.ContextImpl.lambda$execute$4(ContextImpl.java:322)
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:1623)

The problem does not occur if no such CR exists when the operator starts. Once started, the operator runs smoothly even when custom resources are created afterwards.

Environment

Kubernetes cluster type:

OpenShift

java-operator-sdk version (from pom.xml):

6.6.7

$ java -version

openjdk version "20.0.1" 2023-04-18
OpenJDK Runtime Environment Temurin-20.0.1+9 (build 20.0.1+9)
OpenJDK 64-Bit Server VM Temurin-20.0.1+9 (build 20.0.1+9, mixed mode)

$ kubectl version

Client Version: v1.28.1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.27.11+749fe1d

Possible Solution

No

Additional context

The reconciler overrides prepareEventSources and adds an indexer. The indexer's mapping function fetches a resource through the fabric8 Kubernetes client. Here is a link to the source code: https://github.com/Javatar81/devjoy/blob/main/operators/gitea/gitea-operator/src/main/java/io/devjoy/gitea/organization/k8s/OrganizationReconciler.java#L116
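
For illustration only, here is a minimal JDK-only sketch of the pattern visible in the stack trace: a callback that runs on the thread delivering responses and then blocks on CompletableFuture.get for another response. It does not use fabric8 or Vert.x. The single-threaded executor is a stand-in for the event loop, and the assumption that the awaited response would be delivered on that same thread is mine, not something the trace proves. The class and method names are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class BlockedEventLoopSketch {

    /**
     * Runs a callback on a single-threaded "event loop". Inside the callback it
     * waits for a result that can only be produced by another task queued on
     * the same thread. Returns true if that wait timed out.
     */
    static boolean blockingLookupTimesOut(long timeoutMillis) {
        ExecutorService eventLoop = Executors.newSingleThreadExecutor();
        CompletableFuture<Boolean> outcome = new CompletableFuture<>();
        try {
            eventLoop.execute(() -> {
                // Stand-in for the index function: the "response" is delivered
                // by a task queued behind the currently running one.
                CompletableFuture<String> response = new CompletableFuture<>();
                eventLoop.execute(() -> response.complete("gitea"));
                try {
                    // Blocking wait on the event-loop thread; the queued task
                    // cannot run until this callback returns.
                    response.get(timeoutMillis, TimeUnit.MILLISECONDS);
                    outcome.complete(false);
                } catch (TimeoutException e) {
                    outcome.complete(true);
                } catch (Exception e) {
                    outcome.completeExceptionally(e);
                }
            });
            // Safety net so the caller never hangs if the sketch misbehaves.
            return outcome.orTimeout(5, TimeUnit.SECONDS).join();
        } finally {
            eventLoop.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out: " + blockingLookupTimesOut(200));
    }
}
```

In this sketch the wait always times out, because the only thread able to complete the future is the one waiting on it. The real client may behave differently (for example, with several event-loop threads the call may merely be slow, as the 3633 ms in the warning suggests), so treat this as a model of the shape of the problem rather than a reproduction.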

@csviri
Collaborator

csviri commented Apr 18, 2024

Thanks @Javatar81, we will take a look. This might be an issue in the fabric8 client.
cc @shawkins @manusa

@shawkins
Collaborator

> this might be an issue in fabric8 client

This looks like issues we have seen in Quarkus, where the event loop thread does not expect a blocking call to the client. The solution there was to add a @Blocking annotation.

The other thought here is that this is a pretty heavyweight indexing operation. Does it need to be done synchronously?
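
One way to read the question above is that the index function could avoid remote calls entirely. The sketch below is a hedged, JDK-only illustration of that idea: the key is computed from fields already present on the resource, so nothing blocks while the cache updates its indices. The Organization record, its giteaName field, and the namespace/name key format are all hypothetical stand-ins. Whether the real GiteaOrganization carries enough information to build such a key locally is an assumption. The Function<T, List<String>> shape mirrors how fabric8 informers accept index functions, but no fabric8 classes are used here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class LocalIndexSketch {

    /** Hypothetical stand-in for a custom resource that names the instance it belongs to. */
    record Organization(String namespace, String name, String giteaName) {}

    /** Pure index function: builds the key from fields on the resource, with no I/O. */
    static final Function<Organization, List<String>> BY_GITEA =
            org -> List.of(org.namespace() + "/" + org.giteaName());

    /** Minimal in-memory index, standing in for what an informer cache maintains. */
    static Map<String, List<Organization>> buildIndex(List<Organization> resources) {
        Map<String, List<Organization>> index = new HashMap<>();
        for (Organization org : resources) {
            for (String key : BY_GITEA.apply(org)) {
                index.computeIfAbsent(key, k -> new ArrayList<>()).add(org);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        List<Organization> orgs = List.of(
                new Organization("dev", "team-a", "gitea-main"),
                new Organization("dev", "team-b", "gitea-main"),
                new Organization("prod", "team-c", "gitea-main"));
        Map<String, List<Organization>> index = buildIndex(orgs);
        System.out.println(index.get("dev/gitea-main").size());  // prints 2
        System.out.println(index.get("prod/gitea-main").size()); // prints 1
    }
}
```

If the key genuinely requires data from another resource, the lookup could instead be deferred to the reconciler, which is not expected to run on an event-loop thread. That is a design suggestion under the same assumptions, not a confirmed fix for this issue.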
