Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docker:start always blocks on windows #1103

Closed
SimY4 opened this issue Oct 3, 2018 · 25 comments · Fixed by #1480 or #1481
Closed

docker:start always blocks on windows #1103

SimY4 opened this issue Oct 3, 2018 · 25 comments · Fixed by #1480 or #1481

Comments

@SimY4
Copy link

SimY4 commented Oct 3, 2018

Description

Whenever I use docker:start on windows it's always blocking build execution although docker container was successfully started and wait condition satisfied.

I'm using the latest plugin version. I tried explicitly specify docker.follow=false - didn't help. The same config works fine with docker for OSX.

Info

  • d-m-p version : 0.27.1
  • Maven version (mvn -v) : 3.5.4
  • Docker version : 18.06.1-ce-win73
  • If it's a bug, how to reproduce :
    My docker execution config looks like this:
            <plugin>
                <groupId>io.fabric8</groupId>
                <artifactId>docker-maven-plugin</artifactId>
                <version>0.27.1</version>
                <configuration>
                    <images>
                        <image>
                            <name>localstack/localstack:0.8.7</name>
                            <run>
                                <env>
                                    <SERVICES>dynamodb</SERVICES>
                                </env>
                                <ports>
                                    <port>4569:4569</port>
                                </ports>
                                <log/>
                                <wait>
                                    <log>^Ready\.$</log>
                                </wait>
                            </run>
                        </image>
                    </images>
                    <skip>${skipTests}</skip>
                </configuration>
                <executions>
                    <execution>
                        <goals>
                            <goal>start</goal>
                            <goal>stop</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>

then execution docker:start -X will show that waitcondition was met, execution returned 0 and then process hangs.

@rhuss
Copy link
Collaborator

rhuss commented Oct 3, 2018

I suspect that there is an issue with the wait condition. What happens if you remove wait ? Does it still block ? And what is, if you loosen the condition to e.g. not use a regexp ? Does this change anything ?

Debugging on Windows is difficult for me as I don't have a installation at hand. But it could be that the log parsing works differently on Windows.

It would also help if you could run it with mvn -X to get some more debugging.

@ebouvi01
Copy link

Hi, running the 0.24 version fixed my problem.

Looks like #904 was lost in 0.25 and upwards.

@rhuss
Copy link
Collaborator

rhuss commented Apr 16, 2019

#904 was partly changed in #965 because of a regression this fix introduced. So I think both fixes need to be combined but I have no idea yet, how (to far away from that code at the moment). Maybe @FSeidinger and @rkhmelichek have an idea what the issue is ?

@jeacott1
Copy link

Hi, running the 0.24 version fixed my problem.

Looks like #904 was lost in 0.25 and upwards.

0.24 didn't fix hanging from run for me unfortunately.
I have a setup like

        <wait>
               <time>70000</time>
               <exit>0</exit>
        </wait>

the process executes and I get a "Waited on exit code 0 58700 ms" at the end, but then it just hangs forever.
I can't not set a because my process takes longer than 10 seconds.

@SimY4
Copy link
Author

SimY4 commented Jan 4, 2021

Got some time to experiment with this:

I created a small project with the setup described in the ticket here (https://github.com/SimY4/ImmutablesDynamoDBMapperPOC) and collected some thread dumps.

The interesting part is this one:

"main@1" prio=5 tid=0x1 nid=NA waiting
  java.lang.Thread.State: WAITING
	  at java.lang.Object.wait(Object.java:-1)
	  at sun.nio.ch.NativeThreadSet.signalAndWait(NativeThreadSet.java:101)
	  at sun.nio.ch.FileChannelImpl.implCloseChannel(FileChannelImpl.java:184)
	  at java.nio.channels.spi.AbstractInterruptibleChannel.close(AbstractInterruptibleChannel.java:112)
	  - locked <0x10c3> (a java.lang.Object)
	  at io.fabric8.maven.docker.access.hc.win.NamedPipe.close(NamedPipe.java:261)
	  at org.apache.http.impl.BHttpConnectionBase.shutdown(BHttpConnectionBase.java:309)
	  at org.apache.http.impl.conn.DefaultManagedHttpClientConnection.shutdown(DefaultManagedHttpClientConnection.java:95)
	  at org.apache.http.impl.conn.LoggingManagedHttpClientConnection.shutdown(LoggingManagedHttpClientConnection.java:98)
	  at org.apache.http.impl.execchain.ConnectionHolder.abortConnection(ConnectionHolder.java:128)
	  - locked <0x10c5> (a org.apache.http.impl.conn.LoggingManagedHttpClientConnection)
	  at org.apache.http.impl.execchain.ConnectionHolder.cancel(ConnectionHolder.java:146)
	  at org.apache.http.client.methods.AbstractExecutionAwareRequest.abort(AbstractExecutionAwareRequest.java:90)
	  at io.fabric8.maven.docker.access.log.LogRequestor.finish(LogRequestor.java:212)
	  at io.fabric8.maven.docker.wait.LogWaitChecker.cleanUp(LogWaitChecker.java:45)
	  at io.fabric8.maven.docker.wait.WaitUtil.cleanup(WaitUtil.java:80)
	  at io.fabric8.maven.docker.wait.WaitUtil.wait(WaitUtil.java:64)
	  at io.fabric8.maven.docker.service.WaitService.wait(WaitService.java:65)
	  at io.fabric8.maven.docker.service.helper.StartContainerExecutor.waitAndPostExec(StartContainerExecutor.java:103)
	  at io.fabric8.maven.docker.service.helper.StartContainerExecutor.startContainer(StartContainerExecutor.java:52)
	  at io.fabric8.maven.docker.StartMojo.lambda$startImage$0(StartMojo.java:299)
	  at io.fabric8.maven.docker.StartMojo$$Lambda$52.1629173206.call(Unknown Source:-1)
	  at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
	  at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
	  at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
	  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	  at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:264)
	  at java.util.concurrent.FutureTask.run(FutureTask.java:-1)
	  at com.google.common.util.concurrent.MoreExecutors$DirectExecutorService.execute(MoreExecutors.java:321)
	  at java.util.concurrent.ExecutorCompletionService.submit(ExecutorCompletionService.java:184)
	  at io.fabric8.maven.docker.StartMojo.startImage(StartMojo.java:297)
	  at io.fabric8.maven.docker.StartMojo.executeInternal(StartMojo.java:166)
	  - locked <0x10d4> (a io.fabric8.maven.docker.StartMojo)
	  at io.fabric8.maven.docker.AbstractDockerMojo.execute(AbstractDockerMojo.java:248)
	  at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:137)
	  at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:210)
	  at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:156)
	  at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:148)
	  at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:117)
	  at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:81)
	  at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:56)
	  at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
	  at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:305)
	  at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:192)
	  at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:105)
	  at org.apache.maven.cli.MavenCli.execute(MavenCli.java:957)
	  at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:289)
	  at org.apache.maven.cli.MavenCli.main(MavenCli.java:193)
	  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(NativeMethodAccessorImpl.java:-1)
	  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	  at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	  at java.lang.reflect.Method.invoke(Method.java:566)
	  at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:282)
	  at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:225)
	  at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:406)
	  at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:347)
	  at org.codehaus.classworlds.Launcher.main(Launcher.java:47)

This thread is spinning infinitely attempting to close the file channel.

I found something looking very similar in JDK bug tracker reported a while back but still open till this day: https://bugs.openjdk.java.net/browse/JDK-8132693

@ncelerier
Copy link
Contributor

We also have this problem. 0.24 is working fine, 025.x not anymore and the problem is still present in the current latest version 0.36.

@rohanKanojia
Copy link
Member

@ncelerier : Would it be possible for you to debug this issue on Windows?

@ncelerier
Copy link
Contributor

Maybe, how can I debug that ?

@rohanKanojia
Copy link
Member

You can debug the build process using mvnDebug docker:start, you can connect from your IDE using remote debug option. Then you'll need to set a breakpoint in StartMojo and investigate why it's blocking

ncelerier added a commit to ncelerier/docker-maven-plugin that referenced this issue Jun 21, 2021
ncelerier added a commit to ncelerier/docker-maven-plugin that referenced this issue Jun 21, 2021
@ncelerier
Copy link
Contributor

ncelerier commented Jun 21, 2021

Ok, I have done that but since I didn't really know what to debug exactly, I have been looking for the changes that introduced the bug, which are from #965, and reverted them. As you can see in the merge request #1480, this is somehow related to how the InputStream is closed.

@rohanKanojia
Copy link
Member

@ncelerier : Does docker:start goal work as expected on windows with this change?

@ncelerier
Copy link
Contributor

Yes it looks like it is working properly, I can make calls to my application and it behaves properly.

But of course it would be better if you or someone else could confirm that it is still working properly. I'm not sure it does not introduce a regression.

@rohanKanojia
Copy link
Member

I'm on Linux and for me start, stop and log goals seem to be working okay with your changes. I'd say let's get your change in and see if causes any problems for other users.

@rohanKanojia
Copy link
Member

@ncelerier : Would you be needing a patch release with the fix? I was thinking of asking people in the issue to try out the patch release and get feedback before closing this issue.

@ncelerier
Copy link
Contributor

@rohanKanojia: yes if it is not a problem for you, a patch release would be good for me.

@rohanKanojia
Copy link
Member

@ncelerier : cool, I'll try to cut 0.36.1 tomorrow afternoon. Thanks a lot for your contribution 👍

@rohanKanojia
Copy link
Member

I've released 0.36.1 with @ncelerier 's fix.

@SimY4 @jeacott1 @ebouvi01: Could you please help us out by trying out 0.36.1 on windows and providing feedback on whether this issue is resolved or not whenever you have some time? If it's working we can go ahead and close this.

@SimY4
Copy link
Author

SimY4 commented Jun 28, 2021

@rohanKanojia looks like it's still getting stuck after successfully matching log condition for me. I'm using the latest Docker for Windows 3.5.1, my pet project where issue reproducible is https://github.com/SimY4/ImmutablesDynamoDBMapperPOC

@rohanKanojia
Copy link
Member

@ncelerier: Hello, Would it be possible for you to try out @SimY4 's sample on Windows whenever you get some time?

@ncelerier
Copy link
Contributor

@rohanKanojia: Hello, unfortunately, @SimY4 sample hangs for me too :(

@rohanKanojia
Copy link
Member

Unfortunately for me on Linux docker:start finishes normally 😕

@ncelerier
Copy link
Contributor

ncelerier commented Jun 29, 2021

I have run Sim4Y sample in debug and I have found that is.available() returns 153, which means that the is.close() method is called (which triggers the hanging). With my application, it was returning 0 which means the is.close() method was not called.

Because of this, I wonder if we need to care about the is.available() value to determine whether or not we should close the InputStream ? Probably not.

Also, here is what I see when I add some logs

[INFO] DOCKER> Pattern '^Ready\.$' matched for container f1f8cb78ae5f
f1f8cb> Ready.
*** Calling is.close...
*** Calling request.abort...

Then it hangs, so it seems there is a deadlock between these two.

I tought maybe I can wait for the finish method to complete (which is calling request.abort()) before calling is.close(), which can be done like this:

final InputStream is = response.getEntity().getContent();

try {
    while (true) {
        if (!readStreamFrame(is)) {
            return;
        }
    }
} finally {
    // When finish() returns, request == null
    while (request != null) {
        WaitUtil.sleep(500);
    }
    
    if ((is != null)) {
        System.out.println("*** Calling is.close...");
        is.close();
        System.out.println("*** Calling is.close: done.");
    }
}
...
public void finish() {
    if (request != null) {
        System.out.println("*** Calling request.abort...");
        request.abort();
        System.out.println("*** Calling request.abort: done.");
        request = null;
    }
}

With this change, it is not hanging anymore. But here is what I see in the logs:

*** Calling request.abort...
*** Calling request.abort: done.
*** Calling is.close...
[ERROR] DOCKER> IO Error while requesting logs: java.nio.channels.ClosedChannelException Thread-4

Does that mean there is no need to call is.close() because it is already? The request.abort() seems to trigger it somehow.

@rohanKanojia, what do you think ? Can you try with these changes on Linux ?

@rohanKanojia
Copy link
Member

@ncelerier : Do you have your branch somewhere which I can pull and test?

ncelerier added a commit to ncelerier/docker-maven-plugin that referenced this issue Jun 29, 2021
@ncelerier
Copy link
Contributor

Yes, I have just created the merge request.

@rohanKanojia
Copy link
Member

@ncelerier : Thanks a lot, I'll try to pull it and check after work later today.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants