Ryuk container name already in use #2395

Closed
toadzky opened this issue Mar 3, 2020 · 26 comments

@toadzky
Contributor

toadzky commented Mar 3, 2020

The error message from docker is:

Conflict. The container name "/testcontainers-ryuk-181a7bc5-71b4-41c5-89ee-b5f0b0f378ed" is already in use by container "7b89e6644ce929b67b9934dd1caf2b414f98e7143490b20d2cdcc36645c60830". You have to remove (or rename) that container to be able to reuse that name.

Based on looking through the code, you are creating a static SESSION_ID in the DockerClientFactory but you are creating a Ryuk container per docker client. There isn't a check for an existing container with that id.

I'm not sure what the correct solution is. The 2 options I see are:

  • have a per-client session id
  • handle the container already existing by checking and returning the id
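A minimal sketch of the second option, assuming an in-memory stand-in for the Docker daemon rather than the real docker-java client. `FakeDaemon`, `RyukNameDemo`, and `getOrCreate` are hypothetical names used only for illustration; the container name format mirrors the one in the error message above.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** In-memory stand-in for a Docker daemon that enforces unique container names. */
class FakeDaemon {
    private final Map<String, String> idsByName = new ConcurrentHashMap<>();

    /** Mimics container creation: fails if the name is taken, like the conflict above. */
    String create(String name) {
        String newId = UUID.randomUUID().toString();
        String existing = idsByName.putIfAbsent(name, newId);
        if (existing != null) {
            throw new IllegalStateException("Conflict. The container name \"/" + name
                    + "\" is already in use by container \"" + existing + "\"");
        }
        return newId;
    }

    /** Option 2: return the existing container's id instead of failing. */
    String getOrCreate(String name) {
        return idsByName.computeIfAbsent(name, n -> UUID.randomUUID().toString());
    }
}

class RyukNameDemo {
    // One session id per JVM, standing in for the static SESSION_ID described above.
    static final String SESSION_ID = UUID.randomUUID().toString();

    static String ryukName() {
        return "testcontainers-ryuk-" + SESSION_ID;
    }

    public static void main(String[] args) {
        FakeDaemon daemon = new FakeDaemon();
        String first = daemon.create(ryukName());
        try {
            daemon.create(ryukName()); // a second client reusing the same static session id
        } catch (IllegalStateException e) {
            System.out.println("second create failed: " + e.getMessage());
        }
        // Reuses the existing container instead of throwing.
        System.out.println("reused: " + first.equals(daemon.getOrCreate(ryukName())));
    }
}
```

The first option (a per-client session id) would instead make `ryukName()` depend on the client, so two clients never ask for the same name.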

We are using version 1.12.5

This happens during CI builds using docker-in-docker on BuildKite, on EC2 hosts running Amazon Linux. It does not happen locally on macOS or on Linux laptops.

@vikrant82

vikrant82 commented Mar 5, 2020

Also affected by this. Same setup.

@ghost

ghost commented Mar 22, 2020

Affected by this using version 1.13.0

@Leonti

Leonti commented Mar 24, 2020

My tests worked fine on 1.12.4, but started to fail on 1.13.0 with this error.
Also running docker-in-docker.

@stephanebastian

For information, we've got the same issue with an ElasticSearch container. Worked great with version 1.10.6 but ran into this bug when we upgraded to 1.13.0

@Leonti

Leonti commented Apr 14, 2020

The issue seems to be fixed in 1.14.0

@r-sniper

I am facing the same issue on version 1.14.0.
Tests work fine on my local machine; they only fail on Jenkins with:
Caused by: com.github.dockerjava.api.exception.ConflictException: {"message":"Conflict. The container name "/testcontainers-ryuk-ea491299-e6f7-432e-b5f2-cb4cda1c38f5" is already in use by container "99297d75a66c387adc1f29e2e40f0224bbf75587e6fcb72fff4c2e0b94027c94". You have to remove (or rename) that container to be able to reuse that name."}

@bsideup
Member

bsideup commented Apr 17, 2020

For everyone who is getting the issue:
Please report your test setup. What containers are you using, how do you start them, etc etc

Also, make sure you're pasting the full exception, not just the message.

@r-sniper

@bsideup
In my tests, I use my 2 custom containers which are built using fabric8 docker-maven-plugin at maven package phase. Tests run at maven verify

Test Configuration

  • Single test class
  • Using junit5 for running tests: v5.6.2
  • testcontainers version : 1.14.0
  • I have 2 custom images, let's call them imageA and imageB. The lifecycle of these containers is managed by testcontainers using the @Container annotation
  • The whole cascading exception is about 100 lines or so, and is available at testcontainersException

Please let me know if anything more is required

@bsideup
Member

bsideup commented Apr 17, 2020

Can others confirm that they use JUnit 5 and/or Startables.deepStart?

@toadzky
Contributor Author

toadzky commented Apr 17, 2020

i'm using junit5 as well. not the extension that ships with the library, we wrote our own, but it is junit5.

@r-sniper

Update:
I was using testcontainers inside a Jenkins container.
Changing the Jenkins container's network mode from bridge to host, and mapping the Jenkins directory to the same location as it was inside (following testcontainers:Docker in docker), solved this issue.

@aguibert
Contributor

@bsideup I recently hit this error and we are using JUnit 4 and are not using Startables.deepStart.

Here is how the container is defined:

    @ClassRule
    public static CouchDBContainer couchdb = new CouchDBContainer(new ImageFromDockerfile()
                    .withDockerfileFromBuilder(builder -> builder.from("couchdb:1.7")
                                    .copy("/opt/couchdb/etc/local.d/testcontainers_config.ini", "/opt/couchdb/etc/local.d/testcontainers_config.ini")
                                    .copy("/etc/couchdb/cert/couchdb.pem", "/etc/couchdb/cert/couchdb.pem")
                                    .copy("/etc/couchdb/cert/privkey.pem", "/etc/couchdb/cert/privkey.pem")
                                    .build())
                    .withFileFromFile("/opt/couchdb/etc/local.d/testcontainers_config.ini", new File("lib/LibertyFATTestFiles/couchdb-config/testcontainers_config.ini"), 644)
                    .withFileFromFile("/etc/couchdb/cert/couchdb.pem", new File("lib/LibertyFATTestFiles/ssl-certs/couchdb.pem"), 644)
                    .withFileFromFile("/etc/couchdb/cert/privkey.pem", new File("lib/LibertyFATTestFiles/ssl-certs/privkey.pem"), 644))
                                    .withLogConsumer(FATSuite::log);

Here is a link to the src with additional context:
https://github.com/OpenLiberty/open-liberty/blob/master/dev/com.ibm.ws.couchdb_fat/fat/src/com/ibm/ws/couchdb/fat/FATSuite.java#L37

And here is the complete stack trace (notice that it is all stemming from a logging statement in GenericContainer.doStart())

org.testcontainers.containers.ContainerLaunchException: Container startup failed
at org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:320)
at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:300)
at org.testcontainers.containers.GenericContainer.starting(GenericContainer.java:1011)
at org.testcontainers.containers.FailureDetectingExternalResource$1.evaluate(FailureDetectingExternalResource.java:29)
Caused by: com.github.dockerjava.api.exception.ConflictException: {"message":"Conflict. The container name \"/testcontainers-ryuk-efaa6546-d512-42f6-aaea-912af1ec2d81\" is already in use by container \"80e8a01c729781819e8a072fc6b810fe829fb3c33eb91b891e7727ed2145fd23\". You have to remove (or rename) that container to be able to reuse that name."}
at com.github.dockerjava.okhttp.OkHttpInvocationBuilder.execute(OkHttpInvocationBuilder.java:291)
at com.github.dockerjava.okhttp.OkHttpInvocationBuilder.execute(OkHttpInvocationBuilder.java:271)
at com.github.dockerjava.okhttp.OkHttpInvocationBuilder.post(OkHttpInvocationBuilder.java:129)
at com.github.dockerjava.core.exec.CreateContainerCmdExec.execute(CreateContainerCmdExec.java:33)
at com.github.dockerjava.core.exec.CreateContainerCmdExec.execute(CreateContainerCmdExec.java:13)
at com.github.dockerjava.core.exec.AbstrSyncDockerCmdExec.exec(AbstrSyncDockerCmdExec.java:21)
at com.github.dockerjava.core.command.AbstrDockerCmd.exec(AbstrDockerCmd.java:35)
at com.github.dockerjava.core.command.CreateContainerCmdImpl.exec(CreateContainerCmdImpl.java:595)
at org.testcontainers.utility.ResourceReaper.start(ResourceReaper.java:91)
at org.testcontainers.DockerClientFactory.client(DockerClientFactory.java:155)
at org.testcontainers.images.builder.ImageFromDockerfile.resolve(ImageFromDockerfile.java:84)
at org.testcontainers.images.builder.ImageFromDockerfile.resolve(ImageFromDockerfile.java:37)
at org.testcontainers.utility.LazyFuture.getResolvedValue(LazyFuture.java:20)
at org.testcontainers.utility.LazyFuture.get(LazyFuture.java:27)
at org.testcontainers.shaded.com.google.common.util.concurrent.Futures$3.get(Futures.java:1332)
at org.testcontainers.images.RemoteDockerImage.getImageName(RemoteDockerImage.java:97)
at org.testcontainers.images.RemoteDockerImage.imageNameToString(RemoteDockerImage.java:107)
at org.testcontainers.images.RemoteDockerImage.toString(RemoteDockerImage.java:26)
at java.base/java.lang.String.valueOf(String.java:2951)
at java.base/java.lang.StringBuilder.append(StringBuilder.java:168)
at org.testcontainers.containers.GenericContainer.getDockerImageName(GenericContainer.java:1268)
at org.testcontainers.containers.GenericContainer.logger(GenericContainer.java:603)
at org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:309) 

@aguibert
Contributor

@bsideup so I guess my findings are sort of along the same lines as Startables.deepStart, because we have 2 things trying to potentially issue the docker image name at the same time?

I think that if we simply removed that log statement it would fix the issue, but maybe the code needs a more thorough review for getDockerImageName() kicking off a significant amount of work for the ImageFromDockerfile usage pattern, or perhaps we could offer an alternative to getDockerImageName() that can be easily used to identify the docker image name without flowing any docker commands?

@bsideup
Member

bsideup commented Jun 24, 2020

@aguibert FYI I just submitted #2930 that should fix eager resolve of LazyFuture from RemoteDockerImage#toString (that is triggered from GenericContainer#toString, for example).
It is unrelated to the Ryuk problem but seems to be a miss from #2558

BTW from the stacktrace, it looks like you're using a version that is older than 1.14.x, consider updating.
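To illustrate the eager-resolve problem mentioned above, here is a minimal sketch of a lazily resolved value whose toString() never triggers resolution. It is not the library's LazyFuture or the code from #2930; `LazyName` and the `<resolving>` placeholder text are hypothetical choices for this example.

```java
import java.util.function.Supplier;

/** A lazily resolved value whose toString() never triggers resolution. */
class LazyName {
    private final Supplier<String> resolver;
    private volatile String resolved; // null until get() has completed once

    LazyName(Supplier<String> resolver) {
        this.resolver = resolver;
    }

    /** Resolves on the first call (possibly expensive), then returns the cached value. */
    synchronized String get() {
        if (resolved == null) {
            resolved = resolver.get();
        }
        return resolved;
    }

    /** Safe for log statements: reports the value only if it is already known. */
    @Override
    public String toString() {
        String value = resolved;
        return value != null ? value : "<resolving>";
    }
}
```

With this shape, a logging call that formats the object before startup does not start any Docker work; only an explicit get() does.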

@aguibert
Contributor

hmm, we're using 1.14.0, not the latest, but it is 1.14.X

@bsideup
Member

bsideup commented Jun 25, 2020

Okay, after looking at the code, I noticed that we were not caching Ryuk's failure: #2935

So, the actual problem was that ResourceReaper was failing to connect to the started Ryuk container, but another attempt at getting a client would not know that and would attempt to start it again.

This log confirms that:
https://pastebin.com/pZHXKzRF

Caused by: java.lang.IllegalStateException: Can not connect to Ryuk
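The idea of caching the failure can be sketched roughly as follows. This is an illustration of the general technique under simplified assumptions, not the code from #2935; `OnceInitializer` and `RyukStartDemo` are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Runs an initializer at most once and remembers its outcome, success or failure. */
class OnceInitializer<T> {
    private final Supplier<T> initializer;
    private boolean attempted;
    private T value;
    private RuntimeException failure;

    OnceInitializer(Supplier<T> initializer) {
        this.initializer = initializer;
    }

    synchronized T get() {
        if (!attempted) {
            attempted = true;
            try {
                value = initializer.get();
            } catch (RuntimeException e) {
                failure = e; // cache the failure so later callers do not retry
            }
        }
        if (failure != null) {
            throw failure;
        }
        return value;
    }
}

class RyukStartDemo {
    public static void main(String[] args) {
        AtomicInteger startAttempts = new AtomicInteger();
        OnceInitializer<String> ryuk = new OnceInitializer<>(() -> {
            startAttempts.incrementAndGet(); // stands in for creating the named container
            throw new IllegalStateException("Can not connect to Ryuk");
        });
        for (int i = 0; i < 2; i++) {
            try {
                ryuk.get();
            } catch (IllegalStateException e) {
                System.out.println("attempt " + (i + 1) + ": " + e.getMessage());
            }
        }
        System.out.println("container start attempts: " + startAttempts.get());
    }
}
```

Both callers see the original "Can not connect to Ryuk" error and the container is only started once, so the second caller no longer runs into the misleading name conflict.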

@aguibert
Contributor

great find @bsideup, I think this issue can be closed now that #2935 is in right? Or do you want to wait for users to verify?

@enote-kane

Desperately awaiting a release version with these fixes included (Jenkins here).

@bsideup
Member

bsideup commented Jul 10, 2020

@enote-kane well, what was fixed won't fix your Jenkins issue. There is still an issue with "failed to connect to Ryuk" in your environment, just now it will be clear. Before, it was hidden behind "container name already in use" because it was making another attempt at starting Ryuk while it was already running (but had failed to connect to it).

@enote-kane

@bsideup Oh, that's bad. I was wondering why this is the case only on our EC2 Jenkins slaves running Ubuntu whereas we also have other Ubuntu clients where we were unable to reproduce this issue at all.
Is there another bug for the connect problem?

@bsideup
Member

bsideup commented Jul 10, 2020

@enote-kane there are no known bugs with Ryuk. Please check your network settings, firewall, debug connectivity, etc etc.
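As one way to debug connectivity, a plain TCP probe run from inside the build container can show whether a host and port are reachable at all. This is a generic sketch, not part of Testcontainers; `PortProbe` is a hypothetical helper, and the host and port to test (the Docker host address and whichever host port is mapped to the Ryuk container) are assumptions you need to fill in for your environment.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PortProbe {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false; // refused, timed out, or unresolvable host
        }
    }

    public static void main(String[] args) {
        // Hypothetical defaults; pass the real host and mapped port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8080;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 2000));
    }
}
```

If the probe fails from the build container but succeeds from the host, the container's network mode or a firewall rule is a likely suspect.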

@enote-kane

@bsideup Hm, I am confused as I wasn't able to spot any connectivity issues in my log output, but will try to have more debug logging.

@enote-kane

enote-kane commented Jul 10, 2020

OK, after looking again, I think you mean this, right?

21:48:55.030 [main] DEBUG com.github.dockerjava.core.command.AbstrDockerCmd - Cmd: c08828298fdc19f416eada33b0d6a3b844127853f12cdfca7e2e6667cb00ee24
21:48:55.532 [main] DEBUG com.github.dockerjava.core.command.AbstrDockerCmd - Cmd: c08828298fdc19f416eada33b0d6a3b844127853f12cdfca7e2e6667cb00ee24,false
21:48:55.533 [main] DEBUG com.github.dockerjava.core.exec.InspectContainerCmdExec - GET: com.github.dockerjava.okhttp.OkHttpWebTarget@13006998
21:49:25.629 [main] ERROR org.testcontainers.utility.ResourceReaper - Timeout out waiting for Ryuk. Ryuk's log: <<<
2020/07/09 21:48:55 Pinging Docker...
2020/07/09 21:48:55 Docker daemon is available!
2020/07/09 21:48:55 Starting on port 8080...
2020/07/09 21:48:55 Started!

But I don't get why this should be different when run by a Jenkins pipeline than in any other such environment (docker-in-docker).

The actual command that is being executed is like this:

# -u 1001:999 below: jenkins user ID + docker group ID
docker run -t -u 1001:1001 --network=bridge \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -u 1001:999 \
    -w /opt/jenkins/workspace/... \
    -v /opt/jenkins/workspace/...:/opt/jenkins/workspace/...:rw,z \
    -v /opt/jenkins/workspace/...@tmp:/opt/jenkins/workspace/...@tmp:rw,z \
    -e ... \
    maven:3-openjdk-14 \
    mvn --batch-mode --strict-checksums test -DtrimStackTrace=false

So the only thing I could think of is the standard bridge network that gets in the way here, right?

@stale

stale bot commented Oct 11, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you believe this is a mistake, please reply to this comment to keep it open. If there isn't one already, a PR to fix or at least reproduce the problem in a test case will always help us get back on track to tackle this.

@stale stale bot added the stale label Oct 11, 2020
@stale

stale bot commented Oct 25, 2020

This issue has been automatically closed due to inactivity. We apologise if this is still an active problem for you, and would ask you to re-open the issue if this is the case.

@stale stale bot closed this as completed Oct 25, 2020
@martinandersson

Please reopen. I am having this issue on a very uncomplicated "hello world" type of project, rendering the TestContainers library unusable at the moment.

Apr 26, 2021 3:06:41 PM org.junit.jupiter.engine.execution.JupiterEngineExecutionContext close
SEVERE: Caught exception while closing extension context: org.junit.jupiter.engine.descriptor.ClassExtensionContext@d72284c
com.github.dockerjava.api.exception.ConflictException: Status 409: {"message":"Conflict. The container name "/testcontainers-ryuk-f3039b1f-d175-4c4f-981b-aca78f782347" is already in use by container "882f294ea3178cf5d254e78f782e73ee8bd53430d4388b5ecaffab3a9e81ae5b". You have to remove (or rename) that container to be able to reuse that name."}

My Gradle build file:

largeTestImplementation 'org.testcontainers:junit-jupiter:1.15.3'
largeTestImplementation 'org.testcontainers:postgresql:1.15.3'

My machine is a Hyper-V virtual machine (Windows 10 x64) running Docker (20.10.5) on WSL2 (which technically becomes a nested VM). WSL2's distro is Ubuntu 20.04.

9 participants