Extend docker cp to permit copying from images #16079

Closed
WhisperingChaos opened this issue Sep 4, 2015 · 51 comments
Labels
kind/feature Functionality or other elements that the project doesn't currently have. Features are new and shiny

Comments

@WhisperingChaos
Contributor

Permit docker cp to copy from an image but fail when copying to an image.

Perhaps a [docker cp] command that targets an image doesn't need to fail.

Motivation.

Although declined, this feature was implemented as a bash GitHub project, dkrcp.

@jessfraz jessfraz added kind/feature and removed kind/proposal labels Sep 8, 2015
@calavera
Contributor

This would make images mutable objects, and the whole point of images is to be immutable. I'm pretty 👎 on this, but I'd like to hear what other people think about it, especially @jlhawn.

@duglin
Contributor

duglin commented Sep 11, 2015

@calavera I think he wants to just copy stuff from images, not into them, so images would still be immutable. It's an interesting feature. I'd like to hear more about the use cases for it, though.

@jlhawn
Contributor

jlhawn commented Sep 12, 2015

You can already get the desired behavior by simply creating a container using the image first.

docker create ubuntu:14.04 for example, will create a container from the image but not start it. You can copy to/from the container in this state. You just have to remember to docker rm that container when you are done.

@WhisperingChaos
Contributor Author

@calavera

As mentioned in the request, docker cp would fail if it attempted to target an image. This failure would be analogous to traditional cp semantics that reject the command due to the target location's attributes, like a read only directory/file system, or DAC permission conflicts.

Although initially proposed to fail when the target argument referred to an image, what if docker cp accepted an image reference as a target argument? An image's immutable constraint could be preserved by implementing the docker cp to create an image when the specified target:

  • is absent: A new image is created by simply adding the source files to an empty file system. Similar to the following Dockerfile where the build context mirrors the docker cp source file tree:
FROM scratch
COPY . /
  • identifies an existing image: This situation would generate a new image. Similar to the following Dockerfile where the build context mirrors the docker cp source file tree:
FROM <ExistingImageReference>
COPY . /

The naming conventions for the new image would assume the semantics applied by the docker build command.

In other words, a docker cp command that targets an image is implemented as a docker build where the build context mirrors docker cp's source file tree, and the needed Dockerfile can be generated from docker cp's arguments and an added -t option.

@WhisperingChaos
Contributor Author

@jlhawn

You can already get the desired behavior by simply creating a container using the image first.

Not always. Strictly speaking, docker create doesn't fire ONBUILD triggers. Yes, I understand that I would first need to run docker build to create an image using the ONBUILD triggers and then use docker create, but given the near "equivalence" of containers to images, I thought it would be "easy" to implement this feature. It's also been discussed on Stack Overflow.

Also, docker create requires an initiating process be defined (CMD/ENTRYPOINT) in order to create a container. Images aren't required to define an initiating process. In this situation, docker create issues the message: 'Error response from daemon: No command specified' preventing the creation of a container which aborts the copy process. Of course, there are methods to circumvent this too, however, I would suggest it's preferable to provide a simple interface that both directly communicates the ability to perform an image copy operation, as well as encapsulates/hides its implementation.
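
One way around the 'No command specified' error is to pass a placeholder command to docker create. A later comment in this thread does exactly that with "/" as the command, which suggests the command is not validated until the container is started. The following is only a minimal sketch: the helper name create_stopped_container and the path /placeholder-never-run are made up for illustration.

```shell
# Hypothetical helper: create (but never start) a container from an image.
# Prints the new container ID on stdout.
create_stopped_container() {
    # First try the image's own CMD/ENTRYPOINT. If the image defines
    # neither, retry with a placeholder command; it is never executed
    # because the container is never started.
    docker create "$1" 2>/dev/null ||
        docker create "$1" /placeholder-never-run
}
```

The returned ID can then be used with docker cp and docker rm as usual, e.g. cid=$(create_stopped_container myimage:latest). Note that the first attempt's error output is discarded, so a genuine failure (such as a missing image) is only reported by the second attempt.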

@WhisperingChaos
Contributor Author

@duglin

This proposal, especially with the enhanced semantics permitting the docker cp to target images provides:

  • a means to separate build and run-time concerns when constructing new images, eliminating the pollution of the run-time image by build tool chain packages and intermediate build artifacts.
  • a composition mechanism to construct images to complement the inheritance mechanism implemented by FROM.
  • a means to encode a custom build system employing docker images using, for example, a GNU makefile method.

@duglin
Contributor

duglin commented Jan 22, 2016

I'm not seeing a lot of traction on this one. Given the relatively easy solution that @jlhawn mentioned (just create a container, even if you have to give it a dummy command to run, and then use docker cp to pull the files from the container), I think we should close this one. If a new use case is presented that makes this solution totally unbearable for the user, then I think we should reconsider it.

@duglin duglin closed this as completed Jan 22, 2016
@WhisperingChaos
Contributor Author

Since I found it useful, the semantics of creating/evolving images by simply copying files were incorporated into a bash script called dkrcp that's available as a GitHub project.

@mercuriete

+1
Proposal: create a new command called docker extract that only extracts files from an image to the host.
The image doesn't mutate at all.

Use case: I am building my application with docker and then want to extract the binaries to the host.

Current solution: create a Dockerfile, build it, and then run a container for only a few seconds to extract the files with docker cp.

Another proposal: extend COPY so that it can extract files from the image being built to the working folder on the host.

@ndeloof
Contributor

ndeloof commented Jun 14, 2016

I'm looking for this exact feature to implement a two phase docker build (see #7115):

  1. the first Dockerfile will build from source, and as such include sources, tests, and all sorts of intermediate files
  2. the second Dockerfile will package the resulting binary in a clean image with the runtime

To chain those, I need to cp the binary from the image built in step 1. The ability to run docker cp image:path . would make this trivial. Today I have to create a container, run docker cp, then destroy the container. Not a big deal, but it introduces complexity.

@WhisperingChaos
Contributor Author

@mercuriete @ndeloof

Just in case it went unnoticed, there's a bash script named dkrcp, available as a GitHub project, that wraps up several Docker CLI commands to implement this feature. It also supports multiple copy sources and an option to specify Dockerfile commands, and it will gracefully terminate image creation if the copy fails, cleaning up running containers or deleting a newly created image from the local repository.

@mercuriete

@WhisperingChaos
Thank you very much. I will have a look at this script.
I have exactly the same use case as @ndeloof.

  1. use docker to build Java artifacts or binaries in another language.
  2. include these binaries in a smaller runtime image.

Cheers!

@WhisperingChaos
Contributor Author

@mercuriete
You're welcome.
If you find problems with the script, please open a GitHub issue. I just repaired a problem with tilde expansion.

Enjoy!

@wedesoft

As shown here you could run cat or tar:

docker run <image-name> tar -c -C /my/directory subfolder | tar x

@ndeloof
Contributor

ndeloof commented Nov 15, 2016

@wedesoft this assumes the image has the tar command, which one can't guarantee.

@falnos24865

Now that we have an official way to run Docker on a Raspberry Pi, I would like this to be re-opened. We have a use case where we have software running on PCs and RPis. Given that they have different architectures, I CANNOT simply start a container using our image. However, there are some files in the image I would like to extract so we do not have to create multiple (PC and RPi) versions of all of our containers.

@WhisperingChaos
Contributor Author

@falnos24865

Unfortunately, one of my earlier posts suggested that dkrcp operated on running containers to implement the copy operation. It doesn't. Instead, it executes docker create to produce a container to act as a source and/or target for the copy command. docker cp doesn't require running containers in order to operate upon them. Once dkrcp completes, it will remove the temporary containers it created.

I've been using dkrcp with Docker 1.12, within an AMD64 Ubuntu VM to generate minimal images containing a golang executable targeted for the Pi.

Lastly, if you encounter a problem with dkrcp, let me know and I'll address it.

@ndeloof
Contributor

ndeloof commented Jun 8, 2017

This feature is available when building a Dockerfile as a multi-stage build; it seems to me it could be exposed to the CLI as well.

@thaJeztah
Member

@ndeloof see the discussion on #16079

@LLFourn

LLFourn commented Jun 13, 2017

@thaJeztah you linked to the thread we are already in. I'm interested to see where you intended to link :)

@thaJeztah
Member

Oh, lol, wrong link on my clipboard; meant to link to #30449

@cherrot

cherrot commented Jul 5, 2017

docker run --rm <image-name> cat /path/to/file > file_from_image

I like this method :D

@ndeloof
Contributor

ndeloof commented Jul 5, 2017

This assumes cat is available, which might not be the case :P
Please just consider that this feature IS available from multi-stage builds; I wonder why the API doesn't let us play with it.

@thaJeztah
Member

No need to use cat;

docker create --name foo <image-name>
docker cp foo:/path/to/file file_from_image
docker rm foo

@rbi13

rbi13 commented Jul 26, 2017

@binarytemple, I agree. Another way to do this is to simply piggy-back off of container IDs by dropping the --name option altogether:

IMG_ID=$(docker create <image-name>)
...

@binarytemple

binarytemple commented Jul 28, 2017

@rbi13 - Thanks for the pointer, I just thought I'd share my Makefile workaround for building parameterized images and copying out build artifacts. It's pretty ugly (and completely untrustworthy (shell exploit anyone?)) , but works for now.

.PHONY: build run copy_out_rpm

DOCKER_CMD = $(shell which docker)
IMAGE_NAME = riak_rpm_builder
RPM_PATH=/root/riak/distdir/packages/
RPM_FILE=riak-${RIAK_VER}-1.el7.centos.x86_64.rpm

# Fail early when the named environment variable is not set.
guard-%:
	@ if [ -z '${${*}}' ]; then echo 'Environment variable $* not set' && exit 1; fi

build: guard-RIAK_VER
	${DOCKER_CMD} -D build --build-arg riak_ver=${RIAK_VER} -t riak_centos:${RIAK_VER} -f Dockerfile .

# Create a stopped container, copy the RPM out of it, then remove the container.
copy_out_rpm: guard-RIAK_VER build
	TMP_CTR=$$(${DOCKER_CMD} create ${IMAGE_NAME}); \
	${DOCKER_CMD} cp $$TMP_CTR:${RPM_PATH}${RPM_FILE} ./${RPM_FILE}; \
	${DOCKER_CMD} rm $$TMP_CTR

@sbussard

I'm concerned that this multi-step process is quite a learning curve for someone getting started with docker. It takes research. Everyone's doing it differently. Docker is already hard to learn.

Why not make it easier for newcomers? What's the downside?

I'm not convinced that the reasons for closing this ticket are adequate enough to discount the potential benefit to the community. I strongly advocate reopening this issue.

@Murmur

Murmur commented Aug 30, 2017

My use case is modifying a Tomcat8 (Java J2EE server) base image for a new application. I would like to see a simple command to extract a file from the image, so that I can modify the file and use it in a Dockerfile.

## mockup command use-case for clarity
## custom script modifies xml file at runtime such as adding missing jdbc datasources,
## everything else in xml config file is used as-is
## dockerfile inserts a modified tmp/server.xml to a new image
docker copy basetomcat8:latest /usr/local/tomcat/conf/server.xml /tmp/server.xml
./dothemagic.sh /tmp/server.xml
docker build -t myapp-test:1 ./dockertest

@WhisperingChaos
Contributor Author

@Murmur

Docker's relatively new multi-stage build feature obsolesces the need for this enhancement when using docker build to construct a resultant image. Although I welcome the concept of multi-stage builds, I myself avoid using Docker's current implementation of this feature due to its technical flaws: pathological coupling and poor instruction orthogonality described in great detail by moby/buildkit#4246 and this comment. That said, if you can stomach these flaws, you can implement what you describe above using multi-stage build.

@asottile
Contributor

asottile commented Oct 3, 2017

I was reaching for the same thing (in my case to add a userspace so I could docker exec and investigate a scratch + go binary container (with no userspace otherwise)) so I wrote up a little script to do this

It essentially does:

  • container_id = docker create $img
  • docker export -o tar.gz $container_id
  • tar -xf tar.gz
  • for toplevel in ls tar: docker cp $thing $container:/

@cfriedt

cfriedt commented Oct 17, 2017

@Murmur

Multi-stage builds obsolete this feature for the purpose of copying in to an image (although it seems impossible to do so and also copy files to /). However, multi-stage builds definitely do not obsolete the need for this feature for copying out of an image. The image continues to be immutable for the purpose of copying a file out.

Note, 'docker export' also does not work for copying out of an image.

Consider the following file, 'hello.txt':

hi!

Consider the following Dockerfile:

FROM scratch
COPY hello.txt /

Build an image from the above Dockerfile.

$ docker build -t trivial .
Sending build context to Docker daemon  3.072kB
Step 1/2 : FROM scratch
 ---> 
Step 2/2 : COPY hello.txt /
 ---> 3a34db2e35d3
Removing intermediate container c84cd3c72f3a
Successfully built 3a34db2e35d3
Successfully tagged trivial:latest
# now remove hello.txt
rm hello.txt

How do you suggest one should extract the file hello.txt from the image (i.e. not a running container)?

$ docker images trivial
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
trivial             latest              3a34db2e35d3        4 minutes ago       4B
$ docker cp 3a34db2e35d3:/hello.txt hello
Error response from daemon: No such container: 3a34db2e35d3

My solution was the following, which is a complete pain in the ass, and could potentially have unintended consequences. I create a stopped container from the image, which has an entry point of "/" (obviously running the container will fail). I then extract the file from the stopped container, and delete the container.

$ CONTAINER="$(docker create --name trivial-latest trivial:latest /)"
$ docker cp ${CONTAINER}:/hello.txt .
$ sudo docker rm ${CONTAINER}
$ cat hello.txt
hi!

IMHO, creating a stopped container just to copy files out of an image is a waste of resources, but that is just my HO.

@thaJeztah
Member

IMHO, creating a stopped container just to copy files out of an image is a waste of resources, but that is just my HO.

Copying from an image directly would likely do the same, as the image's filesystem still has to be mounted before anything can be copied from it.

@Murmur

Murmur commented Oct 20, 2017

I use the following commands to copy a single file to a local disk: they create a non-running container from the image, copy the file, and then delete the named temp container. This trick is not my discovery; this thread and Stack Overflow helped me work it out.

$ docker create --name temp1 docker-registry.customer.com/repo/tomcat8-jre
$ docker ps --all
$ docker cp temp1:/usr/local/tomcat/conf/server.xml ./build/server.xml
$ docker rm temp1

To copy all files from the image, I use a similar command sequence.

$ docker create --name temp1 docker-registry.customer.com/repo/tomcat8-jre
$ mkdir tomcat8-jre
$ docker cp temp1:/  ./tomcat8-jre
$ docker rm temp1
$ docker ps --all

Not a very intuitive way to achieve a simple task, but once you know this black magic trick you move on to the next challenges in life.

@cfriedt

cfriedt commented Oct 23, 2017

Extra steps. I guess it just means life is less convenient for users :P

@omkadiri

I think it would make sense to copy files to a volume. A use case could be a multi-stage build that checks out code and builds it into an image; just like the COPY --from=build src dest functionality, I should be able to do something like:

docker run -it -v <image-name>:/usr/src <other-image-name> command

or the equivalent in a docker-compose file.

@AkihiroSuda
Member

@duglin @thaJeztah @tonistiigi

I understand this can be done by combining docker create, docker cp, and docker rm, but can we consider reopening this for better UX?

If adding this feature is unacceptable for docker cp, I think we can implement it as docker image blahblahblah, which internally calls the equivalent of docker create; docker cp; docker rm.

@thaJeztah thaJeztah added this to backlog in maintainers-session Feb 6, 2018
@thaJeztah
Member

@AkihiroSuda discussing with @tonistiigi - if we understand the use-case correctly, the primary use would be to copy build artefacts from a built image to the host. Given that BuildKit would allow exporting the artefact directly to the host, perhaps we should have a look at that (i.e. integrating BuildKit?)

@thaJeztah thaJeztah removed this from backlog in maintainers-session Feb 8, 2018
@AkihiroSuda
Member

In addition to that, I think this proposal is useful for copying binaries from pulled images. WDYT?

@RXminuS

RXminuS commented Mar 18, 2018

@thaJeztah I think so, my use-case would be to copy a static site out of an integration-tested image to submit to a CDN.

To elaborate: We use MultiStage builds to take a website -> minify etc. -> make a minimal nginx image with the static site. This image is used for integration testing. But then when that is done the actual hosting is done by a CDN. So I want to get the static site out of the image, zip the files and submit to the CDN. But I don't want to include the CDN API / curl in the "integration testing" container and run the command inside the container.

@sergey-litvinov

I'll add one more use case for that. We use a Docker multi-stage build to build an Ember app, and then host it in an nginx container.

So we have two build steps. But we are also running QUnit tests for the Ember app inside the first build image, and we want to save the unit test results on the actual host, so the CI agent can pick them up, parse them, and display the test results in the CI's UI. docker create/cp/rm would help here, but if the cp command could do it right away, it would be easier.

@slavikme

slavikme commented May 1, 2018

I agree, docker must support such a feature to copy a file or folder from an image to the host.
This feature is very useful and I use it a lot, especially for retrieving base configuration files from different services, modify them accordingly and then mount them to a container of the very same image.

So, instead of currently doing (as @binarytemple suggested):

IMG_ID=$(dd if=/dev/urandom bs=1k count=1 2> /dev/null | LC_CTYPE=C tr -cd "a-z0-9" | cut -c 1-22)
docker create --name ${IMG_ID} <image-name>
docker cp ${IMG_ID}:/path/to/file file_from_image
docker rm ${IMG_ID}

We need to have something like:

docker cpimage [docker-run-options] /etc/some-service.conf ./ <image-name>[:<tag>]

Or copying an entire folder:

docker cpimage [docker-run-options] /etc/some-service.d ./ <image-name>[:<tag>]

Basically, the following syntax could work:

docker cpimage [docker-run-options] <source-path-on-image> <dest-path-on-host> <image-name>[:<tag>]

@muxator

muxator commented Oct 12, 2020

I agree with @slavikme.

My use case involves using Docker to build a static binary that is then run on bare metal. In this way, any linux distribution supporting docker can be used to build the application. I am currently resorting to using:

CID=$(docker create image-name:latest)
docker cp ${CID}:/opt/binary-name ./binary-name
docker rm ${CID}

This approach, however, is not optimal: apart from requiring scripting, it is prone to subtle breakages, because the three-command sequence is not atomic and performs side effects on the machine it runs on.

Having a single command that extracts one or more files from an image would really be a better solution.
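
Until such a command exists, the three-command sequence can at least be wrapped so that the temporary container is removed even when the copy fails. This is only a sketch under stated assumptions: the helper name extract_from_image is made up, and the wrapper is still not atomic (an interrupted script can leave the container behind).

```shell
# Hypothetical helper: copy SRC from IMAGE to DEST on the host.
# usage: extract_from_image IMAGE SRC DEST
extract_from_image() (
    # The function body runs in a subshell, so cid and status do not leak
    # into the calling shell.
    cid=$(docker create "$1") || exit 1
    status=0
    docker cp "$cid:$2" "$3" || status=$?
    # Remove the temporary container whether or not the copy succeeded.
    docker rm "$cid" >/dev/null || true
    # Report the result of the copy, not of the cleanup.
    exit "$status"
)
```

With this in place, the example above becomes extract_from_image image-name:latest /opt/binary-name ./binary-name.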

@AkihiroSuda
Member

@muxator does docker build --output /some/dir work for you? It was added in Docker 19.03. It needs DOCKER_BUILDKIT=1 to be set.

@muxator

muxator commented Oct 12, 2020

Hi @AkihiroSuda, thanks for the suggestion!

The Docker version on my ubuntu 20.04 is 19.03.8 and thus it is compatible with your hint. The documentation at https://docs.docker.com/engine/reference/commandline/build/#custom-build-outputs mentions it, too. This is probably a case of getting lost in old material through search engines.

For future reference, a minimal working example would be:

FROM ubuntu:20.04 as build-stage

# placeholder for commands that perform the build and put the
# statically linked binary in <release_path>/<binary_name>
RUN <build_commands>

# Copy just the statically linked artifact in the root directory
# of an empty container
FROM scratch
COPY --from=build-stage <release_path>/<binary_name> /<binary_name>

To run the build and put the generated binary in a directory, this command would then be sufficient:

DOCKER_BUILDKIT=1 docker build --output <dest_path> <dockerfile_path>

In general, that command would put the whole generated image contents in <dest_path>. But since the last stage is a scratch image containing just one file, it results in putting just that file in the desired position.

Thank you very much.

p.s.: as a side note, it seems that the BuildKit subsystem is completely separate from the "normal" docker workflow. For example, the first time I ran the build, docker had to re-run the whole Dockerfile (including preliminary package installs that supposedly were already in the cache).

@CodingInvoker

I agree 100% with @sbussard.

I'm concerned that this multi-step process is quite a learning curve for someone getting started with docker. It takes research. Everyone's doing it differently. Docker is already hard to learn.

Why not make it easier for newcomers? What's the downside?

I'm not convinced that the reasons for closing this ticket are adequate enough to discount the potential benefit to the community. I strongly advocate reopening this issue.

This issue was opened on Sep 4, 2015. Right now it's Mar 17, 2021 and the discussion is still ongoing. Maybe it's time we look at it seriously.

To summarize (and please correct me if I am wrong), most of the proposed solutions are essentially to:

  1. Create/run a container first
  2. Grab the wanted file/data from that container
  3. Remove the container that you created/started in the first step after successfully copying the desired file.

This works for sure, and it solved my problem of grabbing a JSON file inside an image. As a docker beginner, whenever there is some simple thing that I assume I can find in the official docker documentation, I go there first.

If a docker beginner encounters the same issue, chances are they will check the official docker documentation as well, so it would be really nice and user friendly if there were something like docker cp imageID myLocalPath --image, or something parallel to docker cp, like docker extract or docker cpi, to copy stuff from an image, and if it were clearly listed in the official documentation.

Another question I am curious about: is it really that hard to support copying a file from a docker image to the local file system? Is it because, compared to a docker container, a docker image doesn't have a file system? But essentially a docker image is just an immutable image file with multiple layers, right? Why can't we read data from it directly?

@thaJeztah
Member

This works for sure, and it solved my problem of grabbing a JSON file inside an image. As a docker beginner, whenever there is some simple thing that I assume I can find in the official docker documentation, I go there first.

Perhaps an example could be added to the docs; that said, it's not a very common scenario to copy files from an image (without also having a container related to that image)

But essentially a docker image is just an immutable image file with multiple layers, right? Why can't we read data from it directly?

Each of those layers has to be mounted to construct the final representation (e.g. if layer1 adds 2 files, layer2 removes one of those files, and layer3 updates one of those files, all three changes have to be applied on top of each other to know the final result); effectively, copying from an image would thus be the equivalent of docker container create <image>, docker cp (files from container), and docker container rm <temporary container>.
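
To illustrate why the layers cannot simply be read one by one, here is a toy sketch of flattening layers in order. It is not how Docker implements this (storage drivers typically use a union filesystem instead of copying); plain directories stand in for layer tarballs, the helper name apply_layers is made up, and only the OCI-style .wh.NAME whiteout marker for deleted files is handled (opaque-directory whiteouts are ignored).

```shell
# Hypothetical helper: flatten layer directories into one root filesystem.
# usage: apply_layers ROOTFS LAYER1 [LAYER2 ...]   (lowest layer first)
apply_layers() (
    rootfs=$1
    shift
    mkdir -p "$rootfs"
    for layer in "$@"; do
        # 1. Whiteouts: a file named .wh.NAME deletes NAME from the
        #    result of the layers below.
        (cd "$layer" && find . -name '.wh.*') | while IFS= read -r w; do
            name=$(basename "$w")
            rm -rf "$rootfs/$(dirname "$w")/${name#.wh.}"
        done
        # 2. Regular files: copy them over whatever lower layers provided.
        (cd "$layer" && find . -type f ! -name '.wh.*') | while IFS= read -r f; do
            mkdir -p "$rootfs/$(dirname "$f")"
            cp "$layer/$f" "$rootfs/$f"
        done
    done
)
```

For the example above, layer1 would contain two files, layer2 an empty marker file such as .wh.a.txt, and layer3 a new version of the remaining file; apply_layers rootfs layer1 layer2 layer3 then leaves only the updated file in rootfs. Reading any single layer in isolation would give a different answer.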

@wiktor-k
Contributor

Perhaps an example could be added to the docs; that said, it's not a very common scenario to copy files from an image (without also having a container related to that image)

I guess it depends on the "not very common scenario", but I'm copying files from an image when I'm using docker only to build the app artifact (e.g. a deb file), not to run the app. Thus I can have reproducible builds using Docker even when a customer is not using docker to run the app (believe it or not, there are quite a few companies that don't run systems on Docker).

@thaJeztah
Member

I guess it depends on the "not very common scenario" but I'm copying files from an image when I'm using docker to only build the app artifact (e.g. deb file) not to run the app.

You may be interested in enabling BuildKit (DOCKER_BUILDKIT=1) and using the --output option (https://docs.docker.com/engine/reference/commandline/build/#custom-build-outputs)

Creating a project directory with some "source files"

$ mkdir -p myproject/src/build
$ cd myproject
$ touch src/source-file1 src/source-file2

A multi-stage Dockerfile to illustrate a "build-stage" that produces a my-binary artifact;

# Build-stage: builds "my-binary" from source
FROM busybox AS buildstage
COPY src src
RUN echo "I built this from src" > /src/build/my-binary

# Stage for --output; must only contain files to ship
FROM scratch AS artifacts
COPY --from=buildstage /src/build/my-binary /

Now, with buildkit enabled;

$ mkdir shipit
$ DOCKER_BUILDKIT=1 docker build --output=shipit .

$ ls -la shipit/
total 8
drwxr-xr-x  3 sebastiaan  staff   96 Mar 17 17:42 ./
drwxr-xr-x  5 sebastiaan  staff  160 Mar 17 17:42 ../
-rw-r--r--  1 sebastiaan  staff   22 Mar 17 17:42 my-binary

$ cat shipit/my-binary
I built this from src

No image is created in this example (as we didn't tag it), but the BuildKit build-cache will be used for repeated builds

@wiktor-k
Contributor

Wow, this looks great! Thanks @thaJeztah!

@thaJeztah
Member

The --output option can be really useful; if you want to have the option to either build as image or only have the artifacts, you can create a specific stage for the --output scenario, and a "regular" last stage, e.g. in the Dockerfile we use to build our documentation, we have a deploy-source stage for when we only need the artifacts to deploy, and the final stage contains both the source and an nginx server for local previewing.
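
A sketch of that two-mode setup, assuming a Dockerfile that has an artifact-only stage named artifacts (as in the earlier example) followed by a regular final stage. The function names, the shipit directory, and the myproject:latest tag are placeholders.

```shell
# Export only the files of the "artifacts" stage to ./shipit on the host.
# --target stops the build at the named stage; no image is tagged.
build_artifacts() {
    DOCKER_BUILDKIT=1 docker build --target artifacts --output=shipit .
}

# Build the final stage of the same Dockerfile as a regular, tagged image.
build_image() {
    DOCKER_BUILDKIT=1 docker build -t myproject:latest .
}
```

Both functions are meant to be run from the directory containing the Dockerfile; which one to call depends on whether only the artifacts or a runnable image is needed.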
