Proposal: mount from image #30449

Open
justincormack opened this issue Jan 25, 2017 · 32 comments
Labels
area/volumes kind/enhancement Enhancements are not bugs or new features but can improve usability or performance.

Comments

@justincormack
Contributor

There are many cases where it is useful to mount an immutable data volume into a container, with configuration or static data (or code, it could be an apt repo or whatever).

I am proposing we add a new mount type to --mount (or whatever --mount is renamed to), so you can do the following. It would also be supported in compose files/bundles.

docker service create --mount type=image,src=alpine:3.5,dest=/alpine ...

This would fetch the image from the repository if necessary, allowing tags or hashes to be specified, and mount it at the specified mountpoint. The mount would always be read-only. Docker would unpack the image and mount it, presumably using layers, but that would be an implementation detail: because the mount is read-only, the layering would not be visible.
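To make the proposed semantics concrete, here is a minimal Python sketch of how such a mount spec might be parsed. It is only an illustration under stated assumptions: `type=image` does not exist in Docker, and `ImageMount` and `parse_image_mount` are hypothetical names invented for this example. The one rule taken from the proposal is that the mount is always read-only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageMount:
    """A parsed type=image mount (hypothetical; not an existing Docker mount type)."""
    src: str               # image reference, e.g. "alpine:3.5"
    dest: str              # mountpoint inside the container
    readonly: bool = True  # the proposal says image mounts are always read-only


def parse_image_mount(spec: str) -> ImageMount:
    """Parse a comma-separated key=value spec such as
    "type=image,src=alpine:3.5,dest=/alpine"."""
    fields = {}
    for part in spec.split(","):
        key, sep, value = part.partition("=")
        if not sep or not key or not value:
            raise ValueError(f"malformed field: {part!r}")
        fields[key.strip()] = value.strip()

    if fields.get("type") != "image":
        raise ValueError("this sketch only handles type=image")
    if "src" not in fields or "dest" not in fields:
        raise ValueError("both src and dest are required")
    if not fields["dest"].startswith("/"):
        raise ValueError("dest must be an absolute path")
    return ImageMount(src=fields["src"], dest=fields["dest"])


mount = parse_image_mount("type=image,src=alpine:3.5,dest=/alpine")
print(mount.src, mount.dest, mount.readonly)
```

The sketch deliberately offers no way to request a writable mount, mirroring the read-only starting point described above.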

Use cases include configuration data (not secrets, but miscellaneous scripts and config), actual data, apt repos, npm repos etc. If a hash was used it would be guaranteed to be consistent across multiple tasks in a service.
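The consistency argument rests on content addressing: a reference pinned by hash names exactly one piece of content, while a tag can move. The following Python sketch shows the idea under simplifying assumptions. The `content` bytes stand in for whatever the digest actually covers (in a real registry, the image manifest), and `is_pinned` and `verify_content` are hypothetical helper names, not Docker APIs.

```python
import hashlib
import re

# name@sha256:<64 hex chars>; tags such as "alpine:3.5" do not match.
_DIGEST_REF = re.compile(r"^(?P<name>[^@]+)@sha256:(?P<hex>[0-9a-f]{64})$")


def is_pinned(ref: str) -> bool:
    """Return True if the reference is pinned by digest rather than by tag."""
    return _DIGEST_REF.match(ref) is not None


def verify_content(ref: str, content: bytes) -> bool:
    """Check that `content` hashes to the digest named in a pinned reference."""
    match = _DIGEST_REF.match(ref)
    if match is None:
        raise ValueError("reference is not pinned by digest")
    return hashlib.sha256(content).hexdigest() == match.group("hex")


content = b"stand-in for an image manifest"
ref = "example/config@sha256:" + hashlib.sha256(content).hexdigest()

print(is_pinned(ref))                    # True
print(is_pinned("alpine:3.5"))           # False
print(verify_content(ref, content))      # True
print(verify_content(ref, b"tampered"))  # False
```

Every task that resolves the same pinned reference either gets identical bytes or fails verification, which is what makes the data consistent across tasks.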

cc @AkihiroSuda @tonistiigi @stevvooe

justincormack added the area/volumes and kind/enhancement labels on Jan 25, 2017

@thaJeztah
Member

Slightly related; #16079. Wondering; would there be use-cases to have a writable layer on top of that mounted image? Would that (technically) be possible?

@stevvooe
Contributor

This is huge!

> Wondering; would there be use-cases to have a writable layer on top of that mounted image? Would that (technically) be possible?

Baby steps, please.

It would also be excellent if you could pluck from paths in the image:

docker service create --mount type=image,src=alpine:3.5/theconfig.cfg,dest=/etc/myapp.cfg ...

The syntax above is bogus, but this is the killer feature.

@justincormack
Contributor Author

@thaJeztah a writeable layer would be possible as a future change, but I think there are many advantages to starting with read only.

@stevvooe adding paths should be easy, as it is just a different bind from the same image. Probably would want path=... so less microformatty.
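If a path parameter were added, the implementation would need to map that path onto the unpacked image root without letting it escape. A minimal Python sketch of that mapping follows, assuming the image has been unpacked to a directory called `rootfs`. The function name `resolve_in_image` is hypothetical, and the sketch does not handle symlinks inside the image, which a real implementation would have to resolve carefully.

```python
import posixpath


def resolve_in_image(rootfs: str, path: str) -> str:
    """Map `path` (as seen inside the image) to a location under `rootfs`.

    The path is normalised relative to "/" first, so ".." components can
    never climb above the image root.
    """
    normalised = posixpath.normpath(posixpath.join("/", path))
    relative = normalised.lstrip("/")
    if not relative:
        return rootfs  # an empty path or "/" means the whole image
    return posixpath.join(rootfs, relative)


rootfs = "/var/lib/example/rootfs"  # hypothetical unpack location
print(resolve_in_image(rootfs, "theconfig.cfg"))
print(resolve_in_image(rootfs, "../../etc/passwd"))
```

The second call stays inside the image root: it resolves to `/var/lib/example/rootfs/etc/passwd`, not the host's `/etc/passwd`.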

@thaJeztah
Member

Thanks, I was just getting the future options clear; there is absolutely no need to have r/w initially.

@stevvooe
Contributor

> @stevvooe adding paths should be easy, as it is just a different bind from the same image. Probably would want path=... so less microformatty.

Agreed. A path parameter would make that a whole lot less awful.

@aeijdenberg
Contributor

In case it's of interest, I submitted a proof-of-concept patch yesterday on another issue that does something similar. Instead of mounting into a container, it would allow a Dockerfile to kick off an independent build and then snarf the final layer it created back into the current image build. Our use case is more about pulling a bunch of git code in, though the demo I put in the issue uses it as a build system.

See: #14704 (comment)

@stevvooe
Contributor

@aeijdenberg That is a very different use case than being able to mount an image. Mounting an image allows you to effectively have data images. This is very interesting for deploying projects that require a common app server, such as java, php, fastcgi or python (wsgi). It is also very interesting for distributing common configuration throughout a cluster of machines, which has been somewhat of a pain point. Combined with something like PIVOT, you could also construct build harnesses.

If you are interested in mounting a git repo, I do have a POC volume driver that could be whipped into shape, but I am not sure I have the bandwidth to support something like that. Please let me know.

@aeijdenberg
Contributor

Hi @stevvooe, thanks for your reply. Yes, it likely is a bit of a different case, but since this issue came up a few times when I was searching for prior work, I figured there were enough keywords to make it worth putting a link in.

What we're looking to do is slightly more complicated than just mounting a git repo - we do make some small changes on top of it during our image build process - which layers do feel quite well suited for. Just still figuring out the best practices for building them out efficiently. :)

@tomfotherby
Contributor

> If a hash was used it would be guaranteed to be consistent across multiple tasks in a service.

A small thing but something that bit me: Consistency isn't currently always guaranteed. Example: If you create a container image with some data exposed in a volume, when you run the container, the mtime of folders within the volume is set to when the container was run rather than when the container image was created. Logged in #17018.

@stevvooe
Contributor

stevvooe commented Feb 4, 2017

@tomfotherby I am not sure how that is at all related here. This proposal has nothing to do with volumes.

@tomfotherby
Contributor

tomfotherby commented Feb 6, 2017

@stevvooe - Sorry, let me explain in more depth.
Let's say this issue is implemented and the immutable data volume that we want to mount into a "web-server container" is website code. The data container that we are going to mount is built like this:

FROM scratch
COPY . /var/www
VOLUME ["/var/www"]
ENTRYPOINT ["true"]

Let's say we launch 5 web servers with the data volume mounted in via --volumes-from, and they start slowly, maybe 1s apart. The problem is that the mtime of the folders in the /var/www volume will not be consistent across all web servers. This matters in some situations. For example, some web frameworks build JavaScript bundles with a filename that is a hash of the contents of the folder, including mtime, meaning each web server will generate a differently named bundle, and this can break the website.

So until #17018 is resolved, this ticket wouldn't be 100% useful in my use case. I understand that it's maybe an edge case and safe to ignore.
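The effect described here can be reproduced without Docker. The Python sketch below is an illustration only: `tree_hash` is a made-up stand-in for a framework's bundle-naming hash, not any specific framework's algorithm. It builds two directories with identical files whose folder mtimes differ by one second, then hashes each with and without mtimes.

```python
import hashlib
import os
import tempfile


def tree_hash(root: str, include_mtime: bool) -> str:
    """Hash relative paths and file contents under `root`,
    optionally mixing in the mtime of every folder and file."""
    h = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        entries = [dirpath] + [os.path.join(dirpath, f) for f in sorted(filenames)]
        for entry in entries:
            h.update(os.path.relpath(entry, root).encode())
            if include_mtime:
                h.update(str(os.stat(entry).st_mtime_ns).encode())
            if os.path.isfile(entry):
                with open(entry, "rb") as fh:
                    h.update(fh.read())
    return h.hexdigest()


def make_site(folder_mtime: int) -> str:
    """Create a tiny site directory whose folder mtime is `folder_mtime` (seconds)."""
    root = tempfile.mkdtemp()
    script = os.path.join(root, "app.js")
    with open(script, "w") as fh:
        fh.write("console.log('hi');\n")
    os.utime(script, (1_000_000, 1_000_000))     # file mtime is identical on both
    os.utime(root, (folder_mtime, folder_mtime))  # only the folder mtime differs
    return root


server_a = make_site(folder_mtime=1_486_000_000)
server_b = make_site(folder_mtime=1_486_000_001)  # started one second later

print(tree_hash(server_a, True) == tree_hash(server_b, True))    # False
print(tree_hash(server_a, False) == tree_hash(server_b, False))  # True
```

With mtimes included, the two servers compute different hashes for the same content; with content alone, they agree.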

@dnephin
Member

dnephin commented Feb 6, 2017

I believe (and please correct me if I'm wrong) what you're describing is something like the behaviour of --volumes-from. I think this proposal would ignore any volumes in the image. So the data image would be something like this:

FROM scratch
COPY . /

I suspect the implementation wouldn't have the same mtime issue, because it wouldn't be using volumes.

@tomfotherby
Contributor

@dnephin - Thanks for your explanation. Yes I was talking about --volumes-from. It's great this implementation wouldn't be using volumes.

@tonistiigi
Member

I see this proposal as quite risky. I wouldn't want anyone to use it to include any code or other dependencies required for running the image.

The real issue behind this seems to be the composability problems in the builder. The use case to share data is clearly very useful, but it should be defined at build time. Having images in the hub that don't work unless the user specifies a specific mountpoint is a very bad UX. An image should define all of its dependencies behind its digest.

If using the current builder is inefficient for this problem and doesn't allow maximum data reuse, it should be fixed instead. One (extreme) way would be to commit the digest references of the mounts into the image config. Then at least in the low level, this would be equal with the current proposal.

I'd like to address the configuration loading separately. It is true that this could be used to load configuration as well, although I don't see it as the main use case. First, it wouldn't be simple: the Docker image format is not designed for this, and it would be quite hard for users to generate these "configuration-images" or make modifications to them. At the same time, the "configuration-images" would pollute the registry and docker images with images that can't be run. I do see the benefits of immutable volumes for loading configuration. I have nothing against loading files directly from github or having a way to create snapshot volumes based on local data that would be available to services. But I don't want to expose many new problems.

@stevvooe
Contributor

@tonistiigi I don't see how this can cause any real problems.

> The use case to share data is clearly very useful, but it should be defined at build time.

This is the whole point: not everything is known at build time. This is not a build problem. Deployment is an emergent configuration of build time artifacts and environment. Mounting configuration from a data image is part of that environment. Ultimately, this is a runtime dependency, likely shipped from different teams. Making both teams coordinate putting different configuration layers on top of each other is the problem (we also see this when understanding the use cases of pods). Time and time again I have seen users who do not want to push their own configuration on top of a registry image due to operational constraints, only to then struggle with distributing content to make a bind mount.

> First, it wouldn't be simple: the Docker image format is not designed for this, and it would be quite hard for users to generate these "configuration-images" or make modifications to them.

Making this statement would be the same as saying docker image format isn't good for anything. Generating these images would be as simple as having a copy statement into a docker image. Is there something I am missing here?

> At the same time, the "configuration-images" would pollute the registry and docker images with images that can't be run.

This is a minor issue. We could introduce a special type of image for this, if this is a real problem, or we could have them cat the configuration by default or even apply a templating step.

This is a longstanding ask in docker and really does cover a number of use cases. I think the fact that we have had volume plugins for some time but no solutions shows the approach is fundamentally flawed. Personally, I have had nothing but problems with docker volumes and in practice they have solved nothing (people still say "don't run containers with data"). If this were possible with volumes, we would already see plugins that exist, but they aren't there (I have a git implementation, but where are the others?). We can do better.

These immutable mounts can share a massive piece of infrastructure with very little extra work. In fact, we already have the build, the distribution infrastructure, the file format and the command structure in mounts. Let's leverage this and stamp out this use case.

@stevvooe
Contributor

Let's create this convention:

FROM config
COPY . /

@tonistiigi
Member

@stevvooe So the mount wouldn't work if it isn't based on that image?

@AkihiroSuda
Member

> FROM config

I guess it would be useful for non-config data (e.g. large scientific data split from the program container), so I wonder whether calling it config could be confusing.

How about FROM scratch (from anything should be ok) and some label like LABEL com.docker.image.unrunnable?
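As a rough sketch of how a label-based convention could be checked, the Python below inspects an image config shaped like the `Config.Labels` section of `docker image inspect` output. The label name comes from the suggestion above and is not an existing Docker convention; `is_data_only` and `check_runnable` are hypothetical helpers, and treating any value other than "false" or "0" as set is an assumption of this sketch.

```python
UNRUNNABLE_LABEL = "com.docker.image.unrunnable"  # suggested in this thread; hypothetical


def is_data_only(inspect: dict) -> bool:
    """Return True if the image config carries the (hypothetical) unrunnable label."""
    labels = (inspect.get("Config") or {}).get("Labels") or {}  # Labels may be null
    if UNRUNNABLE_LABEL not in labels:
        return False
    return labels[UNRUNNABLE_LABEL].strip().lower() not in ("false", "0")


def check_runnable(inspect: dict) -> None:
    """Refuse to run a data-only image; such images would only be mounted."""
    if is_data_only(inspect):
        raise RuntimeError("image is marked data-only and can only be mounted")


data_image = {"Config": {"Labels": {UNRUNNABLE_LABEL: "true"}}}
app_image = {"Config": {"Labels": None}}

print(is_data_only(data_image))  # True
print(is_data_only(app_image))   # False
```

A label keeps the base image unconstrained, so the same check works for config and for large non-config data alike.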

@stevvooe
Contributor

> @stevvooe So the mount wouldn't work if it isn't based on that image?

I don't think we need that limitation. I was just thinking of a convention that we could use to differentiate.

I just don't understand why plucking something out of the rootfs of an image is such a problem.

@AkihiroSuda
Member

FYI: we now have docker run --mount type=..., and it would be easy to add a new mount type.

#32251

@omkadiri

I think it would make sense to copy files to a volume. A use case could be a multi-stage build that checks out code and builds it into an image. Just like the COPY --from=build src dest functionality, I should be able to do something like:

docker run -it -v <image-name>:/usr/src <other-image-name> command

or

docker run --mount type=image,src=<image-name>,dest=/usr/src <other-image-name> command

or the equivalent in a docker-compose file.

@ChristianKniep

I am all for this proposal. What are the next steps?

@mitar

mitar commented May 21, 2018

To me, the important part is avoiding a copy of the data from an image and instead mounting it directly into a container. Once an image is built and has files that I want to expose to another container, I would prefer to have only one copy of the data around.

@thaJeztah
Member

This would probably be interesting in combination with #39041 (#32582) as well; for example; a service's image that has static files; allowing another service to mount the subdirectory containing the static files from that image

@thaJeztah
Member

FWIW; docker build allows this when using BuildKit (DOCKER_BUILDKIT=1) using the experimental RUN --mount syntax (allowing you to mount an image)

@ChristianKniep

@thaJeztah True, but that could also be achieved via a multi-stage build.
The neat thing about having a volume populated from an image has already been stated: reproducibility :)

@thaJeztah
Member

@ChristianKniep sorry, perhaps my comment was unclear; I meant to say "mounting from an image is not unprecedented; we now allow it in docker build, so might as well support it on docker run as well"

@alehaa

alehaa commented May 1, 2020

I came across the same issue when I tried to deploy a PHP application with nginx, as nginx should serve assets directly, while all other requests get passed to the PHP container. I thought about the following options:

  1. Build two images for nginx and php-fpm, each including the necessary sources. While the PHP image would include not only the sources but also the required dependencies, the nginx image wouldn't have any extras except the same sources. In this scenario I need to maintain two images and can't benefit from quickly deployed security fixes upstream in the official images.

  2. One could check out the compiled sources and assets with any other tool and mount them into both containers. Let's assume you want to deploy a simple LDAP phonebook. First, anyone who has this application deployed needs to manually track and apply updates, which doesn't scale. Second, this scenario needs a specialized PHP container with its dependencies installed, too. In addition, the sources and Docker image can't be deployed together anymore, so there might be inconsistencies, e.g. when PHP 7.3 code is run in a 7.0 image.

To solve this issue I created a volume plugin to mount an image layer as a volume into other containers. This allows me to maintain just a single application image and mount the assets into a plain nginx container. It works, but has some limitations, most notably that the image is not associated with the running container and thus can be deleted at any time.
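The limitation mentioned here comes down to missing reference tracking between mounted images and the containers using them. The toy Python sketch below shows what such an association could look like. `ImageStore` and `ImageInUseError` are invented for illustration and do not describe the plugin's code or Docker's internals.

```python
class ImageInUseError(Exception):
    """Raised when an image still backs a mount in some container."""


class ImageStore:
    """Toy image store that tracks which containers mount which image."""

    def __init__(self) -> None:
        self._users = {}  # image name -> set of container names mounting it

    def mount(self, image: str, container: str) -> None:
        self._users.setdefault(image, set()).add(container)

    def unmount(self, image: str, container: str) -> None:
        self._users.get(image, set()).discard(container)

    def remove(self, image: str) -> None:
        users = self._users.get(image)
        if users:
            raise ImageInUseError(f"{image} is mounted by {sorted(users)}")
        self._users.pop(image, None)


store = ImageStore()
store.mount("example/assets:1.0", "nginx")
try:
    store.remove("example/assets:1.0")
except ImageInUseError as err:
    print(err)  # removal is refused while the mount exists

store.unmount("example/assets:1.0", "nginx")
store.remove("example/assets:1.0")  # succeeds once nothing mounts the image
```

A built-in image mount type could record this association in the engine, which a volume plugin working from outside cannot do.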

Another use case for this plugin is to mount configuration images created from scratch. For example, the nginx container could use custom error pages, which can easily be deployed and updated independently of nginx, i.e. without interfering with its update schedule, including security fixes.

@shivanisarthi

Is the problem still there?

@thaJeztah
Member

Yes, this has not yet been implemented (but contributions welcome!)

@actualben

I would love this for adding debugging tools to an existing container at runtime, such as adding busybox to a single-binary container. I normally do this with named volumes, but this would provide a sort of "globally named volume".

@mikesir87
Contributor

Chiming in on this with other use cases… I’ve had several customer calls lately where folks are interested in distributing large files (various datasets, squashfs images, etc.). Using OCI artifacts/images makes distribution and caching easy, but getting the data into containers is challenging.

Sure, you could either do a multi-stage build to combine the image with the large files or populate a volume. But, you’re also then using that much more disk space with the copy. Being able to mount it directly as a read-only mount would be incredibly useful. I imagine as OCI artifacts start to be used more often to distribute config, etc., this use case will continue to rise.

Projects
maintainers-session
  
Awaiting triage
Development

No branches or pull requests