Proposal: mount from image #30449
Comments
Slightly related; #16079. Wondering; would there be use-cases to have a writable layer on top of that mounted image? Would that (technically) be possible? |
This is huge!
Baby steps, please. It would also be excellent if you could pluck from paths in the image:
The syntax above is bogus, but this is the killer feature. |
@thaJeztah a writeable layer would be possible as a future change, but I think there are many advantages to starting with read only. @stevvooe adding paths should be easy, as it is just a different bind from the same image. Probably would want |
Thanks, just getting future options clear, absolutely no need to have r/w initially |
Agreed. A |
In case it's of interest, I submitted a proof-of-concept patch yesterday on another issue that does something similar. Instead of mounting into a container, it would allow a Dockerfile to kick off an independent build, and then pull the final layer it created back into the current image build. Our use case is more about pulling a bunch of git code in, though the demo I put in the issue uses it as a build system. See: #14704 (comment) |
@aeijdenberg That is a very different use case than being able to mount an image. Mounting an image allows you to effectively have data images. This is very interesting for deploying projects that require a common app server, such as java, php, fastcgi or python (wsgi). It is also very interesting for distributing common configuration throughout a cluster of machines, which has been somewhat of a pain point. Combined with something like If you are interested in mounting a git repo, I do have a POC volume driver that could be whipped into shape, but I am not sure I have the bandwidth to support something like that. Please let me know. |
Hi @stevvooe , thanks for your reply. Yes, it likely is a bit of a different case - but since this issue came up a few times in my searches when I was searching for prior work, I figured there were enough keywords to make it worth putting a link in. What we're looking to do is slightly more complicated than just mounting a git repo - we do make some small changes on top of it during our image build process - which layers do feel quite well suited for. Just still figuring out the best practices for building them out efficiently. :) |
A small thing but something that bit me: Consistency isn't currently always guaranteed. Example: If you create a container image with some data exposed in a volume, when you run the container, the mtime of folders within the volume is set to when the container was run rather than when the container image was created. Logged in #17018. |
@tomfotherby I am not sure how that is at all related here. This proposal has nothing to do with volumes. |
@stevvooe - Sorry, let me explain in more depth.
Let's say we launch 5 web-servers with the data volume mounted in via So until #17018 is resolved, this ticket wouldn't be 100% useful in my use case. I understand that it's maybe an edge case and safe to ignore. |
I believe (and please correct me if I'm wrong) what you're describing is something like the behaviour of
I suspect the implementation wouldn't have the same mtime issue, because it wouldn't be using volumes. |
@dnephin - Thanks for your explanation. Yes I was talking about |
I see this proposal as quite risky. I wouldn't want anyone to use it to include any code or other dependencies required for running the image. The real issue behind this seems to be the composability problems in the builder. The use case of sharing data is clearly very useful, but it should be defined at build time. Having images in the hub that don't work unless the user specifies a specific mountpoint is very bad UX. An image should define all of its dependencies behind its digest. If using the current builder is inefficient for this problem and doesn't allow maximum data reuse, it should be fixed instead. One (extreme) way would be to commit the digest references of the mounts into the image config. Then, at least at the low level, this would be equivalent to the current proposal. I'd like to address the configuration loading separately. It is true that this could be used to load configuration as well, although I don't see it as the main use case. First, it wouldn't be simple: the Docker image format is not designed for this, and it would be quite hard for the user to generate these "configuration-images" or make modifications to them. At the same time, the "configuration-image" would pollute registry and |
@tonistiigi I don't see how this can cause any real problems.
This is the whole point: not everything is known at build time. This is not a build problem. Deployment is an emergent configuration of build-time artifacts and environment. Mounting configuration from a data image is part of that environment. Ultimately, this is a runtime dependency, likely shipped from different teams. Making both teams coordinate putting different configuration layers on top of each other is the problem (we also see this when understanding the use cases of pods). Time and time again I have seen users struggle with not wanting to push their own configuration on top of a registry image due to operational constraints, only to then struggle with distributing the content needed for a bind mount.
Making this statement would be the same as saying docker image format isn't good for anything. Generating these images would be as simple as having a copy statement into a docker image. Is there something I am missing here?
This is a minor issue. We could introduce a special type of image for this, if this is a real problem, or we could have them cat the configuration by default or even apply a templating step. This is a longstanding ask in docker and really does cover a number of use cases. I think the fact that we have had volume plugins for some time but no solutions shows the approach is fundamentally flawed. Personally, I have had nothing but problems with docker volumes and in practice they have solved nothing (people still say "don't run containers with data"). If this were possible with volumes, we would already see plugins that exist, but they aren't there (I have a git implementation, but where are the others?). We can do better. These immutable mounts can share a massive piece of infrastructure with very little extra work. In fact, we already have the build, the distribution infrastructure, the file format and the command structure in mounts. Let's leverage this and stamp out this use case. |
Let's create this convention:
|
@stevvooe So the mount wouldn't work if it isn't based on that image? |
I guess it would be useful for non-config data (e.g. large scientific data split from the program container). How about |
I don't think we need that limitation. I was just thinking of a convention that we could use to differentiate. I'm just not understanding why plucking something out of the rootfs out of an image is such a problem. |
FYI: now we have |
I think copying files to a volume would make sense. A use case could be a multi-stage build that checks out code and builds it into an image; just like the COPY --from=build src dest functionality, I should be able to do something like:
Or in a docker-compose file |
I am all for this proposal. What are the next steps? |
To me, the important part is to avoid copying this data out of an image and instead mount it directly into a container. Once an image is built and has files I want to expose to another container, I would prefer to have only one copy of the data around. |
FWIW; |
@thaJeztah True, but that could also be achieved via multistage build. |
@ChristianKniep sorry, perhaps my comment was unclear; I meant to say "mounting from an image is not unprecedented; we now allow it in |
I came across the same issue when I tried to deploy a PHP application with nginx, as nginx should serve assets directly, while all other requests get passed to the PHP container. I thought about the following options:
To solve this issue I created a volume plugin to mount an image layer as volume into other containers. This allows me to maintain just a single application image and mount the assets into a plain Another use case for this plugin is to mount configuration images created from |
Is the problem still there? |
Yes, this has not yet been implemented (but contributions welcome!) |
I would love this for adding debugging tools to an existing container at runtime, such as adding busybox to a single-binary container. I normally do this with named volumes, but this would provide a sort of "globally named volume". |
Chiming in on this with other use cases… I’ve had several customer calls lately where folks are interested in distributing large files (various datasets, squashfs images, etc.). Using OCI artifacts/images makes distribution and caching easy, but getting the data into containers is challenging. Sure, you could either do a multi-stage build to combine the image with the large files or populate a volume. But, you’re also then using that much more disk space with the copy. Being able to mount it directly as a read-only mount would be incredibly useful. I imagine as OCI artifacts start to be used more often to distribute config, etc., this use case will continue to rise. |
There are many cases where it is useful to mount an immutable data volume into a container, with configuration or static data (or code, it could be an apt repo or whatever).
I am proposing we add a new mount type so you can do
docker service create --mount type=image,src=alpine:3.5,dest=/alpine ...
(or whatever --mount is renamed to). It would also be supported in compose files/bundles.
These would fetch the image from the repository if necessary, allowing tags or hashes to be specified, and mount it at the specified mountpoint. This would always be read only. Docker would unpack the image and mount it, presumably using layers but this would be an implementation detail, as being read only this would not be visible.
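To make the shape of the proposed option concrete, here is a minimal Python sketch of how a spec like type=image,src=alpine:3.5,dest=/alpine could be parsed and validated. It is illustrative only: the ImageMount class and the parse_image_mount function are hypothetical names, not part of Docker, and the only rules taken from the proposal are the three keys shown above and the "always read only" behaviour.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageMount:
    """Hypothetical in-memory form of a type=image mount request."""
    source: str             # image reference, e.g. "alpine:3.5"
    target: str             # mount point inside the container
    read_only: bool = True  # the proposal says image mounts are always read only


def parse_image_mount(spec: str) -> ImageMount:
    """Parse a comma-separated key=value spec such as
    "type=image,src=alpine:3.5,dest=/alpine"."""
    fields = {}
    for part in spec.split(","):
        key, sep, value = part.partition("=")
        key, value = key.strip(), value.strip()
        if not sep or not key or not value:
            raise ValueError(f"malformed field: {part!r}")
        fields[key] = value

    if fields.get("type") != "image":
        raise ValueError("not an image mount")
    if "src" not in fields or "dest" not in fields:
        raise ValueError("image mounts need both src and dest")
    if not fields["dest"].startswith("/"):
        # Assumption: mount points must be absolute container paths.
        raise ValueError("dest must be an absolute path")
    return ImageMount(source=fields["src"], target=fields["dest"])
```

For example, parse_image_mount("type=image,src=alpine:3.5,dest=/alpine") yields an ImageMount whose read_only field is True, since the sketch offers no way to request a writable mount.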
Use cases include configuration data (not secrets, but miscellaneous scripts and config), actual data, apt repos, npm repos etc. If a hash was used it would be guaranteed to be consistent across multiple tasks in a service.
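The consistency point depends on whether the source is pinned by hash or named by a mutable tag. As a rough Python sketch, a client could tell the two apart as follows. The is_pinned helper is a hypothetical name, and the check assumes the familiar name@sha256:<64 hex digits> reference form; other digest algorithms are deliberately not handled.

```python
import re

# Assumption: a pinned reference ends in "@sha256:" followed by 64 hex digits.
_DIGEST_SUFFIX = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_pinned(reference: str) -> bool:
    """Return True if the image reference names content by digest.

    A tag such as "alpine:3.5" can be re-pushed and so may resolve to
    different content on different nodes; a digest reference cannot.
    """
    return _DIGEST_SUFFIX.search(reference) is not None
```

A scheduler could use a check like this to warn when a service with several tasks mounts an image by tag, since only the digest form guarantees that every task sees the same content.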
cc @AkihiroSuda @tonistiigi @stevvooe