Proposal: Nested builds #7115
Comments
I asked if we could invert the syntax and achieve the same function, and after lots of IRC discussion I think the answer is "not really". This proposal has some interesting possible effects that we should list:
some of these may be bad, some may just need more info in the proposal :) |
Hm, @shykes's version makes more technical sense, whereas @SvenDowideit's version seems more logical. I'm +1 for @SvenDowideit's version. |
+1 Having the ability to inject build/test dependencies and discard them at publishing time would simplify a lot for our docker build/release pipelines. |
It would also potentially make the final images much smaller :)
|
Can someone elaborate on where and how layer caching would fit into either use case? Given the stated goal of minimizing overall size, is the inner build cached completely as a separate container, with only the result added to the parent layer? The build process is typically the most time-consuming step, and benefits the most from caching. |
I'm not sure if the context needs to be implicitly added/bound in the inner image fs (this could maybe be introduced later and separately from this proposal). I deleted my earlier syntax change suggestion and created a separate proposal to discuss a more explicit way to bind the context, as per IRC discussion, see #7149. |
Guys I ask that you focus on criticizing the proposal instead of pushing completely different proposals in the comments. By all means create a separate issue if you have a proposal of your own! Thanks. |
@shykes, agreed, switching to constructive criticism mode.
Please specify which subset (are |
I didn't mean a subset of available instructions (all instructions should be available), but a subset of the Dockerfile content - in other words, whatever is enclosed in the curly braces. Happy to change the wording to something more clear.
The source context would be the same in all images. In other words, |
@shykes, thanks. I suggest adding this to your original proposal description, as those were the first questions I had while reading it. |
What happens if a file exists in both the anchored directory and the fs of the base image used in the |
I have a possible use case for nested builds which doesn't seem to be covered (yet) by this proposal. In some cases, the information written into a Dockerfile is duplicated information from an existing build system, which could have been auto-generated instead. It would be nice if (optionally) the nested build would look for a Dockerfile at the root of the filesystem for the nested build, at that point in the build process. This means that previous steps could generate the Dockerfile and build context used to create the image. More concretely, http://www.scala-sbt.org/sbt-native-packager/DetailedTopics/docker.html#tasks shows an example where a build system can create a Dockerfile and context, ready to use with Docker. One example implementation here could be to look for a second Dockerfile if |
@shykes after looking over this proposal, it satisfies the use-case that #4933 was targeting. Also, to further this functionality, the path argument to Another topic, how will this relationship be tracked with the image metadata stored? will the |
@proppy updated |
@shykes on irc, you mentioned the possibility of having more than one Similarly, can you define what happens with multiple OH - and nesting. can I have an IN inside an IN, and how deep, and can I have a I'm curious how Can we define what happens when the I'm thinking I could use this as a build pipeline for boot2docker, with the final PUBLISHed image containing the |
I very much like (and need) this functionality. My main comment is that when I first read the Dockerfile, I didn't understand what was going on. It took me a bit to get it. So a couple comments
A final general comment is how we are going to layer the inner context. It seems that with each ADD or RUN command in the outer Dockerfile context you could be modifying the contents of /var/build. So (assuming we're bind mounting /var/build) you would need to create a new layer for the parent context and then all inner contexts for every Dockerfile directive. It seems the implementation of this could be messy. It would be cleaner to implement if we explicitly knew for each Dockerfile invocation whether it was going to modify one of the contexts. For example, the below syntax would be easier to implement IMO, but it is uglier.
|
I can see how the proposal elegantly solves the issue of complex build workflows, but don't you fear it'll be misused as a means to "sum" images? For example in:
Perhaps |
Honestly, I see that use as a cool bonus feature, especially since the two |
@tianon I don't think this needs to be implemented by actually copying the contents of the inner build to the outer layer. Instead, set up two rootfs directories for the outer and inner contexts and mount the inner one at /var/build. This means that if you don't publish the inner context, the resulting image will have none of the contents of the inner context, because it was bind mounted. This approach also means this feature could not be used to "sum" up a bunch of images (which is not something we want to allow). |
Just to note: I would like to be able to sum up a bunch of images. Doing so makes Docker interesting from a 'replacement for packages' perspective. It's basically making a way to turn off (or make a shared space in) the FS namespace. So @ibuildthecloud @icecrime, could you perhaps expand on your opinion, as it doesn't sound like we all have the same fear of doing it :) |
@SvenDowideit I can't say I'm totally opposed to it in general, but it is a separate topic. This proposal is to address the very real issue of separating your build and runtime environments in an elegant way. Anytime a new feature is proposed you must consider how it might be used in some unexpected way and what the impact of that would be. Allowing one to sum up a bunch of images will fundamentally change the nature of images. As you indicated, you move from an image essentially being a "full OS image" to an image being a "package." If we were to go in this direction we would need to invent new concepts and technology to describe, manage, and create images. At this point in time I don't think it would be helpful to bifurcate the nascent image ecosystem. Instead we should focus on the specific issue at hand and not on changing the nature of images. |
@SvenDowideit Don't give my opinion too much credit, I'm a beginner with Docker ;-) TBH I'm not sure I understand how the 'replacement for packages' perspective relates to images combination. I just have the impression that "how can I get both X and Y in my Docker image" is a recurring beginner question (that I've been asking myself): there's no easy way to do this today, which is probably a good thing as it encourages the "one process for one container" approach. To sum up: using |
What makes me uncomfortable with the proposal in its current form is the tight coupling between the inner instructions and the outer ones. In the example of the description, the outer And the inner instructions don't need to This creates a model where the set of inner Dockerfile instructions and outer ones are unlikely to be composable across images, even more so if this is combined later with something like With the existing build model this is nicely abstracted by the context notion, and some docker users already compose builds today, by chaining multiple Maybe the description could expand a little more on the methods used today, and which tradeoffs (if any) the proposal has to make to simplify and improve them. |
Also, would be really interested in seeing something like this or some variant. From my understanding, it would be very helpful for testing a layer without including artifacts from testing in the final tagged commit. |
+1 , would like this feature. really useful. |
+1 for docker multiple inheritance functionality |
👍 |
Does this https://github.com/docker/docker/blob/master/ROADMAP.md#22-dockerfile-syntax mean this proposal won't be implemented any time soon? (if ever) |
The proposal needs to be split so that each part can be discussed and closed independently. |
👍 |
Hi, I'm working on a small tool. It can build small Docker images in multiple steps. Some may find it useful. |
have a look at https://github.com/6si/shipwright |
Since we're posting utilities, I've built one to build minimal Golang images in two steps based on the scratch image |
I think multiple inheritance is a bad idea (see the diamond problem), but composable traits are a good one; I wrote on the multiple inheritance ticket how I think it could be accomplished safely syntactically. That said, glancing at this, the issue I'm interested in is syntactic sugar around temporary build layers for multiple && commands, for example this nasty piece of code
the rpm command is actually expensive and takes a while when doing a build, so if something fails after it (while developing the image) I have to do the whole thing again. What'd be nice is a way to denote layers that are to be flattened in the final build.
or something like that, where if say the tar failed (because I typoed the path) I wouldn't necessarily have to run the curl again, while developing the file. In the final image these would just look like one layer. |
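To illustrate the pattern this comment describes (the original snippet is not reproduced here), a minimal sketch of a chained `RUN` follows. The base image, URL, and package file name are hypothetical placeholders, not taken from the comment.

```dockerfile
# Hypothetical example: image, URL, and file names are placeholders.
FROM centos:7

# A single RUN keeps download, install, and cleanup in one layer,
# so the temporary file never lands in the image. The cost: the cache
# is per instruction, so a failure in any step re-runs the whole chain,
# including the slow rpm install.
RUN curl -fsSL -o /tmp/pkg.rpm https://example.com/pkg.rpm \
 && rpm -i /tmp/pkg.rpm \
 && rm -f /tmp/pkg.rpm
```

Splitting the chain into separate `RUN` lines would make each step cacheable during development, at the price of one layer per step in the final image, which is the trade-off the comment wants to avoid.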
Given that we have multistage build now, can we update the status of this "roadmap" issue? |
Thanks for the ping @AkihiroSuda. Let's close this, as #32063, which addresses this problem, is merged.
Thank you very much
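For readers landing here, this is a minimal sketch of how a multi-stage build expresses the build/runtime split discussed in this thread. The stage name `build`, the source layout, and the `app` binary are assumptions made for illustration, and the sketch assumes the binary is statically linked so that it can run on a minimal base image.

```dockerfile
# Build stage: full toolchain; none of it ends up in the final image.
FROM ubuntu AS build
RUN apt-get update && apt-get install -y build-essential
COPY . /src
# Assumes the project's Makefile writes a static binary to /src/build/app.
RUN make -C /src

# Runtime stage: only the build output is copied in.
FROM busybox
COPY --from=build /src/build/app /usr/local/bin/app
ENTRYPOINT ["/usr/local/bin/app"]
```

Only the last stage is tagged as the resulting image, so the build environment is dropped without any explicit publish step.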
Some images require not just one base image, but the contents of multiple base images to be combined as part of the build process. A common example is an image with an elaborate build environment (base image #1), but a minimal runtime environment (base image #2) on top of which is added the binary output of the build (typically a very small set of binaries and libraries, or even a single static binary). See for example "create lightweight containers with buildroot" and "create the smallest possible container".
1. New Dockerfile keywords: IN and PUBLISH
IN defines a scope in which a subset of a Dockerfile can be executed. The scope is like a new build, nested within the primary build. It is anchored in a directory of the primary build. For example:

PUBLISH changes the path of the filesystem tree to use as the root of the image at the end of the build. The default value is / (i.e. "publish the entire filesystem tree"). If it is set to e.g. /foo/bar, then the contents of /foo/bar are published as the root filesystem of the image. All filesystem contents outside of that directory are discarded at the end of the build.
Behavior of RUN
When executing a RUN command in an inner build, the runtime uses the inner build directory as the sandbox to execute the command. So for example: IN /foo { touch /hello.txt } will create /foo/hello.txt.
Behavior of ADD
When executing ADD in an inner build, the original source context does not change. In other words, ADD . /dest will always result in the same content being copied, regardless of where in the Dockerfile it is invoked. Note: the destination of the ADD will change in a nested build, since the destination path is scoped to the current inner build.
The outer build can access the inner build
Note that filesystem changes caused by the inner build are visible from the outer build. For example, /usr/local/bin was created by FROM busybox and is therefore accessible to the final RUN command in the build.
Behavior of PUBLISH
Also note that PUBLISH /var/build causes the result of the inner build (the busybox image) to be published. Everything else (including the outer Ubuntu-based build environment) is discarded and not included in the image.
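The notes above refer to an example Dockerfile. A reconstruction consistent with those notes might look like the sketch below. `IN` and `PUBLISH` are the syntax proposed in this issue, not valid Dockerfile instructions, and the package name, source layout, and `app` binary are assumptions added for illustration.

```dockerfile
# Proposed syntax only: IN and PUBLISH do not exist in Docker.
# Outer build: an Ubuntu-based build environment.
FROM ubuntu
RUN apt-get update && apt-get install -y build-essential
ADD . /src
# Assumes the Makefile writes the binary to /src/build/app.
RUN make -C /src

# Inner build, anchored at /var/build in the outer filesystem.
# Paths inside the braces are relative to that directory.
IN /var/build {
    FROM busybox
    ENTRYPOINT ["/usr/local/bin/app"]
}

# The outer build can see the inner tree, including the
# /usr/local/bin directory created by FROM busybox.
RUN cp /src/build/app /var/build/usr/local/bin/app

# Only /var/build becomes the root filesystem of the final image;
# the Ubuntu build environment is discarded.
PUBLISH /var/build
```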