Proposal: Dockerfile BUILD instruction #7149
Comments
Wait, so each |
@cyphar I updated the description and commented below to answer your question. Let me know if there are still any questions.
Yes, a subdirectory of a container being built. Updated the description.
The outer build container should be unaffected by the inner ones, much like a
Yes, updated the description. |
Does that mean that the context is executed in a container with the directory of the context as the root? How do we deal with binaries in that case? If not, what will happen if someone modifies the outside directory? Is it not committed or is that undefined? |
The context directory gets tarred before running the inner build, just like I updated the description, let me know if it still isn't clear. |
What's the difference with #7115 other than renaming IN to BUILD? |
@shykes in #7115 the first argument to I will clarify the Description to point out the main difference between #7115 and this one at the top. |
I see. Thanks for clarifying. What about image naming? In your example above, what would be the resulting images? |
@shykes (see the notes at the end of the proposal) it wouldn't be different from multiple
|
I would love to see this. It would make it so much easier to run builds where the build environment is one step, and the resulting image you generate does not have to include all the build artifacts. |
I like this proposal very much. I would simplify it even more and lose the curly braces, so that you can switch to the next build step but never come back. I don't think it's a good idea to support multiple build blocks at the same level. The goal of a Dockerfile should be to build a single target image; others should just be helpers. For the tagging question, I think "last build wins tag" is a good solution. If we used the linear syntax, then the parent builder image could be auto tagged like @shykes I'd be happy to attempt an implementation of this if any of the project maintainers thinks this has a chance to end up upstream. |
Looks good to me (from a functional point of view; I didn't check the code). I also wonder I'm not convinced about the tagging model "last win". Maybe add an (optional) 2nd parameter to |
I'm wondering if you have examples that demonstrate the need for multiple non-nested BUILD commands inside a Dockerfile. |
@dgageot sample use case: a "build" Dockerfile to download the Oracle JDK, remove unnecessary stuff (docs, demos, src), then bootstrap a set of "production" Dockerfiles for various Linux distros. For sure this can also be addressed using branches. Anyway, I'd prefer that the syntax not enforce a restriction we might find annoying later, unless there are technical constraints that require it. |
@ndeloof To be really useful your scenario would need to tag every image, not just the last one. Wouldn't it? |
Yes, hence my request for a 2nd argument to define the tag suffix, so I could |
@ndeloof Hi. Hope you've seen #8021. Is there any use of shrinking the images in your use case or are you just mainly looking for a way to build multiple services using a single Dockerfile? The layer cache across Dockerfiles and including your own (project) base images still work with this proposal. |
The proposed design lacks a complete interface element. An interface provides a coupling point at the boundary layer that separates the internals of a function, its implementation, from the surrounding external invocation environment. At this coupling point, an interface features a mechanism to bind one or more external arguments to a corresponding set of variables internal to the function. This binding mechanism allows external argument names to be properly correlated and values transferred, even if their names are different, to their internal counterparts. Without this binding mechanism, all instances of function invocations must synchronize (couple) argument names/instances to the variables internal to the function. It’s this correlated binding mechanism that’s missing, resulting in harmful coupling and its various effects.
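To make the binding idea above concrete, here is a minimal Python sketch. It is not part of the proposal or of any Docker API: `bind_arguments`, the `bindings` mapping, and the names `ORACLE_JDK_URL` and `jdk_url` are hypothetical and chosen only for illustration. It shows a step declaring which external names map to its internal names, so that nothing else crosses the boundary.

```python
# Hypothetical illustration only: none of these names exist in Docker.

def bind_arguments(bindings, external_args):
    """Map caller-side argument names to the names a build step uses internally.

    bindings:      {internal_name: external_name}
    external_args: {external_name: value}
    """
    missing = [ext for ext in bindings.values() if ext not in external_args]
    if missing:
        raise KeyError(f"unbound external arguments: {missing}")
    # Only explicitly bound values cross the boundary; everything else stays outside.
    return {internal: external_args[external] for internal, external in bindings.items()}


caller_env = {"ORACLE_JDK_URL": "https://example.invalid/jdk.tar.gz", "UNRELATED": "x"}
step_inputs = bind_arguments({"jdk_url": "ORACLE_JDK_URL"}, caller_env)
print(step_inputs)
```

With such a binding, the caller and the step can rename their variables independently, which is the decoupling the comment argues for.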
Suggest review of this proposal: 8660. |
Any update on any solution for BUILD or similar? |
Seems no BUILD instruction or nested build implementation is coming any time soon. Would make source based containers and minimal containers much easier to build and even test. |
@tiborvass I like @tonistiigi's implementation in #8021 (without the nesting) better. I'd say that the ENV is not inherited, ENV are constant anyway so if you need them in a later |
@proppy yes, I keep his implementation in mind. I really like it. I'll try to push it on. |
@proppy would you mind updating your proposal to match his implementation? That way we'll be able to approve it here. |
@tiborvass I'm happy to update the proposal to match #8021, would that help to move things forwards? |
@proppy jynx :) |
Another thing to think about, is what should |
@tiborvass Done, re: intermediate image there is a note about that
|
/ping @tonistiigi can you proofread the updated version and make sure I didn't misrepresent your implementation? |
This one confuses me a bit. You can never go back to the parent build, so there is no way to change anything in
Parent tagging is also different but I'm ok with leaving this out until there is an idea that everybody can agree upon. |
@tonistiigi you're right, that was a leftover from the previous proposal. Updated the description.
Added: "if instructions follow the BUILD command and path/to/Dockerfile is set or /Dockerfile exists, the build fails."
I think we could keep it simple and dodge parent tagging for now? It's not core to the proposal and can always be introduced later, what do you think? |
I agree. |
Given one of my projects is doing a "building everything from source" kind of thing, I have some ideas I wanted to get out.
First I like the general notion of nested builds. Furthermore using predefined dockerfiles for a given nested build simplifies code deduplication and if that's possible would make pluggable docker compositions a much easier thing to do.
So instead of injecting versions into an abstracted build, one would just version the actual dockerfiles. This would enable more reproducible builds and make upgrades easier. On the other hand, this pushes versions into the dockerfile. To update to a new version, one doesn't just have to bump the version number, but has to create a new dockerfile (code duplication probably) and reference this in the nested build. Minimal images built completely from source, without the need of packagers (only dockerers), here we come. |
Strictly from a syntax perspective, {}'s will probably be more complex to support, given that } could be a valid thing to appear in a command. I think doing something like:
would be less disruptive and less chance of causing issues. |
Note that the updated proposal based on the #8021 implementation doesn't really support 'nested build', but rather 'chained build', i.e. the new |
Chained Build's entangled coupling creates artificial dependencies between links (build steps) in the chain, resulting in the following build time deficiencies when compared to better encapsulated solutions:
|
A Nested/Chained Build inescapably propagates an existing vulnerability that can be exploited directly/indirectly by malware to pilfer secret/confidential/sensitive content required to configure an image. Secrets, such as a software license key or an unlocked binary required to construct or be incorporated into the resultant image, that are provided by the initial build context or later added to a successor one become a resource available through a step's build context. Since a given step's nested/chained build context must include what's required by the current step and convey artifacts already specified/generated for all subsequent build steps, the context provided to any given intermediate step typically represents a superset of the artifacts required to satisfy its needs. Therefore, this entire superset of resources becomes readable to any process running within an intermediate step. For example, if a third build step requires a secret key supplied by the initiating build context, this key must be conveyed via the build contexts of the first and second build steps, revealing the key's value to processes executing in these steps.
Combine this required transfer with the semantics of "adaptive operators", static coding forms that automatically/naturally extend themselves, like the Dockerfile "ADD . /gopath/src/app/" (see google / golang-runtime @proppy), which silently and spontaneously extends itself to include all resources defined by the build context, and one can imagine the current build context being copied to a remote adversarial host intentionally, or unintentionally when an expected payload inadvertently includes some/all of the superset resources. Again, Nested/Chained Build's weak encapsulation and tight coupling prevent its currently proposed implementation from presenting a step with the minimal interface/attack surface required to perform its objective.
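As a rough illustration of what handing a step only the resources it needs could look like, the Python sketch below builds a separate tar context per step from an explicit allow-list. This is an assumption-laden sketch, not the mechanism of any existing proposal: `build_context_tar` and `list_members` are hypothetical helpers, `main.go` is an invented file name, and the step numbering merely mirrors the three-step example above.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def build_context_tar(root, allowed):
    """Return tar bytes containing only the explicitly listed files under root."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for rel in allowed:
            tar.add(str(Path(root) / rel), arcname=rel)
    return buf.getvalue()


def list_members(tar_bytes):
    """List the member names of an in-memory tar archive."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as tar:
        return sorted(tar.getnames())


with tempfile.TemporaryDirectory() as root:
    # Stand-in for an initial build context that includes a secret.
    for name in ("Dockerfile", "main.go", "SecretKey.txt"):
        (Path(root) / name).write_text("placeholder\n")

    # Step one is handed the sources only; step three is handed the key only.
    step_one_names = list_members(build_context_tar(root, ["Dockerfile", "main.go"]))
    step_three_names = list_members(build_context_tar(root, ["SecretKey.txt"]))

print(step_one_names)
print(step_three_names)
```

In this arrangement the first step never receives `SecretKey.txt`, so an "ADD ." inside it has nothing secret to pick up.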
Again, a more strongly encapsulated and weakly coupled solution that offers a mechanism to explicitly enumerate the resources provided to a particular build step must be implemented to avoid the exploit.
Exploit Based on Vulnerability
What's provided below is a simple, working exploit involving a secret key included in an image's build context. Although this section focuses on illustrating a security exploit based on the vulnerability, this vulnerability also affects an image's build or runtime stability, as the build context will, and the runtime context might, contain additional, unexpected artifacts that may cause a process to fail that would have otherwise succeeded.
Base Image Exploit
Create a base image incorporating a trigger to copy the build context via an "adaptive operator", in this case Dockerfile "ADD .", to a directory within the image's file system. A subsequent trigger statement then executes a transfer mechanism, in this instance implemented as rsync, to deliver the entire build context to a remote server (bfosberry/rsync). "compile.sh" obfuscates the rsync transfer mechanism. As indicated above, the motivation for the transfer can be either malicious or unintentional. For example, compile.sh may represent a benignly encoded process that transfers "source code" to a remote compiler which generates output executables that are returned and included in the resulting image. However, regardless of the motivation, this encoding exposes the secret key.
Build Context/ Base Image Dockerfile
Derived Image
Inherits from the base image constructed above.
Build Context/ Derived Image's Dockerfile
Build the derived image. Notice the generated rsync messages include copying the secret key and Dockerfile in addition to the go source code. The secret is now public. Also, SecretKey.txt and Dockerfile artifacts will participate in the compilation process. If considered by the compiler, these files may result in the artificial failure of what would have been a successful compile.
A simple redirection of rsync output further obfuscates the exploit by eliminating the rsync transfer messages, thereby, minimizing its detection.
As alluded to above, the vulnerability already exists for Dockerfiles, especially those implementing multiple FROM statements, as no mechanism currently exists to limit any single FROM image's consumption of the entire initial build context. However, the vulnerability can currently be mitigated by
|
TL;DR: BUILD /rootfs/path/to/context == RUN docker build /rootfs/path/to/context
This was originally a fork of proposal #7115, as I didn't want to pollute the main discussion.
Suggested changes were:
IN scopes and populates the rootfs of the inner build, and the context is shared w/ the outer build. With this proposal you explicitly pass a new context to the inner build, and you have to ADD or COPY files from the context to the image, just like a regular docker build.
Since then @tonistiigi made a simpler PoC in #8021; this proposal was then updated to match the PR.
Proposal
Most applications that build from source have a disjoint set of build and runtime dependencies/requirements. For convenience, and in order to benefit from automated builds, most images bundle both.
A common example is a Go application, where the Go toolchain and possibly a C compiler are necessary to build the app, but busybox is totally fine to run it.
Another example is a Java application, where the JDK is necessary to build the app, but the JRE is enough to run it.
This proposal derives the main ideas from #7115 and #8021 to introduce a simple syntax for allowing chained builds with explicit contexts. The intent is to generalize and consolidate workarounds introduced w/ #5715 (docker run builder | docker build -t runner) into the main Dockerfile syntax.
New Dockerfile keyword BUILD
Usage:
Description:
BUILD triggers a new docker build job using a given context directory (coming from the rootfs of the image being built), followed by new Dockerfile instructions.
The Dockerfile of the new build job is taken from, in that order: path/to/Dockerfile if set, the Dockerfile at the root of the new <context> if present, or the instructions following the BUILD command. If instructions follow the BUILD command and path/to/Dockerfile is set or <context>/Dockerfile exists, the build fails.
The new build starts with a fresh rootfs populated from the inner FROM instruction and a fresh environment, and replaces the current build job.
It is effectively equivalent to being able to call RUN docker build /path/to/context in a Dockerfile.
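The Dockerfile lookup order in the description can be sketched in Python as follows. This is only a reading of the rules stated above, not the code from #8021: `resolve_dockerfile` and `BuildError` are hypothetical names, the explicit path is returned as given because the proposal does not say what it is relative to, and failing when nothing at all is supplied is an assumption of this sketch.

```python
import os
import tempfile


class BuildError(Exception):
    """Raised when the chained build cannot pick a Dockerfile source."""


def resolve_dockerfile(context_dir, dockerfile_path=None, inline_instructions=None):
    """Pick the Dockerfile source for a chained build.

    Returns ("file", path) or ("inline", instructions).
    """
    default_path = os.path.join(context_dir, "Dockerfile")
    has_default = os.path.isfile(default_path)

    # Inline instructions conflict with any file-based Dockerfile.
    if inline_instructions and (dockerfile_path or has_default):
        raise BuildError("instructions follow BUILD but a Dockerfile is also provided")
    if dockerfile_path:
        return ("file", dockerfile_path)
    if has_default:
        return ("file", default_path)
    if inline_instructions:
        return ("inline", inline_instructions)
    # Assumption: the proposal does not cover this case explicitly.
    raise BuildError("no Dockerfile and no instructions for the chained build")


with tempfile.TemporaryDirectory() as ctx:
    # Empty context: the instructions following BUILD are used.
    inline_choice = resolve_dockerfile(ctx, inline_instructions=["FROM busybox"])

    # Once <context>/Dockerfile exists, it is used, and inline instructions are rejected.
    open(os.path.join(ctx, "Dockerfile"), "w").close()
    file_kind = resolve_dockerfile(ctx)[0]
    try:
        resolve_dockerfile(ctx, inline_instructions=["FROM busybox"])
        conflict_rejected = False
    except BuildError:
        conflict_rejected = True
```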
Example:
Note that:
- docker build -t would tag the last chained build defined at the end of the Dockerfile until there is better support for naming distinct images in a Dockerfile (the problem already exists today with multiple FROM)
- /src/build after the BUILD block won't be represented in the chained image
- BUILDs can't make changes to the previous directory being passed as the context, much like a regular build can't mutate the context: the context gets tarred like a regular docker build today and the inner BUILD operates on a copy.
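The last note, that the inner build works on a tarred copy of the context, can be pictured with a small Python sketch using the standard `tarfile` module. It only mimics the copy semantics and does not call Docker; `snapshot_context` and `app.txt` are hypothetical names used for illustration.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def snapshot_context(context_dir):
    """Archive the context directory into an in-memory tar, as a build client would."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        tar.add(str(context_dir), arcname=".")
    buf.seek(0)
    return buf


with tempfile.TemporaryDirectory() as outer, tempfile.TemporaryDirectory() as inner:
    original = Path(outer) / "app.txt"
    original.write_text("built by outer step\n")

    # The inner build only ever sees the extracted copy of the snapshot.
    with tarfile.open(fileobj=snapshot_context(outer)) as tar:
        tar.extractall(inner)
    (Path(inner) / "app.txt").write_text("changed by inner step\n")

    outer_content = original.read_text()
    inner_content = (Path(inner) / "app.txt").read_text()

print(repr(outer_content))
print(repr(inner_content))
```

Whatever the inner step writes lands in its own copy, so the directory that served as the context keeps its original content.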