Proposal: Incorporate docker run IMAGE into BUILD #8660
Comments
This proposal is amazingly detailed, so thanks for that -- it is a model for proposals. That said, I'm not sure it's a good fit for the builder. Builder is extremely declarative and simple, deliberately so. Having switches and the notion of a function really detracts from the goals here. However, I'm not the final decider on such topics (that would be @shykes), and there's plenty of opportunity for the community to convince us as well.
/cc @tiborvass |
Thank you for your kind assessment concerning the detail of my proposal! Before replying in depth to your post, I would like to ensure my understanding of Builder’s declarative nature and its goals. An attempt to discover them through a Google search for “docker builder declarative goal”, as well as a review of a few files in the Builder GitHub repository, was unsuccessful. Therefore, a link to these definitions, or a statement of them, would be appreciated to inform the conversation for others and myself. I’ve also examined Dockerfile syntax and its execution by Builder to unearth its declarative nature and would appreciate feedback regarding the observations below.
Computational Dependency
In several declarative languages, operators may execute in an arbitrary order unless a computational dependency dictates a specific sequence. When a dependency becomes necessary, it’s typically expressed through a coupling mechanism encoded (implemented) via syntax. For example, in 5 * sin(90), the multiplication and sin operators are syntactically coupled to one another, such that sin(90) must be (deterministically) calculated before computing the result of the multiplication operator. However, in a Dockerfile, it seems the adjacency/ordering of a command in the file determines the computational dependency, such that, given a command (C1), its successor, the one that follows it (C2=C1+1), is always dependent on the given command (C1). In other words, a successive statement deterministically couples to the one that precedes it. This is similar to the pipeline notion of a monad.
Since Dockerfile statement ordering conveys deterministic intention, Ex: 1 would be considered different from Ex: 2, although order in this case doesn’t affect the outcome:
Ex: 1
Ex: 2
However, reordering the following commands of Ex: 3 will affect the targeted image of the initiating docker build command: Ex: 3
Although statement reordering changes the outcome of Ex: 3, this different outcome isn’t considered a side effect, because the dependency between the statements, which is encoded through their ordering within a Dockerfile, would also be (deterministically) reordered. In other words, as currently encoded above, ADD ContextFile1 ImageFile1 will always (deterministically) be executed before ADD ContextFile2 ImageFile1. If the statements were exchanged, ADD ContextFile2 ImageFile1 would always (deterministically) be executed before ADD ContextFile1 ImageFile1.
The following example, a rewrite of Ex: 3, employs syntactic coupling to express an equivalent encoding of a Dockerfile. In this example, Dockerfile commands are depicted as terms and parentheses denote precedence.
Ex: 4
Is a successive statement deterministically coupled to the one that precedes it?
Argument: File System & Image Metadata
In addition to conveying the notion of dependency, the coupling mechanism delivers the desired thing, a variable’s value, as an argument (input) to the next computation. Referring to the 5 * sin(90) computation above, the term sin(90) generates the variable value (1) that will be passed as an input to the multiplication computation. If, as asked above, statements are coupled according to their ordering, then so too is the delivery of a variable value. In a Dockerfile, the variable value can be considered an aggregate of an intermediate image’s file system and its metadata, the metadata being the values available through the docker inspect command. Although it’s not apparent when viewing syntax, each Dockerfile command following the first FROM receives this committed aggregate value as an argument (input). Similar to the examples above, Dockerfile syntax has been extended to convey the delivery of a variable value in a syntactic form for its FROM and ADD commands:
Although not syntactically encoded as an argument, is the file system and image metadata successively passed as an argument to the next statement?
Avoiding Side Effects: Pure Functions & Single Assignment Form
Nearly all Dockerfile commands are considered pure functions because the side effects within the imperative Go implementation language have no observable effect on Dockerfile state, their semantics adhere to a single assignment form (eliminating destructive reassignments), and the execution of a commit between each command ensures atomicity, preventing unintentional overlapping state changes. True?
The Exceptional RUN
RUN deviates from other Dockerfile commands, as it executes one or more arbitrary Linux commands/scripts without the protective benefit of any of the mechanisms mentioned above. For example, destructive reassignments can occur not only between Linux commands, due to the absence of a commit, but also within the execution of a specific command, as all commands read and write to the same image file system state. True?
|
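The ordering dependency raised in the comment above can be sketched as a fold over the statement list, where each statement receives the previous intermediate image as its input and returns a new one. This is only an illustrative model, not how Builder is implemented: the dictionary standing in for an image, and the `add` and `build` helpers, are hypothetical names invented for this sketch. It uses the two ADD statements quoted for Ex: 3.

```python
from functools import reduce


def add(src, dest):
    """Build a hypothetical ADD step: copy context file `src` to image path `dest`."""
    def step(image, context):
        files = dict(image["files"])  # copy, so the previous intermediate image is untouched
        files[dest] = context[src]
        history = image["history"] + (f"ADD {src} {dest}",)
        return {"files": files, "history": history}
    return step


def build(steps, context):
    """Fold the ordered steps over an empty image; each step's output feeds the next."""
    empty = {"files": {}, "history": ()}
    return reduce(lambda image, step: step(image, context), steps, empty)


# Assumed context contents, chosen only so the two files are distinguishable.
context = {"ContextFile1": "contents of file 1", "ContextFile2": "contents of file 2"}
ex3 = [add("ContextFile1", "ImageFile1"), add("ContextFile2", "ImageFile1")]

as_written = build(ex3, context)
reordered = build(list(reversed(ex3)), context)

print(as_written["files"]["ImageFile1"])  # contents of file 2
print(reordered["files"]["ImageFile1"])   # contents of file 1
```

Under this model, exchanging the statements changes the result, yet each ordering always produces the same image, which matches the claim that the different outcome is a reordered dependency and not a side effect.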
Reply to Concerns
The reply below responds to @erikh's concerns and provides an example that contrasts this proposal's 'Function Idiom' with another solution called 'Nested Build' #7149, in an effort to demonstrate the technical merits of the Function Idiom proposal more concretely. Without feedback explaining Docker's notion of “extremely declarative”, a concept called 'declarative benefit' is defined below to provide some guidelines for judging how declarative a mechanism is. Using this Declarative Programming definition and its associated links, the term 'declarative benefit' measures the quality/fidelity of a language mechanism to:
Contrary to the assertion (“…a function really detracts from the goals here”) that incorporating functions would oppose Builder’s goal to achieve “extremely declarative” semantics, functions can significantly improve Builder’s declarative benefit.
Contrary to the assertion of Builder being “extremely declarative” its supported Dockerfile commands:
In addition to solving the separation of Build vs. Runtime concerns, incorporating functions would:
It’s imperative that concepts/mechanisms, presented by a proposal, be first examined for their technical merit before considering implementation concerns, such as the syntax used to explain the mechanism or the difficulty in adapting the current code base.
Compare Function Idiom to Nested/Chained Build
Simple Dockerfile Example
What follows is a simple Nested Build example taken from #8021, which will be used to contrast the approach proposed by #7149, as implemented by #8021, with the Function Idiom.
Parent build step can be also used for dynamic Dockerfile creation. In this case BUILD instruction needs to be the final instruction in the Dockerfile.
Number of build steps isn't limited. Last image gets the tag.
Below is the same Nested Build example, encoded using the Function Idiom approach.
Substantive Dockerfile Example
Purpose:
To present a simple but more realistic Dockerfile example involving an existing Docker Hub image that transforms one or more input files into a dependent output file. The example consists of an initial Go application that decides which of two competing strategies to execute when solving a problem. The competing strategies are also written in Go. All Go programs are linked as static executables. The selected Docker Hub image, "google/golang", executes a Go compiler request converting source file(s) to a dependent executable. The resultant image's Dockerfile reflects the task of generating the executables and adding them to the minimal "scratch" image.
Build Context:
./goCompileStatic.sh
|
Best proposal ever. You win the Internet. |
Thanks @bketelsen for your encouraging review of the proposal! I'm hoping the community's support and the proposal's technical merit will convince core maintainers of its utility to improve a Dockerfile's ability to address a range of build concerns. |
Adoption of this proposal would negate the necessity for/complement the following proposed features:
|
Hello! Mainly:
Then from there, patches/features like this can be re-thought. Hope you can understand. |
Description
The processes, environments, and resources required to construct artifacts that compose a desired image are typically irrelevant to its runtime. For example, the tool chain: language compilers, their libraries, …, employed to build an application, when incorporated into an image for delivery, encumber the transport of the resultant image and potentially the execution of its derivative container(s). To avoid image pollution and facilitate delivery of a minimally sized one, the Docker BUILD environment must properly separate (isolate) the concerns of image construction from that of its runtime.
This proposal essentially recommends incorporating the docker run IMAGE command to impart to the BUILD environment the same benefits of containment (isolation/encapsulation), performance, and reusability that have contributed to Docker’s success. It does so by remapping the concepts/implementation of the docker run IMAGE command to a function idiom represented by a new Dockerfile operator called “FUN” (FUNction run) that executes an already known image. A discussion of its benefits, a sketch of syntax, an overview of semantics, and an example are provided below.
In addition to the new FUN operator, this proposal introduces another one: DEF FUN, to declare/define the body of a transient image (function) within a Dockerfile. The body of a transient image is the set of Dockerfile commands (conceptually a Dockerfile within a Dockerfile) needed to construct it.
Benefits
The function idiom includes an interface definition that provides a coupling point at the boundary layer separating the internals of a function, its implementation, from the surrounding external invocation environment. At this coupling point, an interface features a mechanism to bind one or more external arguments to a corresponding set of variables internal to the function. This binding mechanism allows external argument names to be properly correlated, even if their names are different, to their internal counterparts, thereby, eliminating the need to synchronize argument names to mirror the variable names internal to a function and avoid binding to a function’s (or invocation environment's) implementation. Finally, since this binding mechanism occurs at each function invocation, it encourages function reuse, as the same function body can be called at various locations with differing argument names and values.
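The binding mechanism described above can be illustrated with a small sketch. It assumes positional binding, which the proposal does not state explicitly, and every name in it (`bind`, `compile_app`, `SRC_DIR`, `OUT_FILE`, and the example paths) is hypothetical and chosen only for illustration.

```python
def bind(parameters, arguments):
    """Pair each caller-side argument with the function's internal variable name.

    Assumption: binding is positional, so the caller never needs to know
    (or mirror) the names used inside the function body.
    """
    if len(parameters) != len(arguments):
        raise ValueError("argument count does not match the function's interface")
    return dict(zip(parameters, arguments))


# Interface of a hypothetical compile function: names internal to its body.
COMPILE_INTERFACE = ("SRC_DIR", "OUT_FILE")


def compile_app(env):
    """Stand-in for the function body; it only reads its own internal names."""
    return f"compile {env['SRC_DIR']} -> {env['OUT_FILE']}"


# The same body is invoked twice with different external values.
first = compile_app(bind(COMPILE_INTERFACE, ("./server", "/bin/server")))
second = compile_app(bind(COMPILE_INTERFACE, ("./worker", "/bin/worker")))

print(first)   # compile ./server -> /bin/server
print(second)  # compile ./worker -> /bin/worker
```

Because binding happens at each invocation, the body stays unchanged while the call sites vary, which is the reuse property the paragraph above describes.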
Syntax
The syntax presented below provides a means to explore concepts. For example, words beginning with '--', like --CONTEXT, reflect keywords whose final form remains undecided.
FUN
[--CONTEXT { [:]
[[:]... ]
| [--FROM_IMAGE :[]
[[:]... ] } ]
[ { --IN [ ]...
| --IN_IMAGE
[ ]... } ]
[--OUT [ ]... ]
{--NOCOMMAND | [] []}
: see docker run IMAGE command
: The files provided by the PATH or URL supplied by the initiating BUILD command.
: A build context assembled from and/or files available from the . This assembled context conforms to the interface expected by/expose to the image when performing BUILD processing.
: File paths/environment variables to be resolved within the function's ('s) body. Although, a Developer could specify a value instead of an environment variable name, avoiding harmful coupling to a function's implementation requires a level of indirection and an associated resolution process which the Dockerfile ENV provides. The use of ENV also offers a method to minimally document the function's interface via the docker inspect command.
: File paths/environment variables to be resolved within the context of the image being built at the moment of invocation.
: see docker run IMAGE command
: see docker run IMAGE command
Semantics
FUN's behavior presented using Dockerfile/docker operations when possible.
The example is intended to convey the proposed FUN semantics leveraging the experience of familiar commands. It's not a definitive implementation spec. Here's a written description of FUN's invocation:
Example
Given: An image called “appCompile” already created by the following Dockerfile:
Create the “app” image via this second Dockerfile:
Additional more substantive Docker Hub example using google/golang image. Example also contrasts Function Idiom approach to Nested/Chained Build.
Description
In addition to the FUN operator described above, the proposal would also include a mechanism to permit the construction of supporting transient functions (images) within a Dockerfile. The mechanism is similar to an inline function declaration which emerged during this discussion with Alexander Larsson.
Benefits
Syntax
: See docker run IMAGE command. Typically, it will be a human readable label reflective of the function’s primary responsibility using the :[] form. When using :[] form, assumes the default of “latest”. However, could also assume any other valid image label, like a short/long GUID.
Semantics
DEF FUN declares the start of a function (image) definition. When recognized, the current BUILD process writes the commands to a cached file until it detects the matching END FUN. The is placed into a function resolution table maintained by the current BUILD process. Whenever an image name requires resolution, to satisfy either a FUN or FROM operation, the resolution process first reviews the current local function resolution table for the given name. It spawns a child build process and passes the assembled by the initiating function invocation to this child. Once the child build process completes, the function (image), situated in the parent, is executed. The image generated by the child build process can be cached to satisfy future requests initiated by the same parent or a spawned (child) level. In situations where two functions share the same , the definition nearest to the FUN operator will be used. Local inline function definitions override any external function (image) that share the same .
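The resolution order described above can be sketched as a chain of lookup tables. This is a rough model under stated assumptions, not an implementation spec: it interprets "nearest definition" as the innermost build level, treats the parent/child build processes as nested scopes, and uses a hypothetical `FunctionTable` class and placeholder strings for function bodies and images.

```python
class FunctionTable:
    """Hypothetical per-build table of inline DEF FUN definitions."""

    def __init__(self, parent=None):
        self.parent = parent      # table of the build process that spawned this one
        self.definitions = {}     # function (image) name -> cached definition

    def define(self, name, body):
        self.definitions[name] = body

    def resolve(self, name, external_images):
        """Check inline definitions from the nearest level outward, then external images."""
        table = self
        while table is not None:
            if name in table.definitions:
                return ("inline", table.definitions[name])
            table = table.parent
        if name in external_images:
            return ("external", external_images[name])
        raise LookupError(f"no function or image named {name!r}")


# Placeholder external images; the values are illustrative strings only.
external = {"appCompile": "image pulled from a registry", "google/golang": "hub image"}

parent = FunctionTable()
parent.define("appCompile", "definition in parent Dockerfile")

child = FunctionTable(parent=parent)
child.define("appCompile", "definition in child Dockerfile")

print(child.resolve("appCompile", external))      # inline, child definition wins
print(parent.resolve("appCompile", external))     # inline, overrides the external image
print(child.resolve("google/golang", external))   # external, no inline definition
```

The sketch omits the child build process that would actually construct the image from the cached definition; it covers only which definition is selected.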
Example
Previous example rewritten to employ DEF FUN aggregating the two distinct Dockerfiles into a composite one:
Additional DEF FUN example contrasting Function Idiom approach to Nested/Chained Build.