Proposal: add "heredoc" notation to the Dockerfile syntax #34423

Closed
thaJeztah opened this issue Aug 7, 2017 · 45 comments
Labels
area/builder exp/expert exp/intermediate kind/feature Functionality or other elements that the project doesn't currently have. Features are new and shiny

Comments

@thaJeztah
Member

thaJeztah commented Aug 7, 2017

When writing Dockerfiles, often layering is important, and it's best practice to group shell commands in logical units (e.g., combine apt-get update and apt-get install in a single RUN ).

Unfortunately, the Dockerfile syntax can become complicated if many instructions have to be combined in a RUN instruction.

Basically, a script or set of commands has to be rewritten in most cases before it can be run from a Dockerfile.

Proposing a here doc notation for Dockerfiles

Having support for multi-line RUN instructions has been discussed in the past (for example, #1799, #1554, and #16058 (comment), possibly others), and was partly addressed by adding support for line-continuation symbols (\), later enhanced with the escape= directive to assist in writing Dockerfiles targeting Windows.

Further changes were put on hold, pending a major refactor of the builder; now that the Dockerfile syntax is no longer frozen, and there's a clearer roadmap for the builder, I'm opening this proposal to start the discussion again :)

My proposal is to add support for heredoc-style notation in the Dockerfile, similar to what's implemented in @jlhawn's Dockramp (https://github.com/jlhawn/dockramp#tokens). Having this notation makes multi-line commands (RUN, possibly extending to other Dockerfile instructions as well) easier to write and, even though heredoc is not a known concept on Windows, benefits writing Windows Dockerfiles as well.

The full definition of here documents can be found here, but I'll provide some examples below.

The basic notation is;

RUN <<[-]word
(run instructions)
word

Where

  • << marks the start of the here document
  • -, if set, strips leading tabs from the here document
  • word can be any word, and is used as delimiter
  • if word is quoted (' or "), no (variable) expansion is performed inside the here document.

To see this in action, create a shell-script containing the following;

#! /bin/bash
cat <<EOF
	# example 1
	echo $PWD
	echo \$PWD
	echo `pwd`

EOF
cat <<'EOF'
	# example 2
	echo $PWD
	echo \$PWD
	echo `pwd`

EOF
cat <<"EOF"
	# example 3
	echo $PWD
	echo \$PWD
	echo `pwd`

EOF
cat <<-EOF
	# example 4
	echo $PWD
	echo \$PWD
	echo `pwd`

EOF
cat <<-'EOF'
	# example 5
	echo $PWD
	echo \$PWD
	echo `pwd`

EOF
cat <<-"EOF"
	# example 6
	echo $PWD
	echo \$PWD
	echo `pwd`

EOF

Which produces something like:

	# example 1
	echo /Users/sebastiaan/projects/docker-proposals/heredoc
	echo $PWD
	echo /Users/sebastiaan/projects/docker-proposals/heredoc

	# example 2
	echo $PWD
	echo \$PWD
	echo `pwd`

	# example 3
	echo $PWD
	echo \$PWD
	echo `pwd`

# example 4
echo /Users/sebastiaan/projects/docker-proposals/heredoc
echo $PWD
echo /Users/sebastiaan/projects/docker-proposals/heredoc

# example 5
echo $PWD
echo \$PWD
echo `pwd`

# example 6
echo $PWD
echo \$PWD
echo `pwd`

Implementation in the Dockerfile syntax

The heredoc notation in the Dockerfile should largely follow the behavior as described above;

If word is not quoted, environment variables that are known in the builder's context are expanded (as is done today when using the shell syntax);

ENV FOO=hello
RUN <<EOL
	echo $FOO
EOL

Is expanded by the Dockerfile parser to;

ENV FOO=hello
RUN <<EOL
	echo hello
EOL

Or (in the image's configuration);

/bin/sh -c '	echo hello\n'

or in JSON format:

RUN ["/bin/sh", "-c", "	echo hello\n"]

If word is quoted, no expansion takes place, other than expansion by the shell, when executing the command:

ENV FOO=hello
RUN <<'EOL'
	echo $FOO
EOL

Produces

/bin/sh -c '	echo \$FOO\n'

or in JSON format:

RUN ["/bin/sh", "-c", "	echo $FOO\n"]

The quoted syntax can be useful for Windows Dockerfiles as well; think of:

ENV FOO=hello
RUN <<'EOL'
	dir C:\some\directory
EOL

When using the <<- syntax, all leading tabs are removed.

Limitations

Note that, due to the way the builder works;

  • only environment variables known by the builder are expanded
  • instructions, such as pwd or $(pwd) are not expanded by the builder (but will be executed by the shell)
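The two phases described above can be imitated in plain shell. This is only a rough sketch, not the builder's actual implementation: `sed` stands in for the builder substituting the one variable it knows about (`FOO`), and the names `script`, `expanded` and `result` are made up for the illustration.

```shell
# Phase 1 (stand-in for the builder): substitute only the variables the
# builder knows about. Here that is just FOO; $(...) is left untouched.
FOO=hello
script='echo $FOO from $(basename /some/dir)'
expanded=$(printf '%s\n' "$script" | sed "s/\$FOO/$FOO/g")

# Phase 2: the shell inside the build container runs the result, and only
# now is the command substitution evaluated.
result=$(sh -c "$expanded")

printf '%s\n' "$expanded"
printf '%s\n' "$result"
```

After phase 1 the command substitution is still present verbatim; it is resolved only when the shell executes the string.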

Here-doc and the escape directive

My original intent was to have here-documents ignore the escape-directive, basically, pass anything inside the here-document as-is to the shell (which could be bash, CMD.exe or PowerShell).

While this would solve many use-cases, there are some caveats;

If word is not quoted, all environment variables would be expanded; there is no way to have some environment variables expanded, and others unexpanded. For example:

ENV FOO=hello
ENV BAR=baz
RUN <<-EOL
	echo $FOO;
	echo \$BAR
EOL

Would result in;

/bin/sh -c 'echo hello; echo \baz\n'

We also need to take into account possible expansion of this syntax to Dockerfile instructions, other than just RUN (see below).

Support for other Dockerfile instructions

Although we could start with just supporting this syntax for RUN, the here-doc syntax could also be implemented for other Dockerfile instructions. Here are some examples that came up in a discussion I had with @tonistiigi;

COPY  <<EOF /dest
this is contents
EOF
ARG myscript=<<EOF
stuff
EOF

RUN $myscript
COPY $myscript /

Finally this example came up as well;

RUN <<EOF | sh
echo aa
EOF
@thaJeztah
Member Author

ping @tonistiigi @tianon @jlhawn @simonferquel @duglin PTAL

@duglin
Contributor

duglin commented Aug 7, 2017

I like the idea of being able to group a set of RUN commands together w/o needing to && things together. However, I would prefer if under the covers the builder just looked at it as a series of RUN commands wrapped by a single commit. Meaning, all of the same env var processing happens - nothing special. The only tricky part might be the cache processing - we may need to consider the entire group of RUN commands as a single entity for this purpose, but that's a detail to be worked out later.

If we really want to make this more generic we could look at grouping more than just RUN cmds, I know that would make a ton of people happy ;-)

But +1 to the concept

@yongtang
Member

yongtang commented Aug 7, 2017

👍 For the heredoc in Dockerfile. That will address many issues we encountered before. Grouping more than just RUN cmds would be even better.

@tianon
Member

tianon commented Aug 7, 2017

I love the idea of "heredoc RUN" -- this would allow for actual newlines in a single RUN line, which are currently impossible. 😄 👍

One thing I'd note is that the <<-'EOF' style only removes leading tab characters, not spaces (as an intentional feature), which is often useful for usage text; for example:

usage() {
	self="$(basename "$0")"
	cat <<-EOF
		usage: $self arg arg
		   ie: $self abc xyz foo bar
	EOF
}
usage

Whose output is:

usage: script arg arg
   ie: script abc xyz foo bar

Doing things like this would be amazing:

RUN <<-'EOF'
	set -ex
	foo
	bar
	baz
EOF

Which is way easier to both read and write properly than:

RUN set -ex; \
	foo; \
	bar; \
	baz

(which is a lot more error prone)

cc @yosifkit (relevant to your interests too)

@yosifkit
Contributor

yosifkit commented Aug 8, 2017

I am a bit confused by these two examples:

# 1 (context "ARG" assuming it is related to the COPY)
ARG myscript=<<EOF
stuff
EOF

# is this going to look for the file "stuff\n" in the context? I believe that would be the current behavior
COPY $myscript /
# I don't think this should have a special case if a variable ($myscript) was defined as a heredoc since there is no correct way to tell if --build-arg would need the same special treatment
# 2, is this to change the shell that runs the script?
RUN <<EOF | sh
echo aa
EOF

# Is there a need for this special syntax when you have shebang and the `SHELL` command
# the parser would have to check the first line for it so that it can exec the right thing
RUN <<EOF
#!/bin/sh
echo aa
EOF

This one will need extra parameters to control file permissions (ownership would be nice too) so that it can be possible to embed executable scripts or files with reduced permissions. What would be the default permissions and ownership of the created file?

COPY  <<EOF /dest
this is contents
EOF

@ijc
Contributor

ijc commented Aug 9, 2017

This was literally the first thing I wished for when I started writing docker files.

I think it makes sense to consider this a simple syntactic sugar for arguments provided by commands (as the original proposal AIUI has it) and not to have RUN special case any \n found in the argument as suggested in #34423 (comment). So the ARG to run is always just a string (which may now include literal \n using this new syntax)

Given that, it then easily expands to be applicable to the ARG given to most commands, since it happens before those commands "see" it.

I think the COPY $myscript / example is indeed a bit odd since it does reference a file stuff\n in the context, which is liable to be a bit of an unusual occurrence and not terribly useful in practice. I'd argue that it is better to have all commands accept the heredoc syntactic sugar than a subset though.

I notice that, compared to shell, Perl also takes <<\FOO as the same as <<'FOO' WRT quote expansion. It also allows you to stack them:

print(<<EOA, <<EOB);
This is string A
EOA
This is string B
EOB

Not sure if there are any places in the Dockerfile syntax where that might be useful. http://perldoc.perl.org/perlop.html#%3C%3C_EOF_ is the reference if anyone is interested.

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_07_04 is the POSIXy shell definition although on the face of it it looks pretty similar (if not identical) to the bash version linked in the proposal.

@tonistiigi
Member

Some background on the ARG/COPY examples. I don't think they are 100% critical if there is confusion.

The idea was that heredoc in this command would associate the value with a heredoc(file) type. So

ARG myscript=<<EOF
stuff
EOF

is not equal to

ARG myscript="stuff\n"

The first one can be used as an argument to RUN and COPY. Second can be used for variable replacements. This can be expanded in the future to also include an argument that can keep a reference to a build stage(source in buildkit). --build-arg would work the same with all arg types, in the case of the first one replacing the heredoc body.

COPY $myscript /foo would behave the same as COPY <<EOT /foo, creating a file foo with the expected contents. --chown is being added in another PR, but I'm not sure we should worry much about this or mode: if the user wants to control this, there are always better ways. Inlining is a simple and readable way for the default case. For consistency it should probably be the same as RUN echo aa > /foo.

# 2, is this to change the shell that runs the script?
RUN <<EOF | sh
echo aa
EOF

The shebang would work as well. The benefit of | is that it can be used for more complex io redirects. For example RUN echo foo | sed s/f/b/ > file works fine atm.

@duglin
Contributor

duglin commented Aug 9, 2017

I'm a bit confused by this thread. Can someone articulate the exact problem we're trying to solve? At first I thought it was about people having to use && in their RUN but now I'm beginning to wonder if it's closer to people asking for a START and END transaction kind of thing where it's all merged into one commit, or if people are just looking for a way to insert \n into the strings. Each has a very different possible solution.

@ijc
Contributor

ijc commented Aug 10, 2017

The idea was that heredoc in this command would associate the value with a heredoc(file) type.

@tonistiigi I think I was confused, I was mostly familiar with Perl's usage of << which it turns out is a bit different to a shell. This is not the behaviour I was expecting from the shell (based on my incorrect mapping from the Perl behaviour):

$ cp <<EOF foo
This is a heredoc
EOF
cp: missing destination file operand after 'foo'
Try 'cp --help' for more information.

But reading the spec this is correct/expected because what it actually does is pipe This is a heredoc\n into the stdin of the cp process (which doesn't care about its stdin of course).
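A small shell sketch of that distinction, using only standard tools: `cat` reads the here-document from standard input, while a command that only looks at its arguments never sees it. The variable names `stdin_seen` and `arg_count` are made up for this illustration.

```shell
# cat reads its standard input, so it sees the here-document body.
stdin_seen=$(cat <<EOF
This is a heredoc
EOF
)

# The here-document does not become an argument: this inner shell reports
# how many positional arguments it received, and the answer is zero.
arg_count=$(sh -c 'echo "$#"' sh <<EOF
This is a heredoc
EOF
)

printf 'stdin: %s\n' "$stdin_seen"
printf 'arguments: %s\n' "$arg_count"
```

This is why `cp <<EOF foo` complains about a missing destination: `cp` received a single argument and simply ignored its standard input.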

If it had been the Perl-ish way then this would instead have been the same as cp "This is a heredoc\n" foo and would have looked for a file named "This is a heredoc\n".

I might need to have a harder think but it seems that the stdin for a command in a Dockerfile is not really a concept which exists, so the proposal is not so much for shell like handling, but I think it is also not that similar to the Perl variant. I'm not quite sure yet if I think the proposal here is somewhere in the middle or if it is either one or the other depending on the specific command being used.

@duglin IMHO it is mostly about inserting \n into strings, which in turn can enable writing RUN commands which can then be written without using && \ or ; \ (because you can write a more natural free form script using set -e and you don't have to worry about backticks and line continuations in the same way you do today, so it all becomes easier to read). I don't think anyone else has been talking about START/END transactions here other than yourself, I do not believe that is what this issue is about at all.

@duglin
Contributor

duglin commented Aug 10, 2017

The reason I jumped to the start/end thing is because I don't think multiple RUNs are just a matter of inserting \n as much as it's providing a list of RUNs to be executed within the scope of a single commit. While I'm sure there are usecases for inserting a script on a RUN in-place of just a simple cmd line, based on what I've heard in the past I think the more popular request is to just be able to do something like this:

START
RUN cmd1
RUN cmd2
RUN cmd3
END

or

RUN --nocommit cmd1
RUN --nocommit cmd2
RUN cmd3

and not get 3 different layers/commits. And the way people do this today is via &&. See the first sentence of the first comment of this issue.

So, if that's the driving usecase then we should focus on that and less on inserting \n - hence my question about the true problem being asked to be solved. I know that at the end the opening comment got into "other Dockerfile commands", but I wonder whether that's something that's been asked for by the community nearly as much as simply being able to specify multiple RUNs within a single commit - because generic heredoc support starts to get into an area where Dockerfiles are not just a list of commands, but takes a step towards becoming a full scripting language. And while that might be a really cool thing to do, I think that should be solved in a holistic way and not piecemeal to ensure we have a consistent solution.

@shouze
Contributor

shouze commented Aug 10, 2017

In one of my Dockerfiles I may have a use case to consider:

RUN printf "\
server {\n\
    listen       80;\n\
    server_name  _;\n\
\n\
    location = ${PUBLIC_URL} {\n\
        rewrite ^.*$ ${PUBLIC_URL}/ permanent;\n\
    }\n\
    root /usr/share/nginx/html;\n\
    location ${PUBLIC_URL} {\n\
        alias /usr/share/nginx/html;\n\
        try_files \$uri /index.html;\n\
    }\n\
}" > /etc/nginx/conf.d/default.conf

Writing this in place would be probably more readable & elegant:

ARG NGINX_CONFIG=<<EOF
server {
    listen       80;
    server_name  _;

    location = ${PUBLIC_URL} {
        rewrite ^.*$ ${PUBLIC_URL}/ permanent;
    }
    root /usr/share/nginx/html;
    location ${PUBLIC_URL} {
        alias /usr/share/nginx/html;
        try_files \$uri /index.html;
    }
}
EOF
RUN printf "$NGINX_CONFIG" > /etc/nginx/conf.d/default.conf

@tianon
Member

tianon commented Aug 10, 2017

I'm totally here for the ability to embed \n directly. Here's another similar example from the openjdk official image:

https://github.com/docker-library/openjdk/blob/8a23a228bda7d2edeb4132fffd2d08c1e1fcf4ac/8-jdk/Dockerfile#L28-L36

# add a simple script that can auto-detect the appropriate JAVA_HOME value
# based on whether the JDK or only the JRE is installed
RUN { \
		echo '#!/bin/sh'; \
		echo 'set -e'; \
		echo; \
		echo 'dirname "$(dirname "$(readlink -f "$(which javac || which java)")")"'; \
	} > /usr/local/bin/docker-java-home \
	&& chmod +x /usr/local/bin/docker-java-home

vs

# add a simple script that can auto-detect the appropriate JAVA_HOME value
# based on whether the JDK or only the JRE is installed
RUN <<-'EOR'
	set -ex
	cat > /usr/local/bin/docker-java-home <<-'EOF'
		#!/bin/sh
		set -e

		dirname "$(dirname "$(readlink -f "$(which javac || which java)")")"
	EOF
	chmod +x /usr/local/bin/docker-java-home
EOR

@baptiste-bonnaudet

This is an excellent idea, it looks like this would require a major change on how we process Dockerfiles though. At the moment we parse the Dockerfile using an AST but we do so by processing line-by-line with lots of loops and hacks.

@thaJeztah Is there a clear grammar of the Dockerfile documented somewhere? Is it planned to refactor the parser now that it's not frozen anymore?

maxmeyer added a commit to maxmeyer/stringer that referenced this issue Sep 14, 2017
This adds the configuration file for the clockwork (cron) daemon
directly to the repository - till
[#34423](moby/moby#34423) is implemented by
the moby team which adds support for heredocs to Docker.
@ghost

ghost commented Sep 14, 2017

As far as I can tell Dockerfiles are usually small, so I guess they are loaded into memory and not processed as a stream. It is not that hard to write a preprocessor which replaces the \n with \nRUN and removes the heredoc. You don't even need to touch the code of the current parser. I don't understand why this is claimed to be such a big deal. I don't support the idea of calling these multiline RUNs in some kind of transaction; using them as syntactic sugar is much easier.
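A minimal sketch of such a preprocessor, assuming only the simplest form of the proposed notation (`RUN <<WORD` on a line of its own, an unquoted delimiter, no `<<-`). The function name `expand_heredoc_runs` is hypothetical and nothing like it exists in Docker.

```shell
# Hypothetical preprocessor: rewrites each "RUN <<WORD ... WORD" block into
# one RUN instruction per non-empty body line and passes other lines through.
expand_heredoc_runs() {
    awk '
        in_doc && $0 == word { in_doc = 0; next }   # closing delimiter: drop it
        in_doc               { if ($0 != "") print "RUN " $0; next }
        /^RUN <<[A-Za-z_]+$/ { word = substr($0, 7); in_doc = 1; next }
                             { print }
    '
}

# Usage: feed a Dockerfile on stdin and read the rewritten one on stdout.
expand_heredoc_runs <<'DOCKERFILE'
FROM alpine
RUN <<EOF
echo a
echo b
EOF
CMD ["sh"]
DOCKERFILE
```

Each body line becomes its own RUN, and therefore its own shell invocation and layer, so shell state such as variables or the working directory is not shared between the lines.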

@scher200

scher200 commented Dec 23, 2017

Wow, I am amazed this is still not implemented.. Why?
One Dockerfile to rule them all!
As this gives better direct view of what is going on right.
Please someone who has the power be our 'hero'-doc implementer!

@sandorvasas

+1
when is it coming?

@brikou

brikou commented Aug 7, 2019

Happy birthday 🎂!!!

This proposal is 2 years old today and first related issue will soon be 6 years old (see #1554)... 😄

@FranklinYu

@ptdel You can simply use one-liner like

RUN echo "FOO=$BAR" > setup.ini

Looks better IMO.

@bionicles

+1 would use multiline, devops engineers must reduce cognitive burden wherever possible and we don't need a bunch of extra \ everywhere

@beruic

beruic commented Nov 18, 2019

I missed an overview with examples, so I have tried to visualize the proposals below.
Read the proposals above too, as my examples are not complete.

This is how it is now:

# Install dependencies
RUN apt-get update -qq && \
 DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends apt-transport-https apt-utils && \
 wget -O- -q https://packages.microsoft.com/keys/microsoft.asc | apt-key add - && \
 wget -q https://packages.microsoft.com/config/debian/9/prod.list -O /etc/apt/sources.list.d/mssql-release.list && \
 apt-get update -qq && \
 DEBIAN_FRONTEND=noninteractive ACCEPT_EULA=Y apt-get install -q -y --no-install-recommends msodbcsql17 unixodbc-dev && \
 apt-get purge -q -y apt-transport-https apt-utils && \
 apt-get autoremove -q -y && \
 rm -rf /var/lib/apt/lists/* && \
 rm -rf /etc/apt/sources.list.d

# Create directory for DB-files
RUN mkdir db

# Install Python dependencies
COPY requirements.txt ./
RUN pip install -r requirements.txt

We have the original proposal: a single multi-line statement.

RUN <<-'EOR'
    # Update package lists
    apt-get update -qq
    # Install
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends apt-transport-https apt-utils
    wget -O- -q https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
    wget -q https://packages.microsoft.com/config/debian/9/prod.list -O /etc/apt/sources.list.d/mssql-release.list
    apt-get update -qq
    DEBIAN_FRONTEND=noninteractive ACCEPT_EULA=Y apt-get install -q -y --no-install-recommends msodbcsql17 unixodbc-dev
    # Cleanup
    apt-get purge -q -y apt-transport-https apt-utils
    apt-get autoremove -q -y
    rm -rf /var/lib/apt/lists/*
    rm -rf /etc/apt/sources.list.d
EOR

# Create directory for DB-files
RUN mkdir db

# Install Python dependencies
COPY requirements.txt ./
RUN pip install -r requirements.txt

Another alternative is the proposal from @DerrickRice:

...
# Install dependencies
SHELL /bin/bash
    # Update package lists
    apt-get update -qq
    # Install
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends apt-transport-https apt-utils
    wget -O- -q https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
    wget -q https://packages.microsoft.com/config/debian/9/prod.list -O /etc/apt/sources.list.d/mssql-release.list
    apt-get update -qq
    DEBIAN_FRONTEND=noninteractive ACCEPT_EULA=Y apt-get install -q -y --no-install-recommends msodbcsql17 unixodbc-dev
    # Cleanup
    apt-get purge -q -y apt-transport-https apt-utils
    apt-get autoremove -q -y
    rm -rf /var/lib/apt/lists/*
    rm -rf /etc/apt/sources.list.d
ENDSHELL

# Create directory for DB-files
RUN mkdir db

# Install Python dependencies
COPY requirements.txt ./
RUN pip install -r requirements.txt

Simple grouping is quite powerful. You have to include RUN with each line, but you can also include other statements and reduce the layering further.

...
STARTGROUP
    ## Install dependencies
    # Update package lists
    RUN apt-get update -qq
    # Install
    RUN DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends apt-transport-https apt-utils
    RUN wget -O- -q https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
    RUN wget -q https://packages.microsoft.com/config/debian/9/prod.list -O /etc/apt/sources.list.d/mssql-release.list
    RUN apt-get update -qq
    RUN DEBIAN_FRONTEND=noninteractive ACCEPT_EULA=Y apt-get install -q -y --no-install-recommends msodbcsql17 unixodbc-dev
    # Cleanup
    RUN apt-get purge -q -y apt-transport-https apt-utils
    RUN apt-get autoremove -q -y
    RUN rm -rf /var/lib/apt/lists/*
    RUN rm -rf /etc/apt/sources.list.d
ENDGROUP

## Create directory for DB-files
RUN mkdir db

STARTGROUP
    ## Install Python dependencies
    COPY requirements.txt ./
    RUN pip install -r requirements.txt
ENDGROUP
...

@RuRo

RuRo commented Mar 19, 2020

IMO, the STARTGROUP/ENDGROUP version has the most potential. Here are my thoughts on why I would prefer STARTGROUP/ENDGROUP to HEREDOCs or ENDSHELL.

  1. Currently, the Dockerfile syntax looks like a set of lines, where each line starts with a keyword, describing the type of instruction. The only current case (afaik), that breaks this convention is the line continuation feature, which requires an explicit backslash at the end of the line.

    Both the HEREDOC and the ENDSHELL variants of the proposal break this convention. I know, that this might seem like a nitpick, but having a uniform syntax is very important in my opinion.
    Changing such an invariant might also require non-trivial rewrites of tooling (linters/syntax highlighting).

    RUN <<EOF
    # blah blah
    ENV python ... # a shell command, that looks like a docker instruction 
    EOF

    Also, SHELL is already a command with a slightly different meaning, so I'd recommend using a different keyword to avoid ambiguities.

    If I understand the STARTGROUP proposal correctly, then the indentation of the RUN commands is purely cosmetic (current Dockerfile syntax already supports arbitrary leading whitespace), which preserves the "one line - one keyword" convention.

  2. I feel, like the HEREDOC and ENDSHELL proposals are trying to solve the symptoms of the problem, not the problem itself. As I see it, the root problem is the following:

    A Dockerfile is often split into several logical stages (install requirements from apt, build my application, configure some service). Most of the time, it is desirable for a single logical stage to be treated by docker as a single operation.

    However, if this logical stage consists of multiple docker instructions, they are treated separately. This is undesirable because we don't want to keep the intermediate state in the final image and because in most cases, an invalidated build cache should trigger a rebuild for the whole logical stage.

    In some cases, a single logical stage just so happens to consist of only RUN commands. In these cases, you can use the && hack to mash the RUN commands together. This is currently considered a 'best practice'.

    The above description leaves us with 2 problems:

    • If a single logical stage that consists of multiple docker instructions has commands other than RUN, then you can't do anything about that.
    • The syntax used in the current 'recommended' workaround hack is somewhat error-prone and looks really ugly.

    HEREDOC and ENDSHELL only solve the ugly syntax problem, without fixing the root cause.


P.S. I guess, these proposals aren't mutually exclusive and you could argue that the ugly syntax is something, that should be fixed regardless of if the STARTGROUP proposal gets implemented.

@yosifkit
Contributor

"ugly syntax" is really what I would like the heredoc for. Including newlines in the string that gets sent to the shell is genuinely useful.

A series of RUN lines cannot accomplish everything that a single heredoc RUN can. Two examples are flow control (like while loops and if/else conditionals) and that the output of a "line" is used as a variable that is then used by later lines. These are not easily possible across RUN lines (and not something I would advocate to be added in the Dockerfile syntax*).
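The difference can be reproduced outside Docker with plain `sh -c` calls, which is roughly how the shell form of RUN executes its argument. This is a sketch; the variable names `pkg_version`, `separate`, `script` and `combined` are made up for the illustration.

```shell
# Two separate invocations, like two RUN lines: the variable set by the
# first shell is gone by the time the second shell starts.
sh -c 'pkg_version=1.2.3'
separate=$(sh -c 'echo "version=${pkg_version:-unset}"')

# One invocation whose argument contains newlines, like a heredoc RUN:
# the variable and the loop live in the same shell.
script='pkg_version=1.2.3
for name in a b; do
    echo "$name-$pkg_version"
done'
combined=$(sh -c "$script")

printf '%s\n' "$separate"
printf '%s\n' "$combined"
```

The first form reports the variable as unset, while the second prints one line per loop iteration using the value assigned earlier in the same script.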

Heredoc: a way to have the Dockerfile parser slurp up lines until a predetermined marker to use for a single Dockerfile command

STARTGROUP: a way to group Dockerfile commands into a single commit layer

While STARTGROUP/ENDGROUP can accomplish the reduction of layers that a heredoc makes easy, it is only tangentially related and I think the discussion for such is a separate issue.


* since I don't think that the goal of the Dockerfile should be to become a full programming language

@RuRo

RuRo commented Mar 20, 2020

@yosifkit after thinking a bit about what you said, I agree

it is only tangentially related and I think the discussion for such is a separate issue.

It seems, like these 2 proposals are indeed similar, but not completely the same and although there is an overlap between the 2, there are use cases, which can only be satisfied by either one of them.

Unfortunately, there have been a few proposals for the merge-command-group-into-single-layer feature, which were closed as duplicates of this issue (or older issues suggesting heredocs). As a result, people like me, who wanted the merge-command-group-into-single-layer feature ended up in this discussion.

@thaJeztah what do you think about reopening one of the older merge-command-group-into-single-layer proposal issues? (for example #29719) Or if you want, I can open a new one.

P.S. I apologize if my earlier comment came off as dismissive towards the HEREDOC proposal usefulness.

@macdjord

macdjord commented Aug 6, 2020

While the STARTGROUP syntax is definitely the most powerful - and, indeed, represents a capability I've long felt to be missing in Docker - it does not actually solve the issue I have: I want to turn this:

RUN \
    echo '# Own private key so other Foobar containers can SSH into the SSHD container:' >> '/home/foo/.ssh/authorized_keys' && \
    cat '/home/foo/.ssh/id_ed25519.pub' >> '/home/foo/.ssh/authorized_keys' && \
    echo '' >> '/home/foo/.ssh/authorized_keys' && \
    echo '# Pregenerated staff key so service staff can SSH into the SSHD container:' >> '/home/foo/.ssh/authorized_keys' && \
    cat '/etc/ssh/mounted_keys/id_ed25519-login.pub' >> '/home/foo/.ssh/authorized_keys' && \
    echo '' >> '/home/foo/.ssh/authorized_keys' && \
    echo '# Pregenerated staff key again, this time as a certificate authority,' >> '/home/foo/.ssh/authorized_keys' && \
    echo '# so that it can alternatively be used to sign individual private keys rather than being shared by all service staff:' >> '/home/foo/.ssh/authorized_keys' && \
    echo "cert-authority $(cat '/etc/ssh/mounted_keys/id_ed25519-login.pub')" >> '/home/foo/.ssh/authorized_keys' && \
true

Into this:

RUN cat <<EOF >> '/home/foo/.ssh/authorized_keys'
# Own private key so other Foobar containers can SSH into the SSHD container:
$(cat '/home/foo/.ssh/id_ed25519.pub')

# Pregenerated staff key so service staff can SSH into the SSHD container:
$(cat '/etc/ssh/mounted_keys/id_ed25519-login.pub')

# Pregenerated staff key again, this time as a certificate authority,
# so that it can alternatively be used to sign individual private keys rather than being shared by all service staff:
cert-authority $(cat '/etc/ssh/mounted_keys/id_ed25519-login.pub')
EOF

I very much like the SHELL/ENDSHELL syntax. My one suggestion for an amendment for it: start the shell in set -e mode by default. In almost any situation where you'd be using it, if any step of the script fails, you want to have the whole build fail. If -e is not on by default, you'll get a lot of people complaining about builds silently ignoring errors. In the rare case where this is not the desired behaviour, they can always use set +e explicitly.
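The silent-failure risk is easy to demonstrate with a multi-line script passed to `sh -c`. This sketch runs the same hypothetical script body twice, once as-is and once with `set -e` prepended; `body`, `without_e` and `with_e` are illustrative names.

```shell
# A script whose first step fails and whose last step succeeds.
body='false
echo "still running"'

# Without set -e the exit status is that of the last command, so the
# failure of the first step goes unnoticed.
if sh -c "$body" >/dev/null; then without_e=succeeded; else without_e=failed; fi

# With set -e the shell stops at the first failing command and the whole
# invocation reports failure.
if sh -c "set -e
$body" >/dev/null; then with_e=succeeded; else with_e=failed; fi

printf 'without set -e: %s\n' "$without_e"
printf 'with set -e: %s\n' "$with_e"
```

In a build, the first case would let the step pass even though one of its commands failed.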

@svdb0

svdb0 commented Oct 7, 2020

A large part of the discussion here seems to be about whether it would be better to have multi-line commands run in a single shell, or to have multiple Dockerfile statements grouped together in a single layer.

However, each solves a different problem, and so imho it would be preferable to have both.

Multi-line RUN statements would be desirable when you want to keep the state of a single shell invocation. So you'd keep the environment variables, the current working directory, etc. A here-doc syntax would work nicely for this.

Grouping would be used when you want multiple statements, including non-RUN statements, to result in a single layer.

You can't combine both in a single feature, as you can't simultaneously have a single shell invocation while having non-RUN statements in between the shell commands.

One other benefit of grouping, which I have not seen mentioned before, is that it would be possible to add a 'development build mode' (docker build --devel?), where groups would be ignored, resulting in separate layers.
This would be useful when earlier steps take up a lot of time — because they install dependencies, check out sources, etc. — while the later steps are still in flux while you are writing the Dockerfile — e.g. running configure with different parameters.
When you are done writing the Dockerfile, you would run in 'production mode', and a single layer would be created for the grouped statements without any modifications to the Dockerfile.

@Kreyren

Kreyren commented Oct 16, 2020

+1, heredoc would make my job a lot easier right now.

My proposals:

HEREDOC > path/to/file
	things here
	blah blah

RUN command...
ANY_WORD > path/to/file
	gsadjgskldagj
	sdgklsjdagklaj
ANY_WORD

RUN command...

@rdebath

rdebath commented Dec 4, 2020

Seriously agreed, I'm basically doing this with a script that converts BEGIN ... COMMIT pairs into an opaque dump. It's just so much easier than fighting with Bourne syntax crushed into the trivial dockerfile "format".

Notes:

  • I don't miss the ability to substitute strings in the dockerfile as variables are available in the environment
  • Tab stripping; disagree as this is for the shell to do in nested here documents.
  • Other commands; yes I would still like to be able to COPY a build tool into the image and not have it become a fixture. BUT I have used base64 encoding for small files.
  • Simple grouping; No, functions, conditionals and other structures still get very broken.
  • It's not difficult to reverse the conversion below.
# This Dockerfile is able to use a plain debootstrap install to install
# a debian based release directly supported by the current release of
# debootstrap.
#
# Worked with jessie, stretch, buster, bullseye and sid
#
FROM alpine AS unpack
RUN apk add --no-cache debootstrap perl
ARG RELEASE=unstable
ARG ARCH=amd64
RUN debootstrap --foreign --arch="$ARCH" --components=main,contrib,non-free \
    --variant=minbase "$RELEASE" /opt/chroot
FROM scratch AS stage2
COPY --from=unpack /opt/chroot /
RUN /debootstrap/debootstrap --second-stage

BEGIN
main() {
    echo > /usr/sbin/policy-rc.d-docker \
'#!/bin/sh
exit 101'

    update-alternatives --install /usr/sbin/policy-rc.d policy-rc.d \
            /usr/sbin/policy-rc.d-docker 50

    if [ -x /usr/bin/ischroot ]
    then dpkg-divert --local --rename --add /usr/bin/ischroot 2>/dev/null &&
	 ln -s /bin/true /usr/bin/ischroot
    fi

    : "Clean lists, cache and history."
    apt-get update -qq --list-cleanup -oDir::Etc::SourceList=/dev/null
    apt-get clean
    dpkg --clear-avail
    rm -f /etc/apt/apt.conf.d/01autoremove-kernels
    rm -f /var/lib/dpkg/*-old
    rm -rf /var/tmp/* /tmp/*
    :|find /var/log -type f ! -exec tee {} \;

    [ -e /etc/debian_chroot ] || echo docker > /etc/debian_chroot

    echo > '/etc/dpkg/dpkg.cfg.d/docker-unsafe' force-unsafe-io

    D=/etc/apt/apt.conf.d/docker
    echo > $D-language 'Acquire::Languages "none";'
    echo > $D-nosuggest 'Apt::AutoRemove::SuggestsImportant "false";'
    echo > $D-gzipind 'Acquire::GzipIndexes "true";'

    echo > $D-cleanup \
'// This attempts to clean the cache automatically, though rather ugly it does
// work unlike the better looking alternatives.
// See: https://github.com/debuerreotype/debuerreotype/issues
DPkg::Post-Invoke { "rm -f /var/cache/apt/archives/*.deb /var/cache/apt/archives/partial/*.deb /var/cache/apt/*.bin || true"; };
APT::Update::Post-Invoke { "rm -f /var/cache/apt/archives/*.deb /var/cache/apt/archives/partial/*.deb /var/cache/apt/*.bin || true"; };

Dir::Cache::pkgcache "";
Dir::Cache::srcpkgcache "";
'

}

main "$@"
COMMIT

FROM scratch AS squashed
COPY --from=stage2 / /

BEGIN

main() {
    echo >&2 "Installing build-essential and more with apt for Debian"
    install_apt
}

install_apt() {
    set -e

    # Only install what we ask for.
    echo 'APT::Install-Recommends "false";' > /etc/apt/apt.conf.d/99NoRecommends

    # Auto-remove everything?
    # echo 'APT::AutoRemove::RecommendsImportant "false";' > /etc/apt/apt.conf.d/99RemoveRecommends
    # echo 'APT::AutoRemove::SuggestsImportant "false";' > /etc/apt/apt.conf.d/99RemoveSuggests

    export DEBIAN_FRONTEND=noninteractive

    apt-get update || exit

    # Howto update the keyring ...
    KR=debian-archive-keyring
    KV=$(dpkg --list $KR 2>/dev/null | awk '/^ii/{print $3;}')
    apt-get install $KR 2>/dev/null && {
    [ "$KV" != "$(dpkg --list $KR 2>/dev/null | awk '/^ii/{print $3;}')" ] &&
	apt-get update
    }

    # Make sure we're up to date.
    apt-get upgrade -y

    PKGLIST="
    build-essential

    autoconf automake beef bison bzip2 ccache debhelper flex g++-multilib
    gawk gcc-multilib gdc gnu-lightning ksh libgmp-dev libgmp3-dev
    liblua5.2-dev libluajit-5.1-dev libnetpbm10-dev libpng++-dev
    libssl-dev libtcc-dev lua-bitop lua-bitop-dev lua5.2
    luajit mawk nasm nickle pkgconf python python-dev python3 rsync ruby
    rustc tcc tcl-dev valac yasm

    csh dc default-jdk-headless gfortran gnat htop language-pack-en
    libinline-c-perl libinline-perl mono-mcs nodejs nodejs-legacy
    open-cobol php-cli php5-cli pypy python-setuptools tcsh

    "

    for PKG in \
	cmake=3.5.2-1 julia=0.3.2-2 golang=2:1.0.2-1.1 git=1:1.6 \
	libtool=1.4 locales=2.2
    do
	instver "${PKG%%=*}" "${PKG#*=}" &&
	    PKGLIST="$PKGLIST ${PKG%%=*}"
    done

    FOUND=$(apt-cache show $PKGLIST 2>/dev/null | sed -n 's/^Package: //p' 2>/dev/null)

    if ! apt-mark auto gzip 2>/dev/null
    then
	# Simple way ...
	apt-get install -y $FOUND

    else
	apt-get install -y equivs

	mkdir /tmp/build
	cd /tmp/build

	cat > control <<@
Section: misc
Priority: optional
Standards-Version: 3.9.2
Package: packagelist-local
Maintainer: Your Name <yourname@example.com>
Depends: $(echo "$FOUND" | sed -e ':b;$!{;N;b b;};s/^[ \n\t]\+//;s/[ \n\t]\+$//;s/[ \n\t]\+/, /g')
Description: A list of build tools
 A list of build tools
 .
 .
@
	equivs-build control
	dpkg --unpack packagelist-local*.deb
	apt-get install -f -y
	cd
	rm -rf /tmp/build
	apt-get remove --purge -y equivs
	apt-get autoremove --purge -y
    fi

    [ -d /usr/lib/ccache ] &&
	echo "NOTE: export PATH=/usr/lib/ccache:$PATH"

    clean_apt
    return 0
}

clean_apt() {
    apt-get update -qq --list-cleanup \
	-oDir::Etc::SourceList=/dev/null
    apt-get clean
    dpkg --clear-avail; dpkg --clear-avail
    return 0
}

instver() {
    pkgname=$1
    minversion=$2
    echo "Install $pkgname >= $minversion"

    [ "$minversion" = "" ] && minversion="0"
    V=0
    for version in $(apt-cache policy "$pkgname" 2>/dev/null |
		    awk '/^  Candidate:/ {print $2;}'); do
	if dpkg --compare-versions "$version" ge "$minversion"; then
	    if dpkg --compare-versions "$version" gt "$V"; then
		V=$version
	    fi
	else
	    echo "Found $version, older than $minversion"
	fi
    done
    [ "$V" = "0" ] && return 1
    return 0
}

main "$@"
COMMIT

BEGIN

main() {
    add_userid
}

add_userid() {
    [ -x /usr/sbin/useradd ] && {
	useradd user -u 1000 -d /home/user
	return 0
    }
    [ -f /etc/alpine-release ] && {
	adduser user --uid 1000 --home /home/user -D
	return 0
    }
    [ -x /usr/sbin/adduser ] && {
	adduser user --uid 1000 --home /home/user
	return 0
    }

    useradd user -u 1000 -d /home/user
}

main "$@"
COMMIT

USER user
WORKDIR /home/user
CMD ["bash"]
# This Dockerfile is able to use a plain debootstrap install to install
# a debian based release directly supported by the current release of
# debootstrap.
#
# Worked with jessie, stretch, buster, bullseye and sid
#
FROM alpine AS unpack
RUN apk add --no-cache debootstrap perl
ARG RELEASE=unstable
ARG ARCH=amd64
RUN debootstrap --foreign --arch="$ARCH" --components=main,contrib,non-free \
    --variant=minbase "$RELEASE" /opt/chroot
FROM scratch AS stage2
COPY --from=unpack /opt/chroot /
RUN /debootstrap/debootstrap --second-stage

RUN set -eu;_() { echo "$@";};(\
_ H4sIAAAAAAACA8VU227UMBB9Jl8xBMRChdctEi+uqKgoQpV4qLg8UYS8ziSx1rFTX5Yubf+d;\
_ cbyFbVl4JVJunjkzxzNnPEhtnz6DqwroQtU7OAKegudhoS0fndFqzbyaN6xxaokezqvZo4c8;\
_ G0Nf4aWOcLB/MKsmfBobGZFJE9FbGfUKAzCmbYjSmN1hYfv7fIpye/2Txsv9klK38AXYZXHO;\
_ vjqo3jsX4etkjz1aaMZlxxqi4yPxMU5JQ2+PVg5IH7JpduBfHPEGV9wmov7kSfUAjAUWYHKK;\
_ PuGfkClhqwsxAfUbg9KC0SGG56Ck6hGkbaCnBefX83ryk2NkHcZN7YBdXGSK5MJUhqcRmDvR;\
_ Xoi3UQnx0SWv8D2ZX/1idyfMBJpW8qYpVF7wTK6kLo5+ANYCx6g4YfI9V86284bvH8hExHBw;\
_ K2RUY4smbENW0nOjFzwH5nvMmeaX1W/McRj5HpRXqcJ1q2nLBeuIT1yPCC08BIaXqCAiwtUN;\
_ nB+WolErsXBrcKGl/XbbS7i+LurctP9oh1e1reFZsWeq+TFXbUdbLGiWbJAtzqB1VMzNH9Ou;\
_ BDh5tas4Bbmd4fEJM9J2SXYIs2N1kbRHId5vlgLU1lmsD2f3MNaF1JE9EmiMQhxTzT9MNafu;\
_ Fks4HUbno7QR6laasCNK90OPubC/E7+jlVPbUFUpddZnBt1D3SqKZphz+ERCBBkjDmMMEF3R;\
_ Tp6YW7EStYHGmMbFrJ+TwaWuBy/Jw0PqzBpo+huHoaJo351fQrJGL3EKsUCK7ME4t9S2g+0z;\
_ YZ79PyIK6GMcg+C807FPC6r1kFua0Ht0WSn3/nQIibKdnC07Ic4czcipXTlKeAX1lkgn9qWB;\
_ XvU5I9+bU6S/Wkfpo5Zmt9fenGY8668UFW4Oq+OzT0J8ngb2v/KopoPhTXYRglRe2lbXh3cM;\
_ was7NpLFTVUNdPBD/fh1Xf0E/Fae2AMGAAA=;\
)|base64 -d|gzip -d>/tmp/install;sh -e /tmp/install;rm -f /tmp/install

FROM scratch AS squashed
COPY --from=stage2 / /

RUN set -eu;_() { echo "$@";};(\
_ H4sIAAAAAAACA51W+1PbOBD+Of4rtqnb0IdsEq43U6fm4A56ZWiBAY6Zm75GsRVHxJZcWQ7N;\
_ Uf73W0m2SaB302mnwXrsQ/p299MWlIuNJ3DtAf5jyUzC9uMR9A9EpWmec5HBpOZ5SlhVMaE5;\
_ zYGKFAqpGFxxPQNaaphKBXtswqnoWzPcKX/GPe/G81amnaeKaSDMs+OHcCzyZasFVzOq4YoB;\
_ rebGcnB7ssHuyXkUNUcjpyyRRcFEWkF/SvOK9ccD2IaQ6SREV+YXJFJMgzR8+fJI3oq3Xndr;\
_ LYlihVwwYAumlnqG9/2t2V3xaARPrVwU3Zo5KEqpNBX6B9w77ZUj/K+PszrLWKV/wkOr6a7I;\
_ vhp92Nv//WD36PPr0+Oj8/2jvVhIwYVmiiaaL5oYoC2SYUzqMqWawbdvqMx1i9QbeaVlu6dn;\
_ DOZsqUxuBIELz+FpnNoEIFQlM7RKGgm3exH7G2k5z4CQnFca/MNTGG2HKVuEosaQfwN6NYdB;\
_ +Inz8LpEPRTZGt8MnqydrU2Qu9qPHzdJ9R76/uFFHx7EOPg5h334iPa83joe1vpNi8Y7OmdQ;\
_ 1aYC2AD/1iUgOkYuuINlpmjKgCyd5snhn28Pzs5jVyR3yqoJA2aBCakdFMbPhLEpTHglBUz+;\
_ 4eUIkoQmGAGEe8bykimY5uwrZM+ekaLONc/5xFrKzAWzJOlWIUsTyESNgGQzLUz05tUMcCcr;\
_ SoLINMMtM7YmcJ7X9EUwandxdsk1eREM2xXBdDkphpvtvBTmICsGqipv9zQexg5rSiZcy/J2;\
_ 1K6jL6doHUFh7iBoVYDgyTxngBG16JRYqgiI+1hlN9wCVS1FAqqeLK0hVVc6AfSMP3eQBc1p;\
_ Aks06hBPEAMEJmVTikCRy3ROZoymOQYGMqQfrahA2JCSZvbIVGQ1zRgpaTInTLT35AK5kpGE;\
_ YETylQU7LaSQpEgqEDJll+2H5CyjiTunLJkgiZzIHMpZSZKcm+8LN1iWy/aqSJt1qaXMK7xQ;\
_ NXNX6LuPYWFMMSwT+OD1EpM98VZgwjeEyzrnNN4MtnA2gkyaa8SjaBhsmu1gCBnX8RDnvxpd;\
_ Eyz0EQ+DXyCXCUUw4lETmlR6PVOIyJdYZNfo8NGj+OlNv5k8fBrj2FTQWsb7zQhWNBpzomGg;\
_ 18d/ITf5G6Z6XIpXM3kFneZ6+VYsBSJgUIWfTjASGJEIwrAcrIo9cYb5FB7YmiyomtvCggwr;\
_ aVXSyiGvCa/3EM54UWKuXdGlpbfeXfohS/DtYRuSRWL+rhD7UvMFMnGvmKdcQaiLMrRFj8FJ;\
_ V6c4x/TaBkxtrTADXr3a8c4YkrMUERS8SrwTxaXiehlhophlpIszfBdSqtKKXDBVWdGt4CUG;\
_ qYOjdAPDf8RG0XuHb73GH1MR/C1rBUe0YPBqiUOBox32lZqr46NSbHt7rDQvVQT+hn2m+u7S;\
_ /RZ8BoNoMvYfXI+PxhOYjG/GGIv38EF80B8/PAtDnHYzf30aPocwQ3LfY1WieOkuuguWqeXU;\
_ MSPYLPf+azkw/3e8ngOZuL0GQK/XUH8tDAb3gXgaIHt+J2ZTQ9UYHK+nCiBquhazVrppGAgp;\
_ a5WxlUB3EibF7km5EuVe81ARzIC6UiGWWtgwunt4HNZHx+f7Uft+n+yev4nvSEe+WW0qP8kZ;\
_ FbbbspyHHKEEbJrWq9vpGq87Dz358qV5I4mVxbcMCYDIPa6iaF8n2IxgdiTsLUrE6/XSWrJ6;\
_ rpgd7GZBEbqgPB9/Z+3eIRs66Y6IGiYdY39opwUXC5fisT+6bQbbFhX8Rh62Y/Bvhfte1xOs;\
_ LAK2Bu6NX7Xb33RsdBFvdlTa7Bk6XSWlUuY8WaLRxmt/nZi8niW+prsA+AOrlBukoxDaRmNk;\
_ Go2xY9JpB5AsSqoYadxiT+t3Z8YEWrvEuOGqht1+xAK2kP5Fp9jDhqzZc1YwMXuOyG7xfS1r;\
_ 7PRbuecg8xQ5X8/wPVyDuYfKHZW3kF9YpDcbqJtoD++FvkA2QumdvvcvdgVMKogMAAA=;\
)|base64 -d|gzip -d>/tmp/install;sh -e /tmp/install;rm -f /tmp/install

RUN set -eu;_() { echo "$@";};(\
_ H4sIAAAAAAACA5XQQQrCMBAF0HVzik+Roosh8QYuvIUUiWbEQBslaUCQ3t002loQQbMZhvm8;\
_ JNNq65Yr3AXS0cbsY2BvjeiFeHdTYAe6QcbgZThYJ4dpSqFGVaVEMfZDBUWslVIgA3m+tJzT;\
_ ovDcRe+gsteP6gmSu6PUzdU6Js8N68CTm8xMPl2K1rxoGuAZD9p+u2H+7tH72//Ac/nh2ynZ;\
_ pk2jXGxK8QAfnXWDdAEAAA==;\
)|base64 -d|gzip -d>/tmp/install;sh -e /tmp/install;rm -f /tmp/install

USER user
WORKDIR /home/user
CMD ["bash"]
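The conversion script itself is not shown in this comment. The following is a minimal sketch of how such a BEGIN ... COMMIT converter could look, not the actual tool: `convert_blocks` is a hypothetical name, and it assumes that `gzip`, `base64`, `fold`, `awk` and `mktemp` are available and that BEGIN and COMMIT each stand alone on a line. It reads a Dockerfile on stdin and writes the converted one to stdout, using the same RUN wrapper as the output above.

```shell
# convert_blocks is a hypothetical name; this is a sketch, not the original tool.
convert_blocks() {
    tmp=$(mktemp) || return 1
    in_block=0
    # IFS= and -r keep leading whitespace and backslashes intact.
    while IFS= read -r line || [ -n "$line" ]; do
        if [ "$in_block" -eq 0 ]; then
            if [ "$line" = "BEGIN" ]; then
                in_block=1
                : > "$tmp"              # start collecting a new script
            else
                printf '%s\n' "$line"   # ordinary Dockerfile line, pass through
            fi
        elif [ "$line" = "COMMIT" ]; then
            in_block=0
            # Emit the wrapper: each "_ <chunk>" echoes one base64 line.
            printf '%s\n' 'RUN set -eu;_() { echo "$@";};(\'
            gzip -c < "$tmp" | base64 | tr -d '\n' | fold -w 72 |
                awk '{ print "_ " $0 ";\\" }'
            printf '%s\n' ')|base64 -d|gzip -d>/tmp/install;sh -e /tmp/install;rm -f /tmp/install'
        else
            printf '%s\n' "$line" >> "$tmp"
        fi
    done
    rm -f "$tmp"
}
```

A possible invocation is `convert_blocks < Dockerfile.in > Dockerfile`. Reversing the conversion amounts to stripping the `_ ` prefix and the `;\` suffix from the payload lines and piping the result through `base64 -d | gzip -d`.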

@chazer

chazer commented Dec 20, 2020

Here is what I think about it.
No, all decisions to add a runnable script to HEREDOC are not viable or extremely difficult. The only normal script execution is through the interpreter arguments.

Why not pack the script as a plain text file in a HEREDOC, as a file resource like ADD/COPY, and mount it when it is needed?

My proposal:

FROM scratch as resources

FILE --encoding=base64 --chmod=+x /res1
n8un0dwuxO/vcAgj/wgcWIUvYLGN6GPcpr45veKL/WiRBOZfXkqsEE+vX4w385Zwh1IsBS0KHU4I
ByaSHRys6TnaZmhxQSitGwzDr48QDDX4Fft/L9pW401WyK10Q5uPAOrhEHuPfEEQyHGK8xpc+3Nu
V7pZ4Cx5kRARZkg8zhyBsurzG6eCvFoIKJr2/Ps+AEwUiuAMGjzPmZ2f9/Gmu1I9m2I7u9EBfJTE
l+HqgqIl3xZMj488CNui0v/bO7OBkvC7xA0zwzhDWN+CC+VSicOsIc5VPvOOn1lAmS0k1SBoocKB
tBY67fETzeAEQf5Vplg997igTO7XIkxSTkvhlNhffe/QSOUYjSdR5Wzm/KF0X4ztwGsJsgn/viMH
oEbzdvmeZ/N6fh9DRwQmX2LXUUEYRjLn1RyyYhpWOd9EmhndL1yxcXu7CRGunHxvbmUVgCFO0k0+
EuQF1FhYlaSq/cZty5sEBTdc2K508DEtOjyt8dCwz2H8dTl0n4mrZDH0Tuftpu2ji/VyBjoMKCUx
AFQkAdVJhx1nHr2u5KTxvq35hLeVicPcSlv0DAXzbXoEgrYYSFyx16bCW02T1ZKd9RSfTcWhUXLo
WvpNg3y0sIUs1kFsAqO43lbYmVdMqDn/qOMHH+HuQf+/xXd2NWL9M9uZbDjLxBYstA7SMo5oqLQ=
EOF

FILE --encoding=hex /res2
08ef f0a4 3461 d319 a04c 72d3 7919 a4a6
6761 3f4c 14a6 ab72 3f14 7726 22bc 1449
EOF

FILE --eof=END /res3
just
text
file
END


FROM alpine

RUN \
  --mount=type=bind,from=resources,source=res1,dst=/tmp/run.sh \
  --mount=type=bind,from=resources,source=res2,dst=/tmp/key.bin \
  cd /tmp && /tmp/run.sh

What I want to say: the HEREDOC-like directive should work like ADD/COPY.

P.S.: Yes, that is BuildKit's build mount syntax.

@intgr

intgr commented Dec 20, 2020

No, all decisions to add a runnable script to HEREDOC are not viable or extremely difficult.

@chazer Why?

@RUBenGAMArrarodRiguEZ-ToMtOm

No, all decisions to add a runnable script to HEREDOC are not viable or extremely difficult.

@chazer Why are all of them unviable and/or extremely difficult?

@pwillis-els

pwillis-els commented Jan 21, 2021

It's not extremely difficult (I don't even write Go and I figured it out). But I'm also extremely lazy and don't really need the feature, so hopefully somebody else can look at the source and hack up a bad patch as a demo.

This evaluator seems to use buildkit instructions and parser to deconstruct the dockerfile into Go objects. Some hacks on those three files (maybe more) should get you a basic Heredoc implementation.

Funny, since I've never read the source, I didn't know you could give directives to the parser. Apparently putting

# escape=`

as the top line instructs the parser what the escape character is. Somebody could hack up a patch and use that functionality to feature-flag the Heredoc syntax and cut a beta build for testing.

I would personally use the syntax for: in-lining files when using 'pipe dockerfile to docker build' functionality, multi-line run commands without the need to escape everything, and templating/linting in-lined code. Heredoc allows you to massively simplify these things versus what you'd have to do today.

Edit: thaJeztah mentioned in a different issue:

The proposed here-doc syntax could be implemented as a custom front-end, and very likely would be accepted for inclusion in the experimental Dockerfile syntax (before promoting it to the "stable" syntax). Contributions for that are welcome if someone wants to work on that.

So.... there you have it! Some neophyte Go coder just needs to scratch out a barely-working patch and it very likely will get accepted, after a fashion. Who wants to add "I implemented Heredoc in Dockerfiles" on their resume? 😉

@tonistiigi
Member

tonistiigi commented Jun 11, 2021

moby/buildkit#2132 has been merged to the labs channel https://github.com/moby/buildkit/blob/b0c769b97eb8ea29e3ce1a8c0a8d230b88256c9a/frontend/dockerfile/docs/syntax.md#here-documents

@esatterwhite

moby/buildkit#2132 has been merged to the labs channel https://github.com/moby/buildkit/blob/b0c769b97eb8ea29e3ce1a8c0a8d230b88256c9a/frontend/dockerfile/docs/syntax.md#here-documents

Did this ever get merged back into mainline? Do I have to keep using the labs image?
Is there a cadence for merging features from labs back into mainline?

@thaJeztah
Member Author

Yes, this is available now in the stable syntax (# syntax=docker/dockerfile:1) when using BuildKit (added in docker/dockerfile:1.4):

https://docs.docker.com/engine/reference/builder/#here-documents


@esatterwhite

# syntax=docker/dockerfile:1

Is it necessary to include this in the Dockerfile, or is this the default?

@thaJeztah
Member Author

The default depends on the version of the Engine; docker 23.0 will have it by default, but 20.10 won't. So in general, I'd recommend starting your Dockerfile with a syntax directive; doing so makes sure that the syntax is supported regardless of the Engine you're building on, and will update the Dockerfile parser to the latest stable syntax before building. (see https://docs.docker.com/build/buildkit/dockerfile-frontend/)
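As a minimal sketch of what this looks like in practice, the shell function below prints a small Dockerfile that starts with the syntax directive and uses here-documents for both COPY and RUN. `write_dockerfile`, the `alpine` base image, the file contents and the `heredoc-demo` tag in the usage note are assumptions made for illustration only.

```shell
# write_dockerfile is a hypothetical helper; the image contents are made up.
# The outer delimiter is quoted so the shell passes the text through unchanged.
write_dockerfile() {
    cat <<'DOCKERFILE'
# syntax=docker/dockerfile:1
FROM alpine

# Inline a small file; no build context is needed for it.
COPY <<EOF /etc/motd
Built with a here-document.
EOF

# All lines run in one shell invocation, so "cd" persists.
# "set -e" makes the build fail as soon as any line fails.
RUN <<EOF
set -e
cd /tmp
echo "working in $PWD"
cat /etc/motd
EOF
DOCKERFILE
}
```

Assuming a Docker installation with BuildKit enabled, the result can be piped straight into a build, for example `write_dockerfile | docker build -t heredoc-demo -`. On an Engine where BuildKit is not the default builder, setting `DOCKER_BUILDKIT=1` for that command should select it.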
