Adds build option to squash newly built layers #9591

Closed
wants to merge 1 commit into from

jlhawn
Contributor

@jlhawn jlhawn commented Dec 10, 2014

This patch brings support for a --squash option to docker build.
If used, before a build completes like it normally would, it consolidates all
newly built layers, making all build steps result in only a single new image.

THIS DOES NOT AFFECT BUILD CACHE IN ANY WAY!

That's right, you still get all the advantages of the build cache (assuming
you don't specify --no-cache that is). An image built with the --squash
option still produces a layer for each build step, but also a second image
which has all of the changes consolidated into one layer.

If you use the --tag option along with --squash, it is the squashed layer
which gets tagged.

Docker-DCO-1.1-Signed-off-by: Josh Hawn josh.hawn@docker.com (github: jlhawn)
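
To make the idea of consolidating layers concrete, here is a minimal sketch of squashing, not the code in this patch. It assumes a simplified model where each layer is a dict from path to contents; the names `WHITEOUT` and `squash` are hypothetical and used only for illustration.

```python
# Hypothetical model: each layer is a dict mapping a path to its new
# contents, or to WHITEOUT when that build step deleted the path.
WHITEOUT = object()


def squash(layers):
    """Merge an ordered list of layer diffs (oldest first) into one diff."""
    merged = {}
    for layer in layers:
        for path, content in layer.items():
            # A later layer always overrides an earlier one. Deletions are
            # kept as markers because the path may still exist in the
            # ancestor image, which is not part of the squash.
            merged[path] = content
    return merged


layers = [
    {"/rootfs.tar": b"archive", "/bin/sh": b"v1"},
    {"/bin/sh": b"v2"},
    {"/rootfs.tar": WHITEOUT},
]
squashed = squash(layers)
```

The per-step layers are left untouched in this model, which mirrors why the build cache keeps working: the squashed diff is an extra result, not a replacement.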

@jlhawn
Contributor Author

jlhawn commented Dec 10, 2014

addresses #6906 #332 and probably others

@jlhawn
Contributor Author

jlhawn commented Dec 10, 2014

notably missing from this change: keeping command history

This should be relatively easy to salvage by simply walking the history up to the ancestor being squashed to and writing out a new runconfig where the "Cmd" entry is some combination of all of the steps being squashed. I don't want to work on that yet though. I'll wait to see if this PR gets any traction first - squash PRs have been submitted before and none that I am aware of seem to have gotten any traction.
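
A rough sketch of that history walk, assuming a hypothetical in-memory store keyed by image ID. The `IMAGES` dict, the `squashed_history` name, and joining the steps with `&&` are all assumptions made for illustration; the real runconfig format is not modeled here.

```python
# Hypothetical in-memory image store: image ID -> (parent ID, build step).
IMAGES = {
    "f60c56784b83": (None, "FROM scratch"),
    "a7b8b4122099": ("f60c56784b83", "MAINTAINER Jérôme Petazzoni <jerome@docker.com>"),
    "a936027c5ca8": ("a7b8b4122099", "ADD rootfs.tar /"),
    "5785b62b697b": ("a936027c5ca8", "CMD /bin/sh"),
}


def squashed_history(images, top, ancestor):
    """Collect build steps from `top` back to (not including) `ancestor`."""
    steps = []
    current = top
    while current != ancestor:
        if current is None:
            # Ran off the root of the chain without meeting the ancestor.
            raise ValueError(f"{ancestor!r} is not an ancestor of {top!r}")
        parent, step = images[current]
        steps.append(step)
        current = parent
    steps.reverse()  # oldest step first
    # Assumed combination format: join the steps into one "Cmd"-like string.
    return " && ".join(steps)


combined = squashed_history(IMAGES, "5785b62b697b", "f60c56784b83")
```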

@jlhawn
Contributor Author

jlhawn commented Dec 10, 2014

Obligatory usage example:

$ docker build -t jlhawn/busybox-squash --squash .
Sending build context to Docker daemon 21.54 MB
Sending build context to Docker daemon 
Step 0 : FROM scratch
 ---> f60c56784b83
Step 1 : MAINTAINER Jérôme Petazzoni <jerome@docker.com>
 ---> Using cache
 ---> a7b8b4122099
Step 2 : ADD rootfs.tar /
 ---> Using cache
 ---> a936027c5ca8
Step 3 : CMD /bin/sh
 ---> Using cache
 ---> 5785b62b697b
Squashing image ID "5785b62b697b99a5af6cd5d0aabc804d5748abbb6d3d07da5d1d3795f2dcc83e" up to "f60c56784b832dd990022afc120b8136ab3da9528094752ae13fe63a2d28dc8c"
Successfully built 8a112083c04a

@cpuguy83
Member

I think I would rather see this in docker push, where the image is squashed up to (but not including) its parent before pushing, rather than storing 2 local copies of the image.

@SvenDowideit
Contributor

@jlhawn you're going to write the docs for this?

@jlhawn
Contributor Author

jlhawn commented Dec 16, 2014

@SvenDowideit I'll write up the docs for this, but I want to make sure people are agreeable to the feature the way I've implemented it before I continue working on it.

@jlhawn
Contributor Author

jlhawn commented Dec 16, 2014

@nathanleclaire @tiborvass @huslage @crosbymichael @jfrazelle and others - what are your thoughts on this implementation of squashing build layers?

@SvenDowideit
Contributor

to counterpoint @cpuguy83 's comment - I don't docker push much, but there are times when I would still like to have a squash happen locally - and segmenting up my Dockerfiles so that each could be a candidate for squashing will work for me. (as I then chain them using Makefiles)

@cpuguy83
Member

@SvenDowideit Well, we all push frequently, just maybe not directly.
Automated builds and library builds would be squashed on push.

@tianon
Member

tianon commented Dec 17, 2014

I'm kind of a fan of this solution, specifically because you addressed my concerns about squashing: I love the Docker cache, and make heavy use of it.

Now, I can't pull the squashed image and have it be a cache-fill, but that's an obvious problem we'd have to run into with this (no way around it).

@yosifkit
Contributor

@cpuguy83, I think moving the squash to the push code would make an already unbearably slow process even slower (all official images: 914m29.924s, possibly improved recently). Adding what is really a build step to the push code seems wrong.

@jlhawn, how would this new squashed layer work with the docker cache, meaning if I build 30 times with no changes will I get 30 different merged layers created? This is especially important in the context of the Official Builds since all rebuilds are controlled through docker caching of build steps.

@jlhawn
Contributor Author

jlhawn commented Dec 17, 2014

@yosifkit you bring up a good point! This squashed layer is currently not being cached - even if all the build steps are cached, adding the --squash flag would result in a completely new layer each time. It should be pretty easy for me to make it cacheable though, using a process similar to the one the builder uses for every other layer.

@yosifkit
Contributor

As long as it is cacheable too, that works for Official Images. 😄

On the other hand, I am not certain we would even want to squash the Official Images, since people already complain about the huge 1.3GB layer in golang:cross. This would also prevent multiple tags of the same image from sharing any of their layers.

@jlhawn
Contributor Author

jlhawn commented Dec 17, 2014

@yosifkit how much of the golang:cross image's 1.3GB layer is lost in the virtual size of the image (i.e., does it ADD a bunch of stuff in one build step that is then removed later)? or how much of it is just stale artifacts from build steps which could be removed and squashed away?

@jlhawn
Contributor Author

jlhawn commented Dec 17, 2014

@yosifkit said:

This would also prevent multiple tags of the same image from sharing any of their layers.

If you know some of the images are sharing layers, then you could use an intermediate base image which squashes all shared layers into one.
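
As a sketch of how the shared part could be identified, assuming each image is described by its ordered list of build steps: the common prefix of two chains is what an intermediate base image would contain. The `shared_base` name and the example step strings are hypothetical.

```python
def shared_base(chain_a, chain_b):
    """Return the leading build steps that two images have in common."""
    shared = []
    for step_a, step_b in zip(chain_a, chain_b):
        if step_a != step_b:
            break
        shared.append(step_a)
    return shared


# Illustrative step lists (oldest first), not the real Dockerfiles.
image_one = ["FROM base", "RUN install common packages", "RUN build version A"]
image_two = ["FROM base", "RUN install common packages", "RUN build version B"]

common = shared_base(image_one, image_two)
```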

@yosifkit
Contributor

No, that is all just compiling go for all the architectures in one RUN:

RUN cd /usr/src/go/src \
    && set -ex \
    && for platform in $GOLANG_CROSSPLATFORMS; do \
        GOOS=${platform%/*} \
        GOARCH=${platform##*/} \
        ./make.bash --no-clean 2>&1; \
    done

I am mostly talking about sharing within the same image, like golang 1.3 and 1.4 both starting with:

RUN apt-get update && apt-get install -y \
        ca-certificates curl gcc libc6-dev make \
        bzr git mercurial \
        --no-install-recommends \
    && rm -rf /var/lib/apt/lists/*

It seems like too much overhead to make another image just for the common parts of the golang images, just to keep the one layer (possibly every image would each need a common base so that layers could still be common between versions).

@vincentwoo
Contributor

👍 would love to see docker-native support layer squashing

@rhatdan
Contributor

rhatdan commented Jan 30, 2015

We are seeing renewed interest in squashing with our customers and within Red Hat. We are seeing an explosion of layers, and as we add the LABEL construct it is going to get worse. Can we move this along?

@vincentwoo
Contributor

@rhatdan you understand that your hat in the photo is yellow right

@rhatdan
Contributor

rhatdan commented Jan 30, 2015

That is my OpenSource Hat.

@thaJeztah
Member

you understand that your hat in the photo is yellow right

😂 @vincentwoo that actually made me laugh :)

@anandkumarpatel
Contributor

+1

@icecrime
Contributor

@jlhawn What do you want to do with this PR? I still believe squashing is desirable, but this is not moving, and I'm afraid rebasing is going to be hell. Let us know!

@jlhawn
Contributor Author

jlhawn commented Mar 17, 2015

@icecrime I think I'll close this PR for now. I want to try another approach that doesn't overload the build command.

@vincentwoo
Contributor

@jlhawn is there a spiritual successor to this on the way? I haven't seen any sort of messaging about whether the Docker team considers squashing to be worth doing, or where it is on the roadmap.

@jlhawn
Contributor Author

jlhawn commented Jun 13, 2015

@vincentwoo thanks for reminding me about this. Sorry, I've been very busy lately :(.

Even though the builder option didn't work out, it shouldn't be too much work for me to port this over to a standalone command like:

docker squash [-t <new_tag>] <img> [<ancestor_img>]

I'll let you know when I have something.
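
For anyone wondering how the arguments in that proposed syntax would be read, here is a small sketch that models only the argument shape with Python's `argparse`. It is not Docker code, and the `build_parser` name and the `new_tag` destination are assumptions.

```python
import argparse


def build_parser():
    """Model the proposed form: docker squash [-t <new_tag>] <img> [<ancestor_img>]."""
    parser = argparse.ArgumentParser(prog="docker squash")
    parser.add_argument("-t", dest="new_tag", default=None,
                        help="optional tag for the squashed image")
    parser.add_argument("img", help="image to squash")
    # Optional positional: when omitted it stays None, and what that
    # should mean is left open in the proposal.
    parser.add_argument("ancestor_img", nargs="?", default=None,
                        help="ancestor image to squash up to")
    return parser


args = build_parser().parse_args(
    ["-t", "jlhawn/busybox-squash", "5785b62b697b", "f60c56784b83"]
)
```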

@jlhawn
Contributor Author

jlhawn commented Jun 14, 2015

@cpuguy83 @SvenDowideit @tianon @yosifkit @vincentwoo @rhatdan @thaJeztah @anandkumarpatel @icecrime if #13929 doesn't make people happy then I don't know what will. 😛

@jlhawn jlhawn deleted the build_squash_cache branch July 31, 2015 18:04