Whiteouts in layers #2

Closed
shykes opened this issue Jan 21, 2013 · 5 comments

Comments

@shykes
Contributor

shykes commented Jan 21, 2013

Will layers store whiteouts (a list of files to remove) in addition to added files?

If I remove /foo in a container, and export those changes into a new layer, what happens when I run a new container using that layer? Either a) /foo will be present (whiteouts are not stored in the layer), or b) /foo will not be present (whiteouts are stored in the layer).
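The two outcomes can be made concrete with a toy model. This is only a sketch: the layer representation (a dict with `files` and `whiteouts` keys) and the `flatten` helper are hypothetical and do not reflect any actual layer format.

```python
# Hypothetical model: a layer is a dict of added files plus a set of whiteouts.
def flatten(layers, honor_whiteouts):
    """Merge layers bottom-to-top into one path -> content mapping."""
    fs = {}
    for layer in layers:
        if honor_whiteouts:
            for path in layer.get("whiteouts", ()):
                fs.pop(path, None)  # the deletion is recorded in the layer
        fs.update(layer.get("files", {}))
    return fs

base = {"files": {"/foo": "sample", "/bar": "data"}}
derived = {"files": {"/baz": "new"}, "whiteouts": {"/foo"}}

# Option (a): whiteouts are not stored, so /foo reappears.
print(sorted(flatten([base, derived], honor_whiteouts=False)))
# Option (b): whiteouts are stored, so /foo stays removed.
print(sorted(flatten([base, derived], honor_whiteouts=True)))
```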

Question 1: are whiteouts needed? Are certain use cases only possible with whiteouts?

Question 2: if we do need whiteouts in layers, what format should we use? We could use AUFS's convention, effectively making AUFS part of the API instead of just an implementation detail. Do we want that?
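For reference, AUFS records a deletion as a marker file named `.wh.<name>` in the directory where the file used to be. A minimal sketch of how a layer consumer might read that convention from a list of archive member names follows. The `split_whiteouts` helper is hypothetical, and the sketch ignores AUFS's other marker types (such as opaque-directory markers).

```python
import posixpath

WHITEOUT_PREFIX = ".wh."  # AUFS marker-file prefix

def split_whiteouts(member_names):
    """Split archive member names into (regular files, paths to delete)."""
    files, deleted = [], []
    for name in member_names:
        directory, base = posixpath.split(name)
        if base.startswith(WHITEOUT_PREFIX):
            # ".wh.foo" in a directory means "foo" was removed from it.
            deleted.append(posixpath.join(directory, base[len(WHITEOUT_PREFIX):]))
        else:
            files.append(name)
    return files, deleted

files, deleted = split_whiteouts(["etc/motd", "srv/.wh.sample.py", ".wh.foo"])
print(files)    # regular members to extract
print(deleted)  # paths the layer removes
```

Adopting this naming in the layer format is exactly what would make the AUFS convention part of the API.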

@jpetazzo
Contributor

There might be (at least) one scenario where you want to have whiteouts: if you ship a layer including some sample code, you might want to remove it (or overwrite it with something else) when you derive the layer.

On the current platform, we don't use whiteouts in that case, because this data is on a separate volume (therefore not requiring a whiteout).

There are also special cases of whiteouts: permission changes. Those can be useful when you want to give access to some files, or remove access. This was a frequent problem on the current platform (e.g. readability of logs when daemons are installed using system packages instead of the default user).
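To illustrate why permission changes are a special case, here is a rough sketch of a snapshot diff that distinguishes metadata-only changes from deletions. The snapshot model (path mapped to a `(content, mode)` tuple), the paths, and the `diff_snapshots` helper are all assumptions made for illustration.

```python
def diff_snapshots(base, current):
    """Classify changes between two snapshots of path -> (content, mode)."""
    changes = {}
    for path in base.keys() | current.keys():
        if path not in current:
            changes[path] = "deleted"       # would need a whiteout
        elif path not in base:
            changes[path] = "added"
        elif base[path][0] != current[path][0]:
            changes[path] = "modified"
        elif base[path][1] != current[path][1]:
            changes[path] = "mode-changed"  # metadata only, content untouched
    return changes

base = {"/var/log/daemon.log": ("", 0o600), "/srv/sample.py": ("print('hi')", 0o644)}
current = {"/var/log/daemon.log": ("", 0o644)}

changes = diff_snapshots(base, current)
print(changes)
```

A layer format that stores only added files has nowhere to put either the deletion or the mode change.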

There is no convention in AUFS—just an opaque format. I'd personally advocate against making that part of the API, since it effectively ties the implementation with AUFS, which is probably a Very Bad Thing, but YMMV.

Last but not least, in some cases (mail servers), you might also want to preserve inode numbers—and in that case, you would have to also preserve the xino translation file of AUFS.

... Unless we want to support bind-mounted volumes; then those concerns are waived.

@shykes
Contributor Author

shykes commented Jan 21, 2013

On Sun, Jan 20, 2013 at 10:57 PM, jpetazzo notifications@github.com wrote:

> There might be (at least) one scenario where you want to have whiteouts:
> if you ship a layer including some sample code, you might want to remove it
> (or overwrite it with something else) when you derive the layer.

Could you walk me through an example use case in more detail?

> There are also special cases of whiteouts: permission changes. Those can
> be useful when you want to give access to some files, or remove access.
> This was a frequent problem on the current platform (e.g. readability of
> logs when daemons are installed using system packages instead of the
> default user).

Same question as above.

> Last but not least, in some cases (mail servers), you might also want to
> preserve inode numbers—and in that case, you would have to also preserve
> the xino translation file of AUFS.

Same question as above.

Thanks!

@shykes
Contributor Author

shykes commented Jan 21, 2013

On Sun, Jan 20, 2013 at 10:57 PM, jpetazzo notifications@github.com wrote:

> There is no convention in AUFS—just an opaque format. I'd personally
> advocate against making that part of the API, since it effectively ties the
> implementation with AUFS, which is probably a Very Bad Thing, but YMMV.

We are in agreement here. Frankly, sticking to vanilla tarballs as the
image format is something I would like to preserve if at all possible. I know
Andrea feels the same way.

@jpetazzo
Contributor

  1. Whiteouts Needed To Remove Sample Code.
    The current implementation of the dotCloud service ships with some sample code. dotCloud users never see it, since their code replaces it immediately after container deployment (before the container gets a chance to see traffic), unless code deployment fails, in which case the container is rolled back to its previous version, which happens to be the pristine version of the cloudlet. Of course, we could argue that this sample code could be removed from the base layer in the first place; but it's there for a good reason: to help the service developer ensure that the service actually works.
  2. Whiteouts Needed To Preserve Permission Changes.
    For a very long time, the dotCloud Java service was plagued by the fact that its logs were readable only by the jetty user, not by the dotCloud user. The files in the base layer didn't have to be modified, but their permissions did.
  3. Preserving Inode Numbers.
    http://grox.net/doc/postfix/html/faq.html#copying
    "Postfix names a queue file after its inode number and after the microsecond part of the time of day. Thus, if a queue file has a name based on someone elses inode number there is a small chance that the file name will collide with another queue file."
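
The inode concern in the third point can be reproduced with a plain file copy. This is a sketch only: the file names are made up, and it has nothing to do with Postfix's actual code. It shows that a copy preserves content but gets a new inode number, so any name derived from the old inode no longer matches.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "queue_file")
    with open(original, "w") as f:
        f.write("message")

    copy = os.path.join(tmp, "queue_file.copy")
    shutil.copy2(original, copy)  # preserves content, mode, and timestamps

    original_inode = os.stat(original).st_ino
    copy_inode = os.stat(copy).st_ino
    with open(copy) as f:
        same_content = f.read() == "message"

print(original_inode != copy_inode)  # the copy lives in a different inode
```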

@shykes
Contributor Author

shykes commented Jan 27, 2013

Ok, after a few out-of-band discussions, the consensus seems to be that whiteouts and layers are (very) useful tools, but should not be exposed as top-level objects to the end user. The main reason is that doing so would make things unnecessarily complicated: every user would need out-of-band knowledge of which layer works with which, and it is very easy to shoot yourself in the foot by combining layers that were not meant to be combined. At the same time, there is not much benefit to allowing arbitrary combinations of layers: even if you are 99% sure your layer can be safely "rebased" onto a new base layer, why not run the build script again just to make sure you're right?

So, instead, let's use layers (+ their associated whiteouts) as an optimization for storing and transferring images. In other words, the payload for docker is always a tarball. But it may optionally be a sparse tarball, which references the bottom layers by ID and only stores changes. Of course, these sparse tarballs will only be usable by a receiver capable of (a) decoding the sparse format and (b) retrieving the bottom layers by their ID.
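
A rough sketch of what such a sparse tarball could look like, using Python's `tarfile` module. Everything about the format here is an assumption made for illustration: the `.layer.json` metadata member, its `parent` and `removed` fields, and the `pack`/`unpack` helpers are hypothetical, not an actual docker format.

```python
import io
import json
import tarfile

META = ".layer.json"  # hypothetical metadata member, not a real docker format

def pack(parent_id, files, removed):
    """Build a sparse tarball holding only changes relative to parent_id."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        members = dict(files)
        members[META] = json.dumps({"parent": parent_id, "removed": sorted(removed)})
        for name, text in members.items():
            data = text.encode()
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def unpack(blob, store):
    """Rebuild the full file tree; store maps layer ID -> tarball bytes."""
    with tarfile.open(fileobj=io.BytesIO(blob)) as tar:
        entries = {m.name: tar.extractfile(m).read().decode() for m in tar.getmembers()}
    meta = json.loads(entries.pop(META))
    # A receiver that cannot resolve the parent ID cannot use the sparse tarball.
    tree = unpack(store[meta["parent"]], store) if meta["parent"] else {}
    for path in meta["removed"]:
        tree.pop(path, None)  # apply the whiteouts
    tree.update(entries)      # then apply added or changed files
    return tree

store = {"base": pack(None, {"foo": "sample", "bar": "data"}, [])}
sparse = pack("base", {"baz": "new"}, ["foo"])
full = unpack(sparse, store)
print(full)
```

In this sketch, the lookup in `store` stands in for requirement (b): a receiver without the bottom layer fails with a `KeyError` instead of producing a partial image.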

@shykes shykes closed this as completed Feb 25, 2013