Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the image platform to the images/json endpoint #45735

Closed
wants to merge 1 commit into from

Conversation

rumpl
Copy link
Member

@rumpl rumpl commented Jun 13, 2023

- What I did

This is an attempt to move #44582 forward.

This change adds the platform to the ImageSummary struct and in the response to GET images/json.

We had a quick discussion with @thaJeztah about the platform. There are basically two ways we could do this:

  • add the platform (like in this change)
  • flip the representation on its head to match more how manifests/manifest lists work

The first one is easy and backwards compatible (i.e. it doesn't break old CLIs). Flipping the representation on its head would mean that we the images/json endpoint would return a slice of different objects and I'm not sure we should do this.

Even if we want to change the representation of an image in the API we could use this change IMO.

Should we gate this behind a version check? I didn't add any because it's backwards compatible

Relates to #44582

- How to verify it

Run docker images (nothing changes, listing images should still work):

$ docker images
REPOSITORY    TAG       IMAGE ID       CREATED      SIZE
alpine        latest    02bb6f428431   3 days ago   12.7MB
hello-world   latest    fc6cf906cbfa   4 days ago   20.8kB
hello-world   latest    fc6cf906cbfa   4 days ago   17.5kB
test          latest    bd44db8bb70d   5 days ago   1.13kB
rumpl/test    latest    da2ffa013983   5 days ago   1.13kB

Or with curl:

$ curl --unix /.... http://localhost/images/json | jq
[
  {
    "Containers": -1,
    "Created": 1686325187,
    "Id": "sha256:02bb6f428431fbc2809c5d1b41eab5a68350194fb508869a33cb1af4444c9b11",
    "Labels": null,
    "ParentId": "",
    "RepoDigests": [
      "docker.io/library/alpine@sha256:02bb6f428431fbc2809c5d1b41eab5a68350194fb508869a33cb1af4444c9b11"
    ],
    "RepoTags": [
      "docker.io/library/alpine:latest"
    ],
    "SharedSize": -1,
    "Size": 12716511,
    "Platform": {
      "architecture": "arm64",
      "os": "linux",
      "variant": "v8"
    }
  },
  {
    "Containers": -1,
    "Created": 1686215273,
    "Id": "sha256:fc6cf906cbfa013e80938cdf0bb199fbdbb86d6e3e013783e5a766f50f5dbce0",
    "Labels": null,
    "ParentId": "",
    "RepoDigests": [
      "docker.io/library/hello-world@sha256:fc6cf906cbfa013e80938cdf0bb199fbdbb86d6e3e013783e5a766f50f5dbce0"
    ],
    "RepoTags": [
      "docker.io/library/hello-world:latest"
    ],
    "SharedSize": -1,
    "Size": 20836,
    "Platform": {
      "architecture": "amd64",
      "os": "linux"
    }
  },
  {
    "Containers": -1,
    "Created": 1686215273,
    "Id": "sha256:fc6cf906cbfa013e80938cdf0bb199fbdbb86d6e3e013783e5a766f50f5dbce0",
    "Labels": null,
    "ParentId": "",
    "RepoDigests": [
      "docker.io/library/hello-world@sha256:fc6cf906cbfa013e80938cdf0bb199fbdbb86d6e3e013783e5a766f50f5dbce0"
    ],
    "RepoTags": [
      "docker.io/library/hello-world:latest"
    ],
    "SharedSize": -1,
    "Size": 17488,
    "Platform": {
      "architecture": "arm64",
      "os": "linux",
      "variant": "v8"
    }
  },
  {
    "Containers": -1,
    "Created": 1686145234,
    "Id": "sha256:bd44db8bb70d20e6d653a00fd1b6710a826398621d2bb3de9d9c77ff1fdd968f",
    "Labels": null,
    "ParentId": "",
    "RepoDigests": [
      "docker.io/library/test@sha256:bd44db8bb70d20e6d653a00fd1b6710a826398621d2bb3de9d9c77ff1fdd968f"
    ],
    "RepoTags": [
      "docker.io/library/test:latest"
    ],
    "SharedSize": -1,
    "Size": 1129,
    "Platform": {
      "architecture": "amd64",
      "os": "linux"
    }
  },
  {
    "Containers": -1,
    "Created": 1686145120,
    "Id": "sha256:da2ffa0139834461fa0efa258c40e60959a4e3f731ee91e7ccbccfe880ec5c2a",
    "Labels": null,
    "ParentId": "",
    "RepoDigests": [
      "docker.io/rumpl/test@sha256:da2ffa0139834461fa0efa258c40e60959a4e3f731ee91e7ccbccfe880ec5c2a"
    ],
    "RepoTags": [
      "docker.io/rumpl/test:latest"
    ],
    "SharedSize": -1,
    "Size": 1129,
    "Platform": {
      "architecture": "amd64",
      "os": "linux"
    }
  }
]

- Description for the changelog

- A picture of a cute animal (not mandatory but encouraged)

@rumpl rumpl added area/api area/images kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. impact/changelog labels Jun 13, 2023
@rumpl rumpl marked this pull request as draft June 13, 2023 08:35
@rumpl rumpl marked this pull request as ready for review June 13, 2023 08:47
@rumpl rumpl requested a review from tianon as a code owner June 13, 2023 08:47
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
// The image platform
//
// Required: false
Platform *ocispec.Platform `json:"Platform,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should look if we want the Platform embedded of an explicit sub-struct here.

i.e.;

{"Name":"myimage","Platform":{"architecture":"arm","os":"linux","variant":"v7"}}

or

{"Name":"myimage","architecture":"arm","os":"linux","variant":"v7"}

Embedding would also the fields to be accessed both through image.OS, image.Architecture (etc), as well as image.Platform, and in cases where we only need the platform, would allow to unmarshal directly into a Platform (see containerd/containerd#7376)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the confusing / wrong bit here is that the Image ID (with containerd integration) refers to the digest what we resolved the image to, which is the Manifest index (for multi-arch images), but the Platform now only contains a single Platform.

This also introduces the problem shown in your example, where images have the same name, same ID, but are not unique;

REPOSITORY    TAG       IMAGE ID       CREATED      SIZE
..
hello-world   latest    fc6cf906cbfa   4 days ago   20.8kB
hello-world   latest    fc6cf906cbfa   4 days ago   17.5kB
..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The id issue already exists but only for the containerd integration, we shouldn't talk about this here (I did try though and set the manifest Id but then searching for that manifest directly is not that easy; or at least I didn't want to take too much time doing that right now).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do embedding indeed, image inspect does give a JSON with an embedded platform.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about embedding it in the structure but still keep the Platform JSON property?

// The image platform
//
// Required: false
*ocispec.Platform `json:"Platform,omitempty"`

@thaJeztah
Copy link
Member

Should we gate this behind a version check? I didn't add any because it's backwards compatible

If it's easy to do, we probably should; the reason we don't return new fields (even though we have an "open schema", so the API is technically "allowed" to return additional fields), is to prevent users from expecting the field to be there (i.e., they try GET /v1.10/images/json on one machine, expect the platform information to be there, and on another machine it isn't, even though they're using the same API version on both.

@thaJeztah
Copy link
Member

Quick blurb as I'm about to head out, but we could do a "hybrid" approach;

image:
  id: <digest of manifest index or image (single arch)>
  name:
  images:
    - id: <digest of image variant>
      os: linux
      arch: arm64
    - id: <digest of  image variant>
      os: linux
      arch: amd64
    - id: <etcetera>

The "wrapper" image would show details for the default platform, but we can switch the information if --platform is specified, in which case images: could also be filtered to only have that variant (?)

Some "fun" things to think about; while usually the variants would differ in architecture and os, technically it's possible they all have different configs (different command, different env etc), so if we need to present that information, the "nested" (images) slice would potentially each have all details of each variant.

@rumpl
Copy link
Member Author

rumpl commented Jun 13, 2023

Quick blurb as I'm about to head out, but we could do a "hybrid" approach;

What would that look like in the case of a single-platform image?

@thaJeztah
Copy link
Member

What would that look like in the case of a single-platform image?

I guess we could go either way; duplicate the info in the slice, or keep the slice empty (nothing to see here; just present the one and only image)

I guess filling the slice would make sense though if we do want to show the platform info somewhere (and don't want to add that to the outer wrapper image, and older clients have no use for platform in the outer image)

@rumpl rumpl changed the title Add the image platform to the images/json endpoint Add the image platform to the images/json endpoint Jun 14, 2023
@tianon
Copy link
Member

tianon commented Jun 14, 2023

Surprisingly, I noticed these duplicated images entries for the first time today 😅

The first one is easy and backwards compatible (i.e. it doesn't break old CLIs). Flipping the representation on its head would mean that we the images/json endpoint would return a slice of different objects and I'm not sure we should do this.

I would argue that what we're doing now is actually not backwards compatible because we're returning multiple entries with the same exact name, which I'd argue was technically in the "undefined behavior" territory for the old implementation (where that would never happen unless there was a bug in the daemon, so wasn't something anyone would've been testing/expecting).

@tianon
Copy link
Member

tianon commented Jun 15, 2023

Because @neersighted told me it's useful, I'm attempting to write up a more concise version of some of the thoughts I shared during today's meeting. 🙇

At the high level, I'll reiterate what I've said above: I'm very opposed to the current behavior of returning multiple entries with the same name -- it's technically API compatible on the path/struct, but on non-containerd-integration that would strictly only happen when the daemon had a (serious) data integrity bug, so it is not unreasonable to imagine that the assumption that those are unique is implicitly embedded in other applications and thus by violating that assumption we have caused potential undefined behavior for those applications as well.

I'm not opposed to @thaJeztah's idea of embedding a new field that describes the "children" of a given entry in the image service (and then having the outer structure summarize the children as best as it can given the lossy/incongruent data model), but I do think it is really important that we think very carefully about the underlying OCI / image-spec and even containerd data models when designing that data structure so that we don't end up in the same situation we've had over the years with the current structure (crufty fields that no longer have meaning, confusing fields, confusion over how they map to the OCI spec fields, etc), and make sure we're making a conscious choice there around how to model objects that aren't necessarily 1-to-1 container images as we know and love them today (WASM bundles, SBOMs, signatures, etc), even if that choice is not to model them at all.

What we're essentially talking about is an expanded/"unwrapped" OCI index where all the children are expanded implicitly, but also include other extra metadata (size in the content store, size in each relevant snapshotter, etc). When talking about "size", I think users look to this API/CLI to see "how much space does this image take up" (which is a little fuzzy given layer sharing, but roughly accurate for a single image), and in that view this would also need to account for things that refer to this image (either the image index or any one of the image manifests inside), because those are things that should go away when the last gc-ref'd image they point to does (containerd/containerd#7654 (comment)).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/api area/images impact/changelog kind/enhancement Enhancements are not bugs or new features but can improve usability or performance.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants