Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Descriptors, the platform object, and the artifactType #1131

Open
neersighted opened this issue Sep 26, 2023 · 5 comments · May be fixed by #1141
Open

Descriptors, the platform object, and the artifactType #1131

neersighted opened this issue Sep 26, 2023 · 5 comments · May be fixed by #1141

Comments

@neersighted
Copy link

neersighted commented Sep 26, 2023

In the aftermath of #999/the removal of the artifact mediaType, I believe that the guidance for descriptors has become misleading.

Artifacts are now represented using the standard image manifest and mediaType:

image-spec/manifest.md

Lines 23 to 27 in 93f6e65

- **`mediaType`** *string*
This property SHOULD be used and [remain compatible](media-types.md#compatibility-matrix) with earlier versions of this specification and with other similar external formats.
When used, this field MUST contain the media type `application/vnd.oci.image.manifest.v1+json`.
This field usage differs from the [descriptor](descriptor.md#properties) use of `mediaType`.

When we had a separate mediaType/artifact manifest, the following line in the descriptor specification was informative, as it implies that "images" (read: runnable content) should have a platform, but "artifacts" (arbitrary content) may or may not:

Descriptors pointing to [`application/vnd.oci.image.manifest.v1+json`](manifest.md) SHOULD include the extended field `platform`, see [Image Index Property Descriptions](image-index.md#image-index-property-descriptions) for details.

However, now that there is one manifest mediaType, this advice appears to be rather unhelpful; all manifests are now the same type, so this implies that all manifests SHOULD have a platform in their descriptor.

In practice, this has lead to implementations including/believing they should include {"os": "unknown", "architecture": unknown"} when describing non-image content (artifacts), which I believe to be misguided/incorrect, as values for these fields SHOULD be understood by the Go toolchain:

- **`architecture`** *string*
This REQUIRED property specifies the CPU architecture.
Image indexes SHOULD use, and implementations SHOULD understand, values listed in the Go Language document for [`GOARCH`][go-environment2].
- **`os`** *string*
This REQUIRED property specifies the operating system.
Image indexes SHOULD use, and implementations SHOULD understand, values listed in the Go Language document for [`GOOS`][go-environment2].

I'd like to drop this line from the descriptor spec, or amend it to acknowledge artifactType, and the config.mediaType fallback (collectively "artifacts"); but between these options I have no strong preference.

@tianon
Copy link
Member

tianon commented Sep 26, 2023

At the risk of oversimplifying (please correct me if I've misunderstood!), you're proposing we loosen/tighten the language around platform to make sure it's clear it doesn't (have to) apply to non-image artifacts (even though they technically share the image mediaType)?

@neersighted
Copy link
Author

Yes, that's essentially my thought. When artifacts had a separate mediaType, they did not have a SHOULD for the platform; now that they share a mediaType, this applies.

I think the state before the separate mediaType/manifest definition was dropped made more sense, and that this line only ended up applying to artifacts by happenstance. I think usage of platform in the descriptor should be supported/encouraged where it makes sense, but requiring all artifacts to specify a "nonsense" platform (read: meaningless to the Go toolchain) doesn't make sense to me.

@tianon
Copy link
Member

tianon commented Sep 27, 2023

Yeah, that seems sensible IMO; any idea how runtimes like Docker would react to a index member that doesn't have a platform object? (Thinking of the case where the index author was not as careful as they should have been about order and put the artifacts before the relevant images that should match first/instead.)

@neersighted
Copy link
Author

It's definitely highly runtime-dependent, but I do think runtimes should be updating their checks to account for "artifacts" (artifactType/config.mediaType). However, you make a good point.


Examining the artifacts decision tree, and comparing with dockerd behavior:

image-spec/manifest.md

Lines 170 to 262 in 93f6e65

## Guidelines for Artifact Usage
Content other than OCI container images MAY be packaged using the image manifest.
When this is done, the `config.mediaType` value MUST be set to a value specific to the artifact type or the [empty value](#guidance-for-an-empty-descriptor).
If the `config.mediaType` is set to the empty value, the `artifactType` MUST be defined.
If the artifact does not need layers, a single layer SHOULD be included with a non-zero size.
The suggested content for an unused `layers` array is the [empty descriptor](#guidance-for-an-empty-descriptor).
The design of the artifact depends on what content is being packaged with the artifact.
The decision tree below and the associated examples MAY be used to design new artifacts:
1. Does the artifact consist of at least one file or blob?
If yes, continue to 2.
If no, specify the `artifactType`, and set the `config` and a single `layers` element to the empty descriptor value.
Here is an example of this with annotations included:
```json,title=Minimal%20artifact&mediatype=application/vnd.oci.image.manifest.v1%2Bjson
{
"schemaVersion": 2,
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"artifactType": "application/vnd.example+type",
"config": {
"mediaType": "application/vnd.oci.empty.v1+json",
"digest": "sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a",
"size": 2
},
"layers": [
{
"mediaType": "application/vnd.oci.empty.v1+json",
"digest": "sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a",
"size": 2
}
],
"annotations": {
"oci.opencontainers.image.created": "2023-01-02T03:04:05Z",
"com.example.data": "payload"
}
}
```
2. Does the artifact have additional JSON formatted metadata as configuration?
If yes, continue to 3.
If no, specify the `artifactType`, include the artifact in the `layers`, and set `config` to the empty descriptor value.
Here is an example of this with a single layer:
```json,title=Artifact%20without%20config&mediatype=application/vnd.oci.image.manifest.v1%2Bjson
{
"schemaVersion": 2,
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"artifactType": "application/vnd.example+type",
"config": {
"mediaType": "application/vnd.oci.empty.v1+json",
"digest": "sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a",
"size": 2
},
"layers": [
{
"mediaType": "application/vnd.example+type",
"digest": "sha256:e258d248fda94c63753607f7c4494ee0fcbe92f1a76bfdac795c9d84101eb317",
"size": 1234
}
]
}
```
3. For artifacts with a config blob, specify the `artifactType` to a common value for your artifact tooling, specify the `config` with the metadata for this artifact, and include the artifact in the `layers`.
Here is an example of this:
```json,title=Artifact%20with%20config&mediatype=application/vnd.oci.image.manifest.v1%2Bjson
{
"schemaVersion": 2,
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"artifactType": "application/vnd.example+type",
"config": {
"mediaType": "application/vnd.example.config.v1+json",
"digest": "sha256:5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03",
"size": 123
},
"layers": [
{
"mediaType": "application/vnd.example.data.v1.tar+gzip",
"digest": "sha256:e258d248fda94c63753607f7c4494ee0fcbe92f1a76bfdac795c9d84101eb317",
"size": 1234
}
]
}
```
_Implementers note:_ artifacts have historically been created without an `artifactType` field, and tooling to work with artifacts should fallback to the `config.mediaType` value.
[iana]: https://www.iana.org/assignments/media-types/media-types.xhtml
[rfc6838]: https://tools.ietf.org/html/rfc6838
[rfc6838-s4.2]: https://tools.ietf.org/html/rfc6838#section-4.2

In all three examples, config.mediaType is either application/vnd.oci.empty.v1+json, or a custom (vendor-specific) type:

During pull, the mediaType of the config descriptor in the manifest is checked:
https://github.com/moby/moby/blob/605c8fb75dcfcd5d2d79dce5a58b330a6d5165dc/distribution/pull_v2.go#L423-L425
https://github.com/moby/moby/blob/605c8fb75dcfcd5d2d79dce5a58b330a6d5165dc/distribution/pull_v2.go#L495-L513

The default list of allowed mediaTypes is:
https://github.com/moby/moby/blob/a1d966c492c9f3941e1b6627f3f1128e76a27a2d/distribution/registry.go#L40-L51

In all cases, this should fail early and not result in attempting to use an artifact as a runnable image.

We can see this testing with neersighted/artifact-demo:

$ docker pull neersighted/artifact-demo:latest
latest: Pulling from neersighted/artifact-demo
unsupported media type application/vnd.oci.empty.v1+json

With the containerd storage backend, there is an IsPsuedoImage check that should cover all cases based on the fact that it will see a empty or non-image layer "layer":

https://github.com/moby/moby/blob/605c8fb75dcfcd5d2d79dce5a58b330a6d5165dc/daemon/containerd/image_manifest.go#L81-L107

However, this is not where things fail; instead we see:

$ docker pull neersighted/artifact-demo:latest
latest: Pulling from neersighted/artifact-demo
failed to unpack image on snapshotter overlayfs: no match for platform in manifest sha256:a1fa4097c5355d6ca1656dbcf41e1a40e0763d054ff098f07064666f74f41186: not found

Caused by:
https://github.com/containerd/containerd/blob/e62cacc4d6704038fbddce9bebd281155dcb967e/images/image.go#L236-L242
Which is in turn due to this matching logic:
https://github.com/containerd/containerd/blob/e62cacc4d6704038fbddce9bebd281155dcb967e/images/image.go#L170-L177

Which ultimately does not match, as a requested platform does not match a nil platform in image, even though a nil requested platform matches the current platform in the image.

I am not sure if IsPsuedoImage should be called to provide a better error message, or if failing on the containerd side is better; given that this functionality is a work-in-progress, I am not too concerned, however. If this is the correct place to centralize this logic, it should probably be expanded to check e.g. artifactType, or at least the config mediaType.

@neersighted
Copy link
Author

Discussed yesterday in the OCI call: No one had any strong objections to the concerns I've raised here, and further inconsistencies related to the artifact mediaType and the config.mediaType fallback were noted. I'll work on a PR this next week to address these concerns + those additional ones, erring in the direction of "refer to Artifacts as a proper noun, and better explain what Artifacts and Images are in a follow-up."

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants