
mod initialization times out #575

Closed
ReenigneArcher opened this issue Feb 5, 2023 · 12 comments · Fixed by #577
Labels: bug, enhancement

@ReenigneArcher (Contributor) commented Feb 5, 2023

I have spent a lot of time diagnosing this issue, so I will try to be as detailed as possible.

I develop 3 docker mods. Two are plugins for Plex, and one is a plugin for Jellyfin.

In the past day, one user of one of my Plex plugins and one user of the Jellyfin plugin have reported that they cannot install the mod; they get an error like this:

2023-02-04 22:49:07 [mod-init] Attempting to run Docker Modification Logic
2023-02-04 22:49:07 [mod-init] Applying lizardbyte/themerr-plex:nightly files to container
2023-02-04 22:50:12 tar (child): /modtarball.tar.xz: Cannot open: No such file or directory
2023-02-04 22:50:12 tar (child): Error is not recoverable: exiting now
2023-02-04 22:50:12 tar: Child returned status 2
2023-02-04 22:50:12 tar: Error is not recoverable: exiting now
2023-02-04 22:50:12 cp: cannot stat '/tmp/mod/*': No such file or directory
2023-02-04 22:50:12 [mod-init] lizardbyte/themerr-plex:nightly applied to container
2023-02-04 22:50:12 [migrations] started
2023-02-04 22:50:12 [migrations] no migrations found

I'm also able to replicate the issue, so I dug in a bit using the image from the logs above. Here is what I found.

| tag    | docker registry result | ghcr registry result |
| ------ | ---------------------- | -------------------- |
| v0.0.1 | works                  | fails                |
| v0.0.2 | works                  | fails                |
| v0.0.3 | works                  | fails                |
| v0.0.4 | works                  | fails                |
| v0.0.5 | works                  | fails                |
| v0.0.6 | works                  | fails                |
| v0.0.7 | works                  | fails                |
| v0.0.8 | works                  | fails                |
| v0.1.0 | works                  | fails                |
| v0.1.1 | fails                  | fails                |
| v0.1.2 | fails                  | fails                |
| v0.1.3 | fails                  | fails                |
There are possibly two separate issues going on here, since every image on ghcr is failing, even though I can pull those images directly with docker pull ghcr.io/lizardbyte/themerr-plex:<tag>.

There was no change in the structure of the image I publish between v0.1.0 and v0.1.1. This is the commit where it starts to break on Docker Hub: LizardByte/Themerr-plex@7595dcc (note: the CI file modified in that commit is not the one responsible for publishing to Docker).

Additionally, the user reported that they could install v0.0.7 and v0.0.8, but not v0.1.0 or anything after it, which is slightly different from my findings.

Anyway, digging into the logic of the docker mod script, this is the responsible part of the code:

        echo "[mod-init] Applying ${DOCKER_MOD} files to container"
        # Get Dockerhub token for api operations
        TOKEN="$(
            curl -f --retry 10 --retry-max-time 60 --retry-connrefused \
                --silent \
                --header 'GET' \
                "${AUTH_URL}" |
                jq -r '.token'
        )"
        # Determine first and only layer of image
        SHALAYER=$(get_blob_sha "${MODE}" "${TOKEN}" "${MANIFEST_URL}")
        # Check if we have already applied this layer
        if [[ -f "/${FILENAME}" ]] && [[ "${SHALAYER}" == "$(cat /"${FILENAME}")" ]]; then
            echo "[mod-init] ${DOCKER_MOD} at ${SHALAYER} has been previously applied skipping"
        else
            # Download and extract layer to /
            curl -f --retry 10 --retry-max-time 60 --retry-all-errors \
                --silent \
                --location \
                --request GET \
                --header "Authorization: Bearer ${TOKEN}" \
                "${BLOB_URL}${SHALAYER}" -o \
                /modtarball.tar.xz
            mkdir -p /tmp/mod
            tar xzf /modtarball.tar.xz -C /tmp/mod
            if [[ -d /tmp/mod/etc/s6-overlay ]]; then
                if [[ -d /tmp/mod/etc/cont-init.d ]]; then
                    rm -rf /tmp/mod/etc/cont-init.d
                fi
                if [[ -d /tmp/mod/etc/services.d ]]; then
                    rm -rf /tmp/mod/etc/services.d
                fi
            fi
            shopt -s dotglob
            cp -R /tmp/mod/* /
            shopt -u dotglob
            rm -rf /tmp/mod
            rm -rf /modtarball.tar.xz
            echo "${SHALAYER}" >"/${FILENAME}"
            echo "[mod-init] ${DOCKER_MOD} applied to container"
        fi

The second curl command seems to be failing, and there is no error handling for it.
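
For reference, you can inspect what the registry actually returns by hand. Here is a rough sketch using Docker Hub's anonymous token flow (the repo and tag are just the ones from my testing):

    # Get an anonymous pull token for the repo
    TOKEN=$(curl -fsSL "https://auth.docker.io/token?service=registry.docker.io&scope=repository:lizardbyte/themerr-plex:pull" | jq -r '.token')

    # Fetch the manifest, advertising the common mediatypes, and see which one comes back
    curl -fsSL \
        --header "Authorization: Bearer ${TOKEN}" \
        --header "Accept: application/vnd.docker.distribution.manifest.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.oci.image.index.v1+json" \
        "https://registry-1.docker.io/v2/lizardbyte/themerr-plex/manifests/v0.1.1" |
        jq '.mediaType'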

These issues also seem related, but don't go into much detail.

@thespad (Member) commented Feb 5, 2023

Short version: your mods are breaking for several different reasons.

Your GHCR mods contain multi-arch manifests, which is why they've never worked, as our mod logic assumes a single manifest record. Why only your GHCR images (until v0.1.1) contain multi-arch manifests and not the DH ones, I can't tell you.

Your DH and GHCR mods post v0.1.1 are using the application/vnd.oci.image.manifest.v1+json mediatype instead of application/vnd.docker.distribution.manifest.v2+json, and include attestation layers in the manifest, both of which are likely buildkit-related changes.

At the moment our mod logic doesn't set an Accept header for application/vnd.oci.image.manifest.v1+json, so it will never receive those manifests. We'll need to get that changed, because this is going to become a common situation going forward.
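
The fix is roughly a matter of advertising both mediatypes when fetching the manifest, something like this sketch (an illustration of the idea, not the actual change that landed):

    # Sketch: advertise both the legacy Docker and the newer OCI manifest mediatypes
    curl -f --retry 10 --retry-max-time 60 --retry-connrefused \
        --silent \
        --header "Authorization: Bearer ${TOKEN}" \
        --header "Accept: application/vnd.docker.distribution.manifest.v2+json, application/vnd.oci.image.manifest.v1+json" \
        "${MANIFEST_URL}"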

@ReenigneArcher (Contributor, Author)

Thanks for the answer.

The Docker Hub containers are multi-arch as well, though. Images for both registries are built at the same time and pushed to each registry using Docker's GitHub actions.

[screenshots: multi-arch tag listings on Docker Hub and GHCR]

@ReenigneArcher (Contributor, Author)

Thanks to your comment, I found some info on the manifest changes here: docker/build-push-action#755

@thespad (Member) commented Feb 5, 2023

I'm going to dig into properly supporting more complex manifests, but it's not going to be an overnight change. The logic is a lot more complicated than what we currently do for mods, and despite supposedly supporting the same API, DH and GHCR do things differently enough to need different flows for each.

@thespad thespad self-assigned this Feb 5, 2023
@thespad thespad added enhancement New feature or request bug Something isn't working labels Feb 5, 2023
@ReenigneArcher (Contributor, Author)

I'd prefer not to change the provenance setting, so I'll follow this issue for updates.

One final question, though: does (or can) the mod logic handle selecting the correct architecture of the mod image based on the base image's arch?

I don't currently have anything in the images that is arch-specific, but I don't know whether that will change in the future when some dependency is added. I mostly set them to multi-arch to match the arches of the base images, although I can set a single arch for now if needed.

@thespad (Member) commented Feb 5, 2023

At the moment, no, but it's something I'll look at as part of this. It's extremely fiddly, though, because of how the manifests express the arch, e.g.:

      "platform": {
        "architecture": "amd64",
        "os": "linux"
      }

for amd64, but

      "platform": {
        "architecture": "arm",
        "os": "linux",
        "variant": "v7"
      }

for arm32, and then

      "platform": {
        "architecture": "unknown",
        "os": "unknown"
      }

for the attestation layers.
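
Picking the right entry out of an index ends up being something like this jq sketch (the filename and arch value are illustrative):

    # Sketch: select the digest for a given arch from a saved manifest list / OCI index.
    # The "unknown/unknown" attestation entries drop out automatically, because their
    # architecture never matches a real one.
    jq -r --arg arch "amd64" '
        .manifests[]
        | select(.platform.architecture == $arch and .platform.os == "linux")
        | .digest
    ' index.json

And arm32 would additionally need to match .platform.variant (e.g. "v7"), which is part of what makes it fiddly.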

@aptalca (Member) commented Feb 5, 2023

Just to add: our mod implementation is all about storing and retrieving a single tarball (we use the registries merely as a storage platform).

It therefore only supports a single image with a single layer, and thus a single tarball: no manifest lists, multi-arch, or attestation layers.

I'd highly recommend using our method of building and pushing mods instead of using third-party actions.
Here's the basic workflow: https://github.com/linuxserver/docker-mods/blob/template/.github/workflows/BuildImage.yml

And here's an example for a versioned push: https://github.com/linuxserver/docker-mods/blob/universal-cloudflared/.github/workflows/BuildImage.yml

And here's one with semver: https://github.com/linuxserver/docker-mods/blob/code-server-golang/.github/workflows/BuildImage.yml

They all involve a simple docker build, docker tag, and docker push.
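
In its simplest form that flow is just the following (tags are illustrative; a plain docker build like this should produce a single-arch image with a single manifest and no attestation layers):

    # Build once, tag for both registries, push to each
    docker build -t lizardbyte/themerr-plex:v0.1.3 .
    docker tag lizardbyte/themerr-plex:v0.1.3 ghcr.io/lizardbyte/themerr-plex:v0.1.3
    docker push lizardbyte/themerr-plex:v0.1.3
    docker push ghcr.io/lizardbyte/themerr-plex:v0.1.3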

@thespad thespad linked a pull request (#577) Feb 6, 2023 that will close this issue
@thespad (Member) commented Feb 7, 2023

Note that the above PR will take days/weeks to flow into downstream images, so don't expect anything immediate.

It also doesn't address multi-arch manifests right now; it'll just grab the first manifest entry, which will usually be amd64 but isn't required to be.

@ReenigneArcher (Contributor, Author)

Thanks! I'll periodically check the base images for updates. Much appreciated!

@ReenigneArcher (Contributor, Author)

> Note that the above PR will take days/weeks to flow into downstream images so don't expect anything immediate.

How do I know when the images have been updated with this change? I've seen commits to both the Plex and Jellyfin images, but I don't see any reference to mods in those commits. I'm not sure whether I would even see a commit in those images, or how your build process is set up.

@thespad (Member) commented Feb 19, 2023

Yeah, it's because the changes are to the mods script, which gets pulled into the base images, not the downstream ones. The simplest check is docker exec foo grep "MULTIDIGEST" /docker-mods; if that string exists, the container is using the updated version of the mods script.
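
For example (with "plex" standing in for whatever your container is actually named):

    # grep's exit status propagates through docker exec
    docker exec plex grep -q "MULTIDIGEST" /docker-mods \
        && echo "updated mods script" \
        || echo "old mods script"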

Most images will have been rebuilt with the updated bases by now, but there will be some edge cases, especially for images with lots of intermediate bases like the rdesktop ones.

@aptalca (Member) commented May 21, 2023

This should have waterfalled to all images by now. Closing.

@aptalca aptalca closed this as completed May 21, 2023