"cacheFrom" value not considered when building dev container features #153
I am wondering if this line is invalidating the cache:

It's possible … https://docs.docker.com/engine/reference/builder/#copy---link
Not sure. Was the image you used as …
I quickly put together a workflow that uses the CI action across two jobs to test the image caching that Chuck mentioned last night.

Workflow: https://github.com/stuartleeks/actions-playground/blob/main/.github/workflows/devcontainers-ci-cache-test.yml
Run: https://github.com/stuartleeks/actions-playground/runs/8129339831?check_suite_focus=true

Notes:
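For anyone skimming, the shape of a two-job caching test like the one above is roughly the following. This is a sketch only: the image name is made up, and the action inputs and versions should be checked against the devcontainers/ci documentation and the linked workflow.

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Build the dev container and push the image
        uses: devcontainers/ci@v0.2
        with:
          imageName: ghcr.io/example/devcontainer-cache   # hypothetical image name
          push: always
  rebuild:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Rebuild on a fresh runner, using the pushed image as a cache
        uses: devcontainers/ci@v0.2
        with:
          cacheFrom: ghcr.io/example/devcontainer-cache   # should reuse the pushed layers
          push: never
```

The second job runs on a clean machine, so any layer it rebuilds (rather than pulling from the cache image) shows up clearly in the logs.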
@chrmarti This is the dev container in https://github.com/chuxel/feature-library. Here's the Actions workflow run for it: https://github.com/Chuxel/feature-library/runs/8125187956?check_suite_focus=true

What I am doing is testing how quick builds are when just using "cacheFrom" with a pre-built image (rather than recommending people have a separate devcontainer.json with an image reference in all cases). So it's the exact same devcontainer.json and Dockerfile contents in both cases, from the same repo. Actions is using 0.14.1; Codespaces is likely on an earlier version. However, I also see this in the latest VS Code Insiders build. Note that I am using an OCI Artifact reference here.
@stuartleeks Are you also seeing the caching work when you open a dev container locally in Remote - Containers for VS Code Insiders, with the cache for the feature install working? That's what I'm specifically seeing not work. I used "latest" for the tag. Also, to be crystal clear: the Dockerfile part of the build is cached; it's specifically the feature install that isn't. And what do you see if you use the docker-in-docker feature via ghcr.io specifically? That's the scenario here; it's possible this doesn't happen when you reference the feature the old way.
Findings from my investigation using the terraform feature so far:
Got it: In @Chuxel's sample (where the user's Dockerfile does not copy files to the image), running the CLI locally with umask 0022 makes …
@chrmarti Yeah, not sure, though Windows builds would be done in WSL or a VM (ditto for Mac), so it may work. The umask is a bit interesting: Linux actually defaults to 0022, so I wonder why it is 0002 here. Maybe we can change that. If we can't change this on the host, the most recent version of BuildKit does seem to support --chmod (in addition to --chown) in COPY: moby/moby#34819
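As a sketch of what that COPY --chmod option looks like in practice (the paths here are made up for illustration; this is not the CLI's actual generated Dockerfile):

```dockerfile
# syntax=docker/dockerfile:1.4
FROM ubuntu:22.04
# Without --chmod, the copied files keep host-dependent modes (umask 0022
# vs 0002, NTFS translation on Windows), which changes the layer checksum
# and invalidates the cache across machines. Forcing a fixed mode at copy
# time removes the host filesystem from the equation.
COPY --chmod=0755 ./features-src/ /tmp/build-features/
RUN /tmp/build-features/install.sh && rm -rf /tmp/build-features
```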
Mac: …
Windows: …
Linux: …
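A quick way to see the umask effect being discussed, on a Linux host (stat -c is GNU coreutils; macOS would need something like stat -f '%Lp'):

```shell
# A new file's mode is 666 masked by the umask, so the login umask
# changes the modes of files that later land in the docker build context:
rm -f /tmp/umask-demo-a /tmp/umask-demo-b
(umask 0022; touch /tmp/umask-demo-a)
(umask 0002; touch /tmp/umask-demo-b)
stat -c '%a' /tmp/umask-demo-a /tmp/umask-demo-b   # prints 644 then 664
```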
I see 0002 on an Ubuntu 22.04 server I log in to with ssh and in a local Ubuntu 20.04 VM (on Parallels). In an Ubuntu 20.04 WSL distro I see 0022. (All have USERGROUPS_ENAB enabled; maybe WSL doesn't use PAM for login.) Potential fixes:
Alternatively, we could put the tgz archive files in the context folder for features and untar them in the …
Looked into copying the archives to the image using …
I'm looking at the generated Dockerfile and build command, and wondering if an additional Docker stage could be used as a workaround here. I did something similar in my own project to get around a bug where … I'm thinking, instead of …
Why not do …
This won't stop the …
@davidwallacejackson I think the problem here is that the initial copy has different privs on Windows than anywhere else, so adding another stage still results in a cache miss. Doing the --chmod as part of the copy should line up the cache so that doesn't happen, but it doesn't seem like that's working. @chrmarti Given that these copied files are temporary, I wonder if using --chmod 755 in all cases might do the trick. That would update the Linux (and macOS) privs to mirror Windows. I'd suspect the problem here is specifically NTFS filesystem translation, since NTFS doesn't directly map to Unix-style privs on the other two OSes. We end up deleting the contents, so there's no real reason they have to be 600 privs. Reversing things so the other two OSes mirror Windows might be the key.
@Chuxel I might be unclear on how caching works here, but I'd like to learn: I assumed that if two different sets of build steps led to the exact same state, they could then use the same cache thereafter. Or, to put it another way, that a cached step could follow an uncached step as long as the uncached step produced an output that had previously been produced by a cached step. Is that not the case?
@davidwallacejackson Yeah, that should be happening here with the chmod, but it is not. The input files to the COPY on Windows have different privs than on Linux/macOS, so the caching may be assuming that the source files have changed (which invalidates the cache). The inputs, the outputs, and any dependent layers all have to be the same before caching kicks in. The fact that this does not happen on macOS or Linux seems to indicate that priv differences are causing the issue... hence the bug filed against moby. It comes down to what caching considers a "change" for a copied file.
Got it. But if it is possible for different builds to "merge" on a shared state, shouldn't the intermediate step solve that problem? The way I envision it is (apologies for the ASCII; I really need to learn Mermaid...):

Current: …
After adding an intermediate stage: …
So the initial steps -- the COPY from the host filesystem context and the RUN to normalize the permissions -- would potentially be uncached because the host filesystem might mangle permissions. But that step is trivial, since the Features source should be really small, and the output from the intermediate stage should be the same on every platform, even if its inputs were different. So that would mean that once you hit the first step of the actual build stage, all the inputs are exactly the same, right? Or am I still missing something? |
Update: I was curious about this, so I ran a test -- and it looks like I was right. This proves that adding another stage to normalize permissions would allow the work of actually installing the features to be cached, right? |
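The intermediate-stage idea being tested above can be sketched like this (illustrative only; the stage and path names are made up, and this is not the CLI's actual generated Dockerfile):

```dockerfile
# syntax=docker/dockerfile:1.4

# Intermediate stage: its COPY may miss the cache, because the host
# filesystem (NTFS vs ext4, differing umasks) can change the copied
# files' modes. That's fine: this stage is cheap.
FROM busybox AS feature-normalizer
COPY ./features-src/ /tmp/build-features/
RUN chmod -R 0755 /tmp/build-features

# Build stage: COPY --from pulls from an image layer, so its input modes
# are identical on every platform regardless of the host, and the
# expensive feature-install RUN below can hit the cache.
FROM ubuntu:22.04
COPY --from=feature-normalizer /tmp/build-features/ /tmp/build-features/
RUN /tmp/build-features/install.sh && rm -rf /tmp/build-features
```

The design point is that the normalization stage absorbs all host-dependent variation, so every input to the real build stage is byte-for-byte identical across machines.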
@davidwallacejackson It sounds like it! @chrmarti ? |
Made a PR with this change and tested it manually -- looks like this does, in fact, solve the problem! |
@davidwallacejackson Sorry for the delay! @chrmarti and I have been out of office, but @jkeech mentioned that @edgonmsft, @joshspicer, or @samruddhikhandale could look at #233. @stuartleeks took a quick look as well, and it looked good.
This shipped for the CLI (0.23.2) and also VS Code Dev Containers yesterday. The Codespaces team will pick up the latest update soon as well. If you are still hitting problems on Windows, I noticed that docker/for-win#12995 seems to be breaking caching all-up. The problem seems to be fetching metadata and timing out. Removing "credsStore": "desktop.exe" from ~/.docker/config.json makes the problem go away, though that doesn't work well if you are using a private image. |
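Concretely, that workaround amounts to deleting one key from ~/.docker/config.json (back the file up first, and keep whatever other fields you have; the "auths" content below is just a placeholder):

```jsonc
// ~/.docker/config.json, before:
{ "credsStore": "desktop.exe", "auths": { "ghcr.io": {} } }

// After (only the "credsStore" entry removed):
{ "auths": { "ghcr.io": {} } }
```

As noted above, without the credential store you lose access to private registries unless you docker login another way.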
The "build.cacheFrom" value allows you to publish an image and then use it as a local cache for subsequent builds on another machine (as does the Docker Compose equivalent). Currently, the contents of the image in "cacheFrom" do not appear to be used during the dev container features build step, which reduces its utility pretty significantly.

Repro:
Expected: Since there is a pre-built image with the "docker-in-docker" feature inside it (see https://github.com/Chuxel/feature-library/blob/main/.github/workflows/devcontainer-image.yaml), the feature docker build reuses the cached result.
Actual: The image contents are only used during the core Dockerfile build, not during the feature build.
Log section illustrating the cache being used in one case, but not the other.
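For context, the kind of configuration being exercised here looks roughly like this (a sketch: the cache image name is made up, and the exact feature reference should be checked against the linked repo):

```jsonc
// .devcontainer/devcontainer.json
{
  "build": {
    "dockerfile": "Dockerfile",
    // Pre-built image pushed by CI; its layers should be reused locally,
    // including (per this issue) the feature-install layers.
    "cacheFrom": "ghcr.io/example/prebuilt-devcontainer:latest"
  },
  "features": {
    // OCI-style feature reference, the scenario described above
    "ghcr.io/devcontainers/features/docker-in-docker:2": {}
  }
}
```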