Make run-task and docker-image hashes optional in cache names #505

Open
ahal opened this issue May 13, 2024 · 0 comments

ahal (Collaborator) commented May 13, 2024

Currently Taskgraph adds the hashes of both run-task and the docker-image task to cache names (when those things are in use):

suffix = f"{cache_version}-{_run_task_suffix()}"

This favors correctness: it almost guarantees that we won't get errors from different versions of tools operating on the same set of cached files. However, it comes at the cost of more cache misses!
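To make the trade-off concrete, here is a minimal sketch of a hash-suffixed cache name. The helper names, the ordering of the parts, and the digest length are all assumptions for illustration, not Taskgraph's actual implementation.

```python
import hashlib
from typing import Optional


def short_hash(content: bytes) -> str:
    """Hypothetical helper: a short, stable digest of some bytes."""
    return hashlib.sha256(content).hexdigest()[:20]


def suffixed_cache_name(
    name: str,
    cache_version: str,
    run_task_source: bytes,
    docker_image_hash: Optional[str] = None,
) -> str:
    """Build a cache name that changes whenever run-task or the image changes."""
    parts = [name, cache_version, short_hash(run_task_source)]
    if docker_image_hash is not None:
        # Only appended when the task actually runs in a docker-image.
        parts.append(docker_image_hash)
    return "-".join(parts)


old = suffixed_cache_name("checkouts", "v3", b"run-task, old revision", "image-a")
new = suffixed_cache_name("checkouts", "v3", b"run-task, new revision", "image-a")
other_image = suffixed_cache_name("checkouts", "v3", b"run-task, old revision", "image-b")

# The names differ, so a worker holding the `old` cache cannot reuse it
# for a task that resolves to `new` or `other_image`.
print(old != new, old != other_image)
```

Any change to run-task or to the image yields a brand new cache name, which is what protects correctness and also what causes the misses.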

For example, in Gecko we typically have a ton of tasks coming in for any given docker-image. Furthermore, pools tend to run only tasks with certain images, so workers keep seeing the same cache names and this feature makes a lot of sense.

On the other hand, mozilla-vpn-client has only a single pool that runs a wide array of tasks with different docker-images. Further, pushes come in infrequently, so workers aren't very long lived. This means we almost never have cache hits.
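A toy model of the two situations, assuming a single worker and made-up image names (nothing here reflects real pool configuration):

```python
def count_cache_hits(task_images, include_image_in_name):
    """Count how many tasks find an existing cache on one worker.

    `task_images` is the sequence of docker-images used by the tasks the
    worker runs, in order. The cache naming here is purely illustrative.
    """
    seen = set()
    hits = 0
    for image in task_images:
        name = f"checkouts-{image}" if include_image_in_name else "checkouts"
        if name in seen:
            hits += 1
        else:
            seen.add(name)
    return hits


# Gecko-like pool: every task on this worker uses the same image.
same_image = ["build", "build", "build", "build"]
# mozilla-vpn-client-like pool: each task uses a different image.
mixed_images = ["linux", "android", "wasm", "lint"]

print(count_cache_hits(same_image, include_image_in_name=True))
print(count_cache_hits(mixed_images, include_image_in_name=True))
print(count_cache_hits(mixed_images, include_image_in_name=False))
```

With a single image the hash costs nothing, while with mixed images every task starts from an empty cache unless the image is left out of the name.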

Another point is the type of cache. Checkout caches tend to be more susceptible to tool-version mismatches (especially with Mercurial), but something like a dotfile cache might not be (maybe?). The point is that different kinds of caches carry different levels of risk here.

I propose that instead of automatically adding the run-task and docker-image hashes to all cache names, we expose them as values that can be interpolated into the cache name. For example, a cache name could be checkouts-{run_task}-{docker_image}, and these values would be included in the cache name. Or it could just be checkouts, and then they wouldn't. This allows individual projects, and even individual caches within a project, to set up cache names however is best for that context.
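A minimal sketch of what that interpolation could look like. The function name `resolve_cache_name` and the placeholder hash values are hypothetical. Only the `{run_task}` and `{docker_image}` placeholders come from the proposal.

```python
def resolve_cache_name(template: str, run_task: str, docker_image: str) -> str:
    """Fill in whichever hash placeholders the template chooses to use.

    str.format ignores keyword arguments the template does not reference,
    so a plain name like "checkouts" passes through unchanged.
    """
    return template.format(run_task=run_task, docker_image=docker_image)


# Hypothetical hash values, for illustration only.
run_task_hash = "abc123"
docker_image_hash = "def456"

strict = resolve_cache_name(
    "checkouts-{run_task}-{docker_image}", run_task_hash, docker_image_hash
)
image_only = resolve_cache_name("checkouts-{docker_image}", run_task_hash, docker_image_hash)
shared = resolve_cache_name("checkouts", run_task_hash, docker_image_hash)

print(strict)
print(image_only)
print(shared)
```

One caveat with this approach: a template containing an unknown placeholder would raise a KeyError, so a real implementation would probably want to validate cache names up front.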

There's definitely an open question around whether one or both of these hashes should be included by default, and also around how hard we should try to preserve backwards compatibility.

ahal changed the title from "Support customization of cache names" to "Make run-task and docker-image hashes optional in cache names" on May 13, 2024