Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache restore race condition? #152

Closed
mpdude opened this issue Jun 9, 2021 · 5 comments
Closed

Cache restore race condition? #152

mpdude opened this issue Jun 9, 2021 · 5 comments

Comments

@mpdude
Copy link

mpdude commented Jun 9, 2021

Hey there 👋🏻 , thank you for this action! It makes my workflows (and thus) life much easier than having to deal with Composer and actions/cache myself.

However, I did a few test runs and noticed that I get failures every now and then, let's say in 1 out of 5 workflow runs. Those failures are transient and most of the time go away if I run the workflow again.

The bottom line error message is

2021-06-09T05:51:45.7317484Z ##[error]Failed to execute git clone --mirror -- 'ssh://git@github.com/MYORG/MYORG-SOME-PRIVATE-REPO' '/home/runner/.cache/composer/vcs/ssh---git-github.com-MYORG-MYORG-SOME-PRIVATE-REPO/'

Cloning into bare repository '/home/runner/.cache/composer/vcs/ssh---git-github.com-MYORG-MYORG-SOME-PRIVATE-REPO'...
BUG: refs/files-backend.c:2953: initial ref transaction called with existing refs
Aborted (core dumped)

Unfortunately, the Composer dependency in question is for a private repo that will be git-cloned as source, and I can't share all the details. But here is an excerpt from the raw logs with what I think is relevant.

2021-06-09T05:51:23.2825442Z ##[group]Operating System
2021-06-09T05:51:23.2827149Z Ubuntu
2021-06-09T05:51:23.2828278Z 18.04.5
2021-06-09T05:51:23.2829465Z LTS
2021-06-09T05:51:23.2830596Z ##[endgroup]
2021-06-09T05:51:23.2831828Z ##[group]Virtual Environment
2021-06-09T05:51:23.2833168Z Environment: ubuntu-18.04
2021-06-09T05:51:23.2834390Z Version: 20210531.0
2021-06-09T05:51:23.2837061Z Included Software: https://github.com/actions/virtual-environments/blob/ubuntu18/20210531.0/images/linux/Ubuntu1804-README.md
2021-06-09T05:51:23.2839459Z Image Release: https://github.com/actions/virtual-environments/releases/tag/ubuntu18%2F20210531.0
2021-06-09T05:51:23.2841153Z ##[endgroup]
...
2021-06-09T05:51:40.1782268Z ##[group]Run ramsey/composer-install@v1
2021-06-09T05:51:40.1783292Z with:
2021-06-09T05:51:40.1784197Z   dependency-versions: locked
2021-06-09T05:51:40.1785094Z env:
2021-06-09T05:51:40.1785850Z   APP_ENV: test
2021-06-09T05:51:40.1786845Z   PHP_VERSION: 7.2
2021-06-09T05:51:40.1787798Z   SSH_AUTH_SOCK: /tmp/ssh-MZQQ7o4xWr9F/agent.2587
2021-06-09T05:51:40.1789259Z   SSH_AGENT_PID: 2588
2021-06-09T05:51:40.1790655Z   TEMP: /home/runner/work/MYREPO/MYREPO/tmp
2021-06-09T05:51:40.1793996Z   COMPOSER_AUTH: ***
2021-06-09T05:51:40.1794829Z ##[endgroup]
2021-06-09T05:51:40.5953452Z Composer cache directory found at /home/runner/.cache/composer
2021-06-09T05:51:40.5977780Z Composer working directory set to project root.
2021-06-09T05:51:40.5981742Z Using locked versions of dependencies
2021-06-09T05:51:40.6680768Z PHP version is 7.2.34-21+ubuntu18.04.1+deb.sury.org+1
2021-06-09T05:51:40.6682462Z Operating system is linux
2021-06-09T05:51:40.6686362Z Cache primary key is linux-php-7.2.34-21+ubuntu18.04.1+deb.sury.org+1-locked-82ffafb70b7e4a023b19b2e8ef1eef08eb2e61b31dd27e40b751819a7f8331fe-
2021-06-09T05:51:40.6693769Z Cache restore keys are: linux-php-7.2.34-21+ubuntu18.04.1+deb.sury.org+1-locked-82ffafb70b7e4a023b19b2e8ef1eef08eb2e61b31dd27e40b751819a7f8331fe-, linux-php-7.2.34-21+ubuntu18.04.1+deb.sury.org+1-locked-
2021-06-09T05:51:40.7612805Z [command]/usr/local/bin/composer install --no-interaction --no-progress --ansi
2021-06-09T05:51:40.9632982Z  Installing dependencies from lock file (including require-dev) 
2021-06-09T05:51:40.9751962Z  Verifying lock file contents can be installed on current platform. 
2021-06-09T05:51:42.4208308Z Received 0 of 49209747 (0.0%), 0.0 MBs/sec
2021-06-09T05:51:42.6212086Z  Package operations: 186 installs, 0 updates, 0 removals 
2021-06-09T05:51:42.6217458Z   - Downloading  composer/package-versions-deprecated  ( 1.11.99.2 )
2021-06-09T05:51:42.6228500Z   - Downloading  symfony/polyfill-mbstring  ( v1.23.0 )
... a lot of "Downloading ..." dist files ...
2021-06-09T05:51:42.6460638Z   - Syncing  MYORG/SOME-PRIVATE-REPO  ( 3.1.0 ) into cache
2021-06-09T05:51:43.0227542Z Received 49209747 of 49209747 (100.0%), 29.3 MBs/sec
2021-06-09T05:51:43.0228762Z Cache Size: ~47 MB (49209747 B)
2021-06-09T05:51:43.0267378Z [command]/bin/tar --use-compress-program zstd -d -xf /home/runner/work/_temp/1c4b8074-9a05-4165-8e57-9b97c3c0e543/cache.tzst -P -C /home/runner/work/MYREPO/MYREPO
2021-06-09T05:51:43.1632026Z Cache restored successfully
2021-06-09T05:51:43.1746121Z Cache restored from key: linux-php-7.2.34-21+ubuntu18.04.1+deb.sury.org+1-locked-bac6c14ffff580a04a4fea1394e68aa862bb682755fbc9d2277682591500a408-
2021-06-09T05:51:45.7317484Z ##[error]Failed to execute git clone --mirror -- 'ssh://git@github.com/MYORG/MYORG-SOME-PRIVATE-REPO' '/home/runner/.cache/composer/vcs/ssh---git-github.com-MYORG-MYORG-SOME-PRIVATE-REPO/'

Cloning into bare repository '/home/runner/.cache/composer/vcs/ssh---git-github.com-MYORG-MYORG-SOME-PRIVATE-REPO'...
BUG: refs/files-backend.c:2953: initial ref transaction called with existing refs
Aborted (core dumped)

2021-06-09T05:51:45.7333789Z 
2021-06-09T05:51:45.7538921Z  [37;41m                                                                                                                                                                                            [39;49m
2021-06-09T05:51:45.7541852Z  [37;41m  [RuntimeException]                                                                                                                                                                        [39;49m
2021-06-09T05:51:45.7545887Z  [37;41m  Failed to execute git clone --mirror -- 'ssh://git@github.com/MYORG/MYORG-SOME-PRIVATE-REPO' '/home/runner/.cache/composer/vcs/ssh---git-github.com-MYORG-MYORG-SOME-PRIVATE-REPO/'   [39;49m
2021-06-09T05:51:45.7549415Z  [37;41m                                                                                                                                                                                            [39;49m
2021-06-09T05:51:45.7552153Z  [37;41m  Cloning into bare repository '/home/runner/.cache/composer/vcs/ssh---git-github.com-MYORG-MYORG-SOME-PRIVATE-REPO'...                                                                   [39;49m
2021-06-09T05:51:45.7556156Z  [37;41m  BUG: refs/files-backend.c:2953: initial ref transaction called with existing refs                                                                                                         [39;49m
2021-06-09T05:51:45.7558622Z  [37;41m  Aborted (core dumped)                                                                                                                                                                     [39;49m
2021-06-09T05:51:45.7561683Z  [37;41m                                                                                                                                                                                            [39;49m

What strikes me is the way how

2021-06-09T05:51:43.1632026Z Cache restored successfully
2021-06-09T05:51:43.1746121Z Cache restored from key: linux-php-7.2.34-21+ubuntu18.04.1+deb.sury.org+1-locked-bac6c14ffff580a04a4fea1394e68aa862bb682755fbc9d2277682591500a408-

happens just before the error, and how that is interleaved with the actual Composer run.

So, my suspicion is that some async operations go wrong here, so that the cache is restored while Composer is already running. Maybe that does not make much of a difference for dist files, but might be enough to make git stumble when it runs on a partly-restored repo?

I should also add that I have some concurrent workflows that run this action here at about the same time. But to my knowledge, different workflows run in isolated environments. So I would rule out any kind of cross-workflow race conditions.

@mpdude
Copy link
Author

mpdude commented Jun 9, 2021

I am sure this is the race condition as I initially described. Here is the log output of another run that worked:

PHP version is 7.2.34-21+ubuntu18.04.1+deb.sury.org+1
Operating system is linux
Cache primary key is linux-php-7.2.34-21+ubuntu18.04.1+deb.sury.org+1-locked-bac6c14ffff580a04a4fea1394e68aa862bb682755fbc9d2277682591500a408-
Cache restore keys are: linux-php-7.2.34-21+ubuntu18.04.1+deb.sury.org+1-locked-bac6c14ffff580a04a4fea1394e68aa862bb682755fbc9d2277682591500a408-, linux-php-7.2.34-21+ubuntu18.04.1+deb.sury.org+1-locked-
/usr/local/bin/composer install --no-interaction --no-progress --ansi
Installing dependencies from lock file (including require-dev)
Verifying lock file contents can be installed on current platform.
Received 49209747 of 49209747 (100.0%), 123.8 MBs/sec
Cache Size: ~47 MB (49209747 B)
/bin/tar --use-compress-program zstd -d -xf /home/runner/work/_temp/bba28022-b2f5-403f-9ceb-305964208faf/cache.tzst -P -C /home/runner/work/REPO/REPO
Cache restored successfully
Cache restored from key: linux-php-7.2.34-21+ubuntu18.04.1+deb.sury.org+1-locked-bac6c14ffff580a04a4fea1394e68aa862bb682755fbc9d2277682591500a408-
Package operations: 186 installs, 0 updates, 0 removals
  - Syncing webfactory/wfdynamic (3.1.0) into cache

You can see that /usr/local/bin/composer install --no-interaction --no-progress --ansi comes before Cache restored successfully, but this time, the cache was quick enough to be restored before Composer did any git clone operations.

I have no experience with TypeScript or asynchronous programming in JavaScript.

That being said, here is the code from https://github.com/actions/cache that you require() in src/cache/factory.ts:

https://github.com/actions/cache/blob/0638051e9af2c23d10bb70fa9beffcad6cff9ce3/src/restore.ts#L69-L71

The run() call seems to trigger the actual cache restore code; but since that is async, it might take a while to complete. Does run() return a promise in this case, but that promise is never exported from the module?

If that is the case, what does this code here

return async () => await require('cache/dist/restore.js')
wait for?

@ramsey
Copy link
Owner

ramsey commented Jun 9, 2021

I have no experience with TypeScript or asynchronous programming in JavaScript.

My experience is also limited, but I think calling attention to the fact that there's absolutely asynchronous operations going on here points to a likely smoking gun.

In the code you pointed out, I thought the await told it to wait, but that does seem in conflict with the async keyword. I'll check with some folks who know more about how JS works so I can get this to be more synchronous.

Thanks!

@mpdude
Copy link
Author

mpdude commented Jun 9, 2021

Awesome. IMHO the run() in the restore.js file from actions/Core looks suspicious, since it seems to return a promise that we never get hold of.

That might be very difficult to fix, since it would require tweaking upstream code.

Looking forward to hearing from you 👨🏻‍💻

@clxmstaab
Copy link

we are also running in spurious errors which look like race-conditions to me. any progress on this front?

ramsey added a commit that referenced this issue Dec 23, 2021
This also adds the following features:

- Use `--prefer-stable` with lowest dependencies (#178)
- Allow use of a custom cache key (#167)
- Allow ability to ignore the cache

Fixes: #187, #161, #152, #79
ramsey added a commit that referenced this issue Dec 23, 2021
This also adds the following features:

- Use `--prefer-stable` with lowest dependencies (#178)
- Allow use of a custom cache key (#167)
- Allow ability to ignore the cache

Fixes: #187, #161, #152, #79
ramsey added a commit that referenced this issue Dec 23, 2021
This also adds the following features:

- Use `--prefer-stable` with lowest dependencies (#178)
- Allow use of a custom cache key (#167)
- Allow ability to ignore the cache

Fixes: #187, #161, #152, #79
@ramsey
Copy link
Owner

ramsey commented Dec 23, 2021

This is fixed in v2, please upgrade using ramsey/composer-install@v2 or ramsey/composer-install@2.0.0. Details are available on the release announcement and the README.

Thanks!

@ramsey ramsey closed this as completed Dec 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants