New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix race condition #195
Fix race condition #195
Conversation
This is great! Thanks! I've put out a call on Twitter to ask folks who use this to test your changes. https://twitter.com/ramsey/status/1459190635479674884 |
It reads a lot better and seems like it works. The logic makes sense; however there is no test case for the await on nothing. No synthetic crash reproduction. So if it works, the breakage is not documented anywhere; just fixed. This will make it harder to re-factor and avoid regression. I think if https://stackoverflow.com/a/55263084/1548557 is correct, then await will first check if something has a Secondly, having a then Clause won't guarantee something returns a response. So you could also check the value of the awaited result, ensuring it is never null or undefined (perhaps using What do you think? |
@Lewiscowles1986 I agree, this could really use a test case for making sure the scenario is not added back in later.
I tried creating it when I wrote my PR, but simply couldn't get the mockery library to do as I wanted so I just said "fuck it".
Yes, I do think that would be a good idea. However linting should fail if what is given is not a promise. This is why I changed the typehints to be correct and require promises in the typehints. This way linting would complain when non-promises are used here. Do you think that is enough or where you envisioning something more?
Hmm, that might be possible, as the cache functions do have some return values. However we do now have typehints requiring them to return that type. so in theory linting should catch it ;) Thanks a lot for the feedback and trying to also do the research to figure out if this indeed is a solution |
Await on no test would look something like
TypeScript may be a complicating factor in this though. I've hit this a few times where I basically had to move a TypeScript project to JS because valid typescript with all it's supposed protections; prevented me from testing behavior that could be measured after transpiling to JS. Maybe someone else has a tip I'm unaware of if this doesn't work. Specifically that should fail for |
So I think we can safely say that the solution here is working. However @Lewiscowles1986 does have a valid point that we are missing test for it. However if I have to create the test it will take a while. |
This issue causes (only for me) at least once a day a failed build / the need to manually trigger "rebuild all". As there is no test yet that fails even though there is a problem for at least some users, I'd argue that fixing the issue is more important than waiting for a test case for code which wasn't covered before. There is another question that I'd like to raise: should the dist files be included in the merge (good for testing) or rather being rebuild after merged by the repo maintainer (the user base trusts this repo)? |
I agree, I also stumble on this multiple times per week on different repositories. |
@kevinpapst Agreed, I also agree that when/if this is being merged by @ramsey, he should rebuild it, so it's not my dist files 馃憤 |
@ramsey Would it be possible to merge this? 馃檹 |
@ramsey this is open source, it is your right to ignore our problems here. but I want to try it once again: could you please be so kind and share your plans about this topic. I am hitting that re-run all jobs literally every day and this last one was once too often 馃榿 I just want to know if we have a chance that this will be integrated and released anytime soon or if it is better that we are setting up our own forks of the action? |
@kevinpapst I've been waiting to see if @olivernybroe or anyone else would provide a test. You're welcome to run the action from your own fork. |
Thanks for your feedback @ramsey, and thanks for your action. It solves a problem for a certain target group in a neat way. For anyone interested, you can use the cache action combined with the speed improvements from Composer 2 to get the same build times (I could not see any speed difference to this action here). All you have to do is to replace:
with
|
@ramsey I've not been able to recreate a race in local tests. This is my key reason for not attempting a test. While I understand this might be frustrating, I think it's a wise choice to hold-off until we can make a reproductive case test, and then it can be confirmed why the flakes, and how they were fixed. |
GitHub now supports Since I'm now able to develop this as a composite action, I've shifted away from JavaScript and have converted this to a composite action in As a side note: when I first began developing this, GitHub's cache action also was not a separate library intended for use by other actions. They've since changed this, as well, so if I couldn't switch to a composite action, I would have updated to the latest version of their library, which might have fixed this race condition, among other issues. Now that I'm using GitHub's cache action directly as part of a composite action, I think any of the race conditions and inability to read cache files that folks were experiencing (due to my bad JavaScript code) should be cleared up. Please upgrade using Thanks for your PR and the good discussion here that encouraged me to make this change! |
@ramsey amazing! Looking forward to changing to version 2 馃帀 Thanks for the work to change this to a composite action! |
I think I might have found the reason for the race condition.
So looking at the following script https://github.com/actions/cache/blob/main/src/restore.ts it executes an promise, but doesn't return it.
We however require this script and treats it as if it were a promise, but the script doesn't give us a promise back. so we do an await on nothing basically.
To fix this, I chose to use the github action caching api directly, instead of executing the github caching actions script directly. This means we now get a Promise back and we actually have something we can wait on.
I reran this around 20 times on our repository to try and reproduce the error (we got this error like every 5 times) and it seems to be gone now.
I have put it live on our repository and will run with our branch for a week or so before taking this PR out of draft 馃憤
for people who wanna try out our solution and test it with us, you can use it the following way
resolves: #161
resolves: #152