[Bug]: Parallel test process execution causing race condition and empty babel transpiled file read from disk cache #30577
Comments
…nspiled files from disk cache. First, add a marker file to check whether the previous write fully completed. Then, in the write-file section, change the write mode to wx, which creates the file exclusively (the open fails if the file already exists); if two writes happen at the same time and we get an EEXIST error, ignore it. Fix for microsoft#30577
Created a PR to fix it. We were able to patch Playwright with this PR's changes and verify that it then worked fine.
A slightly more detailed explanation. The current code block on main looks like this
Now, if a first process calls to write the code content and a second process calls to read the code content at the same time, the second process gets an empty file, because writeFileSync first creates an empty file, then writes the content, and then closes the file descriptor. To fix this, the code is changed to
Now a process will try to read the file only if the write by some other process has successfully completed. We had seen a similar issue in the past, and it was fixed earlier by using a marker file check, but that still caused EEXIST errors in very rare cases.
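The code blocks referred to above did not survive in this thread, so here is a minimal sketch of the idea as described: an exclusive-create write with the wx flag, EEXIST ignored, and a marker file that readers check first. This is not the code from the PR. The helper names (markerPath, writeCached, readCached), the `.done` suffix, and the scratch directory are all hypothetical and chosen only for illustration.

```typescript
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

// Hypothetical helper: where the "write completed" marker for an entry lives.
function markerPath(codePath: string): string {
  return codePath + '.done';
}

// Hypothetical writer: exclusive create, then publish the marker.
function writeCached(codePath: string, code: string): void {
  try {
    // 'wx' opens with O_CREAT|O_EXCL, so an existing file is never truncated.
    fs.writeFileSync(codePath, code, { flag: 'wx' });
  } catch (e) {
    // Another writer already created this entry; leave it alone.
    if ((e as { code?: string }).code === 'EEXIST')
      return;
    throw e;
  }
  // The marker is created only after the content has been written and closed.
  fs.writeFileSync(markerPath(codePath), '');
}

// Hypothetical reader: treat the entry as a cache miss until the marker exists.
function readCached(codePath: string): string | undefined {
  if (!fs.existsSync(markerPath(codePath)))
    return undefined;
  return fs.readFileSync(codePath, 'utf8');
}

// Usage in a scratch directory (not a path Playwright uses).
const cacheDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cache-sketch-'));
const entry = path.join(cacheDir, 'entry.js');
const beforeWrite = readCached(entry);        // cache miss, no marker yet
writeCached(entry, 'export const a = 1;\n');  // first writer wins
writeCached(entry, 'export const a = 2;\n');  // EEXIST is ignored
const afterWrite = readCached(entry);         // content from the first writer
console.log({ beforeWrite, afterWrite });
```

One caveat of this sketch: if the winning writer dies between the two writes, the entry stays without a marker and later writers keep hitting EEXIST, so that entry is never served from the cache. A real implementation would need some cleanup strategy for that case.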
+1
@sunilsurana Thank you for the issue! First of all, I need to clarify your scenario. Are you running multiple However, if you do run multiple As you correctly identified, we have already tried the marker-file approach, and reverted it because of the added complexity and bugs still being reported by users. Therefore, we are not really optimistic that the proposed PR will solve the issue without introducing new bugs, and we need a very strong reason to move in this direction again.
Hi @dgozman, I have provided a repro repo in the issue description, with the steps to execute.
The current workaround is to pass a unique I'll close this issue for now, but we can reopen if there is more demand for this.
Version
1.42.1
Steps to reproduce
It can be reproduced on Codespaces or Azure VMs. I created this repo with steps for reproducing the issue: https://github.com/sunilsurana/reprofilesync
This repo mimics the behaviour of the compilationcache.ts logic as if it were executed in parallel processes.
Note that the issue may or may not reproduce on a local laptop, depending on disk and OS. It does reproduce on Codespaces and Azure VMs.
Expected behaviour
The tests should work deterministically when multiple processes trigger tests at the same time.
Actual behaviour
The test throws an error because it cannot find an import. The import comes back null because an empty file is read when tests are invoked through multiple processes in parallel.
Additional context
We distribute our tests across multiple workers and processes. On a single machine, if multiple processes start tests at the same time, the code that reads the disk cache can read an empty file because another process has just begun writing to it.
This happens because fs.writeFileSync issues three OS-level calls, which we can see using strace:
open("file.txt", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 9
pwrite(9, "content\n", 8, 0) = 8
close(9) = 0
The first open syscall truncates the file if it exists, or creates a blank file if it does not. If a reader from another process arrives at that moment, it sees that the file exists, but reading it returns empty content.
We were able to verify this by adding logs in Playwright, which showed the read and the write happening in the same millisecond and causing test failures.
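The window between open and pwrite can be shown deterministically in a single process by performing the three steps by hand. The sketch below is only an illustration of the syscall sequence above, not Playwright code. The scratch file path is hypothetical.

```typescript
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

// Hypothetical scratch file standing in for a cache entry.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'trunc-demo-'));
const file = path.join(dir, 'file.txt');
fs.writeFileSync(file, 'old content\n');

// Step 1: flag 'w' opens with O_WRONLY|O_CREAT|O_TRUNC, emptying the file.
const fd = fs.openSync(file, 'w');

// A reader arriving now sees an existing but empty file.
const existsDuringWrite = fs.existsSync(file);
const seenDuringWrite = fs.readFileSync(file, 'utf8');

// Steps 2 and 3: write the content, then close the descriptor.
fs.writeSync(fd, 'content\n');
fs.closeSync(fd);
const seenAfterWrite = fs.readFileSync(file, 'utf8');

console.log({ existsDuringWrite, seenDuringWrite, seenAfterWrite });
```

In the real failure the reader is a different process, so whether it lands inside this window depends on scheduling and disk speed, which matches the observation that the issue does not reproduce on every machine.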
Environment