Update s390x dockerfile #1716
I got actions-runner updated and working (tested with actual CI jobs), but then managed to break qemu-user-static. I compiled my own (only took 6 hours to figure out how), but actions-runner now fails when attempting to run CI jobs.
It is kind of weird, because it seems like actions-runner sometimes succeeds at running the whole compile and many tests, but fails to report that something in the tests failed (and with it the relevant part of the test logs). The actions-runner logs show that the job failed and everything looks ok, but GitHub shows that it is timing out waiting for data. I might try to downgrade actions-runner to a previous version tomorrow to see whether that fixes the problem.
Just tested downgrading actions-runner to 2.315.0; it still crashes. I wish actions-runner supported s390x natively. Unfortunately, the PR to add that to actions-runner seems to have stalled 1.5 years ago.
Found an Actions Runner that should work on System Z here: Did not test it yet, as it'd likely take me several hours to figure out how to convert our current setup to use this instead, but it is a possibility. |
eBPF CI has a similar setup to zlib-ng (https://github.com/libbpf/ci), and they migrated to aptman/qus:d7.1 (https://github.com/libbpf/ci/blob/5618ba5cc00e40916ecddccbd5f885b29b83b68e/ansible/roles/qemu-user-static/tasks/main.yml, https://github.com/libbpf/ci/blob/5618ba5cc00e40916ecddccbd5f885b29b83b68e/ansible/roles/qemu-user-static/defaults/main.yml). I suggest giving it a try: it should be a drop-in replacement for the current image (https://github.com/dbhi/qus?tab=readme-ov-file#setup). gaplib is also an option, but I haven't had a chance to experiment with it yet.
621bcfb to 8a9ea6f
I got gaplib working, so actions-runner is running natively instead of depending on qemu. Just needs some more iterations for fixing the last of the compile errors (missing deps).
fae1e53 to 40398d0
@iii-i Things are working pretty well now. Could you please do a review? The only thing I could not figure out was how to use a volume for keeping data between runs. That means actions-runner currently fails to unregister (since it has lost the identity files). When it "registers" again, it just gets the same runner-id as previously, so at least we don't end up with thousands of "runners" in the GitHub system. BTW, I noticed a few other things that we might want to improve at some point, so if anyone wants to tackle these, feel free:
I notice the conditionals for workflow jobs don't work in a matrix. Will try to fix that later, found a slightly ugly solution that supposedly works. What I am wondering is whether the |
+ <MSBuild Targets="Publish" Projects="@(ProjectFiles)" BuildInParallel="false" StopOnFirstFailure="true" Properties="Configuration=$(BUILDCONFIG);PackageRuntime=$(PackageRuntime);Version=$(RunnerVersion);$(PublishRuntimeIdentifier);PublishDir=$(MSBuildProjectDirectory)/../_layout/bin" />
<Exec Command="%22$(DesktopMSBuild)%22 Runner.Service/Windows/RunnerService.csproj /p:Configuration=$(BUILDCONFIG) /p:PackageRuntime=$(PackageRuntime) /p:OutputPath=%22$(MSBuildProjectDirectory)/../_layout/bin%22" ConsoleToMSBuild="true" Condition="'$(PackageRuntime)' == 'win-x64' Or '$(PackageRuntime)' == 'win-x86' Or '$(PackageRuntime)' == 'win-arm64'" />
</Target>
Adding empty line at end of file here... Some file formats don't like empty lines at end of a file. Lint warns about having more than one empty line at end of file.
I noticed, but it is not really something I can do anything about without manually editing the patch file.
The easiest solution would be to disable linting for patch files... That would also mean that patch files would need to be manually reviewed for any style errors that we actually care about.
The real solution would be for the linter to ignore any line not starting with `+ ` (plus sign followed by a single ASCII 32, i.e. SPACE, character).
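To make the idea concrete, here is a minimal sketch of a patch-aware trailing-whitespace check. It is not what the Lint CI job does today; the function name `trailing_whitespace_in_patch` is hypothetical, and as an assumption it inspects every added line (leading `+`, skipping `+++` file headers), which is slightly broader than the `+ ` prefix described above.

```python
def trailing_whitespace_in_patch(patch_text):
    """Return (line_number, content) for added lines that end in whitespace.

    Hypothetical helper: only lines added by the patch are checked, so
    context and removed lines cannot trigger a warning.
    """
    problems = []
    for number, line in enumerate(patch_text.splitlines(), start=1):
        # Skip file headers such as "+++ b/src/file.c" (heuristic: an added
        # line whose content itself starts with "++" is skipped as well).
        if line.startswith("+++"):
            continue
        # Context (" ") and removed ("-") lines mirror the upstream file,
        # so their whitespace is not ours to fix.
        if not line.startswith("+"):
            continue
        content = line[1:]  # drop the diff marker
        if content != content.rstrip(" \t"):
            problems.append((number, content))
    return problems


sample = "\n".join([
    "--- a/file.txt",
    "+++ b/file.txt",
    "@@ -1,2 +1,3 @@",
    " unchanged line ",                  # context line, ignored
    "+clean added line",
    "+added line with trailing space ",  # reported
])
print(trailing_whitespace_in_patch(sample))
# prints [(6, 'added line with trailing space ')]
```

With a check like this, a blank context line (a lone space in unified diff format) would no longer be flagged, while whitespace introduced by the patch itself would still be caught.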
We might want to disable the Lint CI run for *.patch files, as those can have whitespace at the end of a line.
I agree, ideally the linter would understand patch files, but ignoring them would work.
Because patch files use white-space to separate |
I tested it, and no, it does not work either. And I had misread that link; it actually shows two suggested syntaxes for a proposed feature to filter in a matrix. I can't find any information about whether it made it in or not, so probably not. I am also tempted to look at utilizing reusable workflows to allow us to have steps defined in one place and included into other files, so we don't have to duplicate things. But that is way too much to do in this PR, so I think I'll have to back out some of the changes and hopefully get them properly implemented in another PR.
0d1f1d4 to bc2b9e5
Hi, thanks for handling the update - the PR looks good to me. One nit: as far as I can see it doesn't use gaplib directly, so perhaps instead of saying "based on gaplib" say "inspired by gaplib" or something along these lines? Regarding the volume, I think it's better to drop it altogether. Ideally there should be 0 persistence across runs, in order to limit the impact of malicious PRs.
Arm,
- Arm64
+ Arm64,
+ S390x
Incorrect indentation, mix of spaces and tabs.
This is the patch file we copy from gaplib, so such changes to it should instead be suggested upstream.
So I can't really tell, but does actions-runner now use docker for s390x?
actions-runner is running in a docker container, yes. The host is s390x, and everything now runs native s390x. Previously actions-runner was also running in a docker container, and the host was s390x, but the actions-runner application itself only supported x86_64, so it had to run with qemu emulation, while the builds it started ran natively on s390x. It was a special kind of confusing.
Rebased
@iii-i I notice that after #1717 was merged, the s390x MSAN test has started failing. (It previously failed to run at all, this PR fixes that, but it now fails on
|
You could possibly add this to
- name: Ignore files
  shell: bash
  run: echo "*.patch" >> .gitignore
I suggest doing it in this PR, so we can see if it works.
My bad, I tested the MSan build of #1717 without gtest. Would it be reasonable to adjust the test as follows?
The zlib manual does not say anything about this, but should |
- New dockerfile
- Using native actions-runner instead of relying on qemu.
- To support s390x, we include patches to actions-runner.
- Using Almalinux 9 instead of Ubuntu, with functional .Net.
- Update CI workflow.
- Update readme guide.
Disable unnecessary compilation of tests, benchmarks, docs.
This didn't work, not sure what went wrong there. Edit: Well, no wonder that didn't help, |
I'm not sure about this, but if an application passes us a buffer and says the length is X, then we should be fine to access the whole thing. So I guess either the bug here is that the buffer is not fully initialized, or that it is specified to be too big. Perhaps initialize it so we test the case of garbage remainder?
whitespace or end the file with blank lines.
Lint fixed.
Interesting solution.
s390x VM has been upgraded from RHEL 8 to Rocky Linux 9.3.
Rocky 9 comes with Clang-16 and GCC-12 by default, among other significant updates.
This PR updates the podman image and removes the need for running actions-runner using qemu x86-64 emulation.