Recent container change broke ThreadSanitizer builds #1241

Open
1 of 2 tasks
timwoj opened this issue Sep 21, 2023 · 11 comments
@timwoj
timwoj commented Sep 21, 2023

Expected Behavior

C++ builds using ThreadSanitizer should complete correctly.

Real Behavior

ThreadSanitizer reports the following error when trying to run any binary:

FATAL: ThreadSanitizer: unexpected memory mapping 0x5bb456972000-0x5bb456973000

Related Info

This is a (tick one of the following):

The log for the task above shows the configure script failing because it thinks that the OpenSSL headers and library differ. Manual investigation using terminal mode shows CMake failing for the reason above. This failure started recently (within the last few weeks). It doesn't happen with Docker containers started from the same Dockerfile on other systems; it only happens to us on the Cirrus infrastructure. It appears similar to golang/go#59418, which was caused by a kernel issue (fixed in https://go-review.googlesource.com/c/build/+/482195).
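
Since the failure follows the host rather than the image, one way to narrow it down is to record a few host kernel facts from inside the container on both a working and a failing system and compare them. The following is a minimal sketch, assuming a Linux host; the function name `host_facts` is hypothetical, and nothing in this thread establishes that any of these values is the cause.

```shell
#!/bin/sh
# Print kernel facts visible from inside a container. The kernel is shared
# with the host, so these reflect the host, not the image.
# Assumption: which values (if any) matter for this failure is unknown.
host_facts() {
    echo "kernel_release=$(uname -r)"
    for f in /proc/sys/kernel/randomize_va_space /proc/sys/vm/mmap_rnd_bits; do
        if [ -r "$f" ]; then
            echo "$(basename "$f")=$(cat "$f")"
        else
            # File is missing on non-Linux systems or restricted sandboxes.
            echo "$(basename "$f")=unavailable"
        fi
    done
}

host_facts
```

Running this as an early step of the CI task and once on a local Docker host gives two short listings that can be diffed directly.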

@fkorotkov
Contributor

Yeah, Cirrus CI is using Container-Optimized OS version 105 for the x86 and Arm containers. You can set the experimental: true flag on the task that is failing. That way it will temporarily run on the old infrastructure.

Let's see if the next version of Container-Optimized OS will fix the issues.

@timwoj
Author

timwoj commented Sep 21, 2023

> Yeah, Cirrus CI is using Container-Optimized OS version 105 for the x86 and Arm containers. You can set the experimental: true flag on the task that is failing. That way it will temporarily run on the old infrastructure.

Same result with the experimental tag. uname -a on that build says this:

Linux cirrus-ci-task-6181902449639424 5.15.120+ #1 SMP Fri Jul 21 03:39:30 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Is that correct?

Here's the task configuration:

tsan_sanitizer_task:
  experimental: true
  container:
    # Just uses a recent/common distro to run memory error/leak checks.
    dockerfile: ci/ubuntu-22.04/Dockerfile
    << : *SANITIZERS_RESOURCE_TEMPLATE

  << : *CI_TEMPLATE
  << : *SKIP_TASK_ON_PR
  env:
    ZEEK_CI_CONFIGURE_FLAGS: *TSAN_SANITIZER_CONFIG
    ZEEK_CI_DISABLE_SCRIPT_PROFILING: 1
    # If this is defined directly in the environment, configure fails to find
    # OpenSSL. Instead we define it with a different name and then give it
    # the correct name in the testing scripts.
    ZEEK_TSAN_OPTIONS: suppressions=/zeek/ci/tsan_suppressions.txt

I also tried the experimental flag inside the container block, but that failed the same way. https://cirrus-ci.com/task/5715854843707392 has the latest failure.

@fkorotkov
Contributor

Could you please try privileged: true for your container instance then? That way a dedicated VM will be used to run your task. It will be a bit slower to schedule, but you'll get an Ubuntu host.

@timwoj
Author

timwoj commented Sep 21, 2023

privileged: true in the outer task block and with experimental removed?

tsan_sanitizer_task:
  privileged: true
  container:
    # Just uses a recent/common distro to run memory error/leak checks.
    dockerfile: ci/ubuntu-22.04/Dockerfile
    << : *SANITIZERS_RESOURCE_TEMPLATE

  << : *CI_TEMPLATE
  << : *SKIP_TASK_ON_PR
  env:
    ZEEK_CI_CONFIGURE_FLAGS: *TSAN_SANITIZER_CONFIG
    ZEEK_CI_DISABLE_SCRIPT_PROFILING: 1
    # If this is defined directly in the environment, configure fails to find
    # OpenSSL. Instead we define it with a different name and then give it
    # the correct name in the testing scripts.
    ZEEK_TSAN_OPTIONS: suppressions=/zeek/ci/tsan_suppressions.txt

That gets me through the configure step, but the build fails for the same reason when it tries to run a binary as part of the build:

[ 15%] [BIFCL] Processing /zeek/auxil/zeek-af_packet-plugin/src/af_packet.bif
FATAL: ThreadSanitizer: unexpected memory mapping 0x5ad5e496e000-0x5ad5e4973000

https://cirrus-ci.com/task/5992372387971072?logs=build#L967

@timwoj
Author

timwoj commented Sep 22, 2023

> That gets me through the configure step, but the build fails for the same reason when it tries to run a binary as part of the build:

I re-ran the build this morning to double-check something and it failed during configure again.

@fkorotkov
Contributor

If it still fails with ThreadSanitizer, then it might not be an issue with COS version 105. I found another old report of a similar issue, google/sanitizers#806, where the problem was an old version of gcc.

If you have an x86 host with Docker, you could try to reproduce the issue using the gcr.io/cirrus-ci-community/zeek/zeek/ci/ubuntu-2204/dockerfile:dae6979fc92dcba631e38ce7cf2335a7 container that is used in CI.

@timwoj
Author

timwoj commented Sep 22, 2023

> I found another old report of a similar issue, google/sanitizers#806, where the problem was an old version of gcc.

I've tried it with both gcc 11 (ubuntu 22) and 12 (ubuntu 23), so I don't think that's it.

> If you have an x86 host with Docker, you could try to reproduce the issue using the gcr.io/cirrus-ci-community/zeek/zeek/ci/ubuntu-2204/dockerfile:dae6979fc92dcba631e38ce7cf2335a7 container that is used in CI.

I'll see if I can scrounge up an old system to test it with.

@maflcko
Contributor

maflcko commented Oct 3, 2023

We are also running into this. It should be trivial to reproduce with: echo 'void main(void){}' | gcc -pie -fPIE -fsanitize=thread -xc - -ltsan && ./a.out:

FATAL: ThreadSanitizer: unexpected memory mapping 0x56ce963d3000-0x56ce963d4000
Exit status: 66

See https://cirrus-ci.com/task/6173534590861312?logs=test#L2

Using gcc-13 from Ubuntu 23.10 (beta).

I understand that this is likely possible to fix by using a full GCE VM, but it would be nice if tsan in containers was supported again on Cirrus CI, like before.
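
The one-line reproducer above can be turned into a preflight check that a CI job runs before a long build, so the failure shows up in seconds rather than partway through configure. This is a rough sketch under stated assumptions: the function names `classify_tsan_run` and `tsan_smoke_test` are hypothetical, the test program uses `int main` rather than the `void main` of the original command, and `-ltsan` is left out because `-fsanitize=thread` already links the runtime.

```shell
#!/bin/sh
# Classify one run of a TSan-instrumented binary from its exit status and
# combined output. Prints mapping-failure, ok, or other-failure.
classify_tsan_run() {
    status=$1
    output=$2
    case $output in
        *"ThreadSanitizer: unexpected memory mapping"*) echo "mapping-failure" ;;
        *) if [ "$status" -eq 0 ]; then echo "ok"; else echo "other-failure"; fi ;;
    esac
}

# Build and run a trivial PIE program under TSan with the given compiler
# (default: gcc). Prints "skipped" if the compiler or TSan runtime is missing.
tsan_smoke_test() {
    cc=${1:-gcc}
    command -v "$cc" >/dev/null 2>&1 || { echo "skipped"; return 0; }
    dir=$(mktemp -d) || return 1
    if ! echo 'int main(void){return 0;}' |
        "$cc" -pie -fPIE -fsanitize=thread -xc - -o "$dir/a.out" 2>/dev/null; then
        rm -rf "$dir"
        echo "skipped"
        return 0
    fi
    output=$("$dir/a.out" 2>&1)
    status=$?
    rm -rf "$dir"
    classify_tsan_run "$status" "$output"
}

tsan_smoke_test "${CC:-gcc}"
```

Because the compiler is a parameter, the same check can be pointed at another toolchain, for example `tsan_smoke_test clang-18`, to see whether a different compiler avoids the error on a given host.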

@maflcko
Contributor

maflcko commented Oct 3, 2023

I checked for google/sanitizers#877 (comment) but that didn't seem to be the cause here either.

@timwoj
Author

timwoj commented Jan 8, 2024

I just wanted to check in and note that this is still broken.

@maflcko
Contributor

maflcko commented Apr 24, 2024

As a temporary workaround, I think clang-18 from Ubuntu Noble 24.04 may work, instead of gcc.
