Build fails on Mac M1 #412

Open
hadisinaee opened this issue Mar 13, 2024 · 10 comments

Comments

@hadisinaee

Hi,

I tried to build uBPF on a Mac M1. All prerequisites for the project installed successfully. However, when I try to build the project, I get this:

$ cmake -S . -B build -DUBPF_ENABLE_TESTS=true -DUBPF_ALTERNATE_LLVM_PATH=/opt/homebrew/opt/llvm/bin

CMake Warning at vm/CMakeLists.txt:19 (message):
  uBPF - using compat ELF support for macOS.

-- Expecting /path/to/ubpf/external/bpf_conformance/tests/lock_add.data test to fail.
-- Expecting /path/to/ubpf/external/bpf_conformance/tests/lock_add32.data test to fail.
-- Expecting /path/to/ubpf/external/bpf_conformance/tests/lock_and.data test to fail.
-- Expecting /path/to/ubpf/external/bpf_conformance/tests/lock_and32.data test to fail.
-- Expecting /path/to/ubpf/external/bpf_conformance/tests/lock_cmpxchg.data test to fail.
-- Expecting /path/to/ubpf/external/bpf_conformance/tests/lock_cmpxchg32.data test to fail.
-- Expecting /path/to/ubpf/external/bpf_conformance/tests/lock_fetch_add.data test to fail.
-- Expecting /path/to/ubpf/external/bpf_conformance/tests/lock_fetch_add32.data test to fail.
-- Expecting /path/to/ubpf/external/bpf_conformance/tests/lock_fetch_and.data test to fail.
-- Expecting /path/to/ubpf/external/bpf_conformance/tests/lock_fetch_and32.data test to fail.
-- Expecting /path/to/ubpf/external/bpf_conformance/tests/lock_fetch_or.data test to fail.
-- Expecting /path/to/ubpf/external/bpf_conformance/tests/lock_fetch_or32.data test to fail.
-- Expecting /path/to/ubpf/external/bpf_conformance/tests/lock_fetch_xor.data test to fail.
-- Expecting /path/to/ubpf/external/bpf_conformance/tests/lock_fetch_xor32.data test to fail.
-- Expecting /path/to/ubpf/external/bpf_conformance/tests/lock_or.data test to fail.
-- Expecting /path/to/ubpf/external/bpf_conformance/tests/lock_or32.data test to fail.
-- Expecting /path/to/ubpf/external/bpf_conformance/tests/lock_xchg.data test to fail.
-- Expecting /path/to/ubpf/external/bpf_conformance/tests/lock_xchg32.data test to fail.
-- Expecting /path/to/ubpf/external/bpf_conformance/tests/lock_xor.data test to fail.
-- Expecting /path/to/ubpf/external/bpf_conformance/tests/lock_xor32.data test to fail.

CMake Warning at bpf/CMakeLists.txt:69 (message):
  Clang supports BPF target

-- Building BPF map
-- Building BPF rel_64_32
-- Configuring done (0.3s)
-- Generating done (0.2s)
-- Build files have been written.

Here is my LLVM version:

$ brew info llvm 
llvm: stable 17.0.6, HEAD [keg-only]

Is it possible to build uBPF on a Mac M1?

@viniciusd

How did you conclude that the build failed? I don't see any error message.

@hadisinaee
Author

When I tried to run the tests, they failed, apparently because a binary is missing. I assume it wasn't produced during the build; otherwise, the tests should pass. I might be wrong, though.

This is part of the output log when I ran the tests:

$ cmake --build build --target test --

Could not find executable [path]ubpf/build/external/bpf_conformance/bin/bpf_conformance_runner
Looked in the following places:
[path]ubpf/build/external/bpf_conformance/bin/bpf_conformance_runner
[path]ubpf/build/external/bpf_conformance/bin/bpf_conformance_runner
[path]ubpf/build/external/bpf_conformance/bin/Release/bpf_conformance_runner
[path]ubpf/build/external/bpf_conformance/bin/Release/bpf_conformance_runner
[path]ubpf/build/external/bpf_conformance/bin/Debug/bpf_conformance_runner
[path]ubpf/build/external/bpf_conformance/bin/Debug/bpf_conformance_runner
[path]ubpf/build/external/bpf_conformance/bin/MinSizeRel/bpf_conformance_runner
[path]ubpf/build/external/bpf_conformance/bin/MinSizeRel/bpf_conformance_runner
[path]ubpf/build/external/bpf_conformance/bin/RelWithDebInfo/bpf_conformance_runner
[path]ubpf/build/external/bpf_conformance/bin/RelWithDebInfo/bpf_conformance_runner
[path]ubpf/build/external/bpf_conformance/bin/Deployment/bpf_conformance_runner
[path]ubpf/build/external/bpf_conformance/bin/Deployment/bpf_conformance_runner
[path]ubpf/build/external/bpf_conformance/bin/Development/bpf_conformance_runner
[path]ubpf/build/external/bpf_conformance/bin/Development/bpf_conformance_runner
[path]ubpf/build/external/bpf_conformance/bin/bpf_conformance_runner
[path]ubpf/build/external/bpf_conformance/bin/bpf_conformance_runner
[path]ubpf/build/external/bpf_conformance/bin/Release/bpf_conformance_runner
[path]ubpf/build/external/bpf_conformance/bin/Release/bpf_conformance_runner
[path]ubpf/build/external/bpf_conformance/bin/Debug/bpf_conformance_runner
	634 - [path]ubpf/tests/st.data-Interpreter (Not Run)
	635 - [path]ubpf/tests/stack.data-JIT (Not Run)
	636 - [path]ubpf/tests/stack.data-Interpreter (Not Run)
	637 - [path]ubpf/tests/stack2.data-JIT (Not Run)
	638 - [path]ubpf/tests/stack2.data-Interpreter (Not Run)
	639 - [path]ubpf/tests/stack3.data-JIT (Not Run)
	640 - [path]ubpf/tests/stack3.data-Interpreter (Not Run)
	641 - [path]ubpf/tests/stb.data-JIT (Not Run)
	642 - [path]ubpf/tests/stb.data-Interpreter (Not Run)
	643 - [path]ubpf/tests/stdw.data-JIT (Not Run)
	644 - [path]ubpf/tests/stdw.data-Interpreter (Not Run)
	645 - [path]ubpf/tests/sth.data-JIT (Not Run)
	646 - [path]ubpf/tests/sth.data-Interpreter (Not Run)
	647 - [path]ubpf/tests/string-stack.data-JIT (Not Run)
	648 - [path]ubpf/tests/string-stack.data-Interpreter (Not Run)
	649 - [path]ubpf/tests/stw.data-JIT (Not Run)
	650 - [path]ubpf/tests/stw.data-Interpreter (Not Run)
	651 - [path]ubpf/tests/stx.data-JIT (Not Run)
	652 - [path]ubpf/tests/stx.data-Interpreter (Not Run)
	653 - [path]ubpf/tests/stxb-all.data-JIT (Not Run)
	654 - [path]ubpf/tests/stxb-all.data-Interpreter (Not Run)
	655 - [path]ubpf/tests/stxb-all2.data-JIT (Not Run)
	656 - [path]ubpf/tests/stxb-all2.data-Interpreter (Not Run)
	657 - [path]ubpf/tests/stxb-chain.data-JIT (Not Run)
	658 - [path]ubpf/tests/stxb-chain.data-Interpreter (Not Run)
	659 - [path]ubpf/tests/stxb.data-JIT (Not Run)
	660 - [path]ubpf/tests/stxb.data-Interpreter (Not Run)
	661 - [path]ubpf/tests/stxdw.data-JIT (Not Run)
	662 - [path]ubpf/tests/stxdw.data-Interpreter (Not Run)
	663 - [path]ubpf/tests/stxh.data-JIT (Not Run)
	664 - [path]ubpf/tests/stxh.data-Interpreter (Not Run)
	665 - [path]ubpf/tests/stxw.data-JIT (Not Run)
	666 - [path]ubpf/tests/stxw.data-Interpreter (Not Run)
	667 - [path]ubpf/tests/subnet.data-JIT (Not Run)
	668 - [path]ubpf/tests/subnet.data-Interpreter (Not Run)
	669 - [path]ubpf/tests/unload_reload.data-JIT (Not Run)
	670 - [path]ubpf/tests/unload_reload.data-Interpreter (Not Run)
	671 - map_TEST_INTERPRET (Not Run)
	672 - map_TEST_JIT (Not Run)
	673 - rel_64_32_TEST_INTERPRET (Not Run)
	674 - rel_64_32_TEST_JIT (Not Run)
Errors while running CTest
Output from these tests are in: [path]ubpf/build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.
make: *** [test] Error 8
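
A note on reading a log like this one: CTest reports a test as "Not Run" when it could not launch the test command at all (for example, because the executable is missing), which is different from "Failed", where the command ran but did not meet its pass criteria. The sketch below only counts the two statuses in CTest summary text; the helper names `count_not_run` and `count_failed` are made up for illustration, and the sample mixes summary lines from two different logs in this thread.

```shell
# Count tests that CTest could not launch at all (reads summary text on stdin).
count_not_run() { grep -c '(Not Run)'; }

# Count tests that ran but did not meet their pass criteria.
count_failed() { grep -c '(Failed)'; }

# Sample summary lines copied from this thread (mixed for illustration only).
sample='	673 - rel_64_32_TEST_INTERPRET (Not Run)
	674 - rel_64_32_TEST_JIT (Not Run)
	673 - rel_64_32_TEST_INTERPRET (Failed)'

echo "not run: $(printf '%s\n' "$sample" | count_not_run)"
echo "failed:  $(printf '%s\n' "$sample" | count_failed)"
```

A long run of "Not Run" entries usually points at the build step rather than at the tests themselves. For the verbose output of individual cases, the log's own hint applies: run ctest with "--rerun-failed --output-on-failure" from the build directory.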

@viniciusd

Interesting. FYI, the output of my cmake -S . is the same as yours, but my test target works just fine, i.e., all tests run.

Is your brew x86 or arm? Maybe you wanna do -DUBPF_ALTERNATE_LLVM_PATH=$(brew --prefix llvm)/bin to make sure it is picking up the alternate llvm installed. Brew on x86 and brew on ARM use different paths.
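
A quick way to answer the x86-vs-arm question is to look at the brew prefix. The sketch below assumes Homebrew's default prefixes (/opt/homebrew on Apple Silicon, /usr/local on Intel or under Rosetta); the helper name `brew_arch_for_prefix` is made up for illustration, and a custom prefix is reported as "unknown".

```shell
# Map a Homebrew prefix to the architecture it normally serves.
# Assumes default install locations; anything else is "unknown".
brew_arch_for_prefix() {
    case "$1" in
        /opt/homebrew*) echo "arm64" ;;
        /usr/local*)    echo "x86_64" ;;
        *)              echo "unknown" ;;
    esac
}

# Only query brew when it is actually installed.
if command -v brew >/dev/null 2>&1; then
    prefix="$(brew --prefix)"
    echo "brew prefix:  $prefix ($(brew_arch_for_prefix "$prefix"))"
    echo "shell arch:   $(uname -m)"
    echo "llvm bin dir: $(brew --prefix llvm)/bin"
fi
```

If the prefix and the shell architecture disagree, the shell is probably running under Rosetta or picking up the other brew installation.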

@hadisinaee
Author

> Interesting. FYI, the output of my cmake -S . is the same as yours, but my test target works just fine, i.e., all tests run.

That's interesting!

> Is your brew x86 or arm?

It is arm64. If I'm not mistaken, all M series are arm-based.

> Maybe you wanna do -DUBPF_ALTERNATE_LLVM_PATH=$(brew --prefix llvm)/bin to make sure it is picking up the alternate llvm installed.

I get the same output as before, and the tests still do not pass. $(brew --prefix llvm)/bin gives the same llvm path I originally passed to cmake.

@viniciusd

> > Is your brew x86 or arm?
>
> It is arm64. If I'm not mistaken, all M series are arm-based.

Yep, but the M series supports x86 emulation through Rosetta. If you had used a Time Machine backup to migrate from an Intel chip to an M chip, you would have ended up with an x86 brew. Been there, done that.

I have just set up a dual-architecture brew, i.e., both an x86 and an arm brew, so I can switch between them when compiling the bpf projects.

Can you try compiling a bpf program with clang to make sure that works?

$(brew --prefix llvm)/bin/clang -O2 -target bpf -c ./bpf/rel_64_32.bpf.c -o hello.o

This ./bpf/rel_64_32.bpf.c file is part of the ubpf repository. Does it work?
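
As a complementary check, one can ask clang which targets it was built with. The sketch below assumes a clang recent enough to accept `--print-targets`; `has_bpf_target` and the `CLANG` variable are hypothetical names introduced here, with `CLANG` meant to be pointed at the Homebrew binary, e.g. CLANG="$(brew --prefix llvm)/bin/clang".

```shell
# Succeeds if a target list on stdin contains a plain "bpf" entry.
has_bpf_target() {
    grep -Eq '^[[:space:]]*bpf[[:space:]]'
}

# Hypothetical override; defaults to whichever clang is first on PATH.
CLANG="${CLANG:-clang}"

if command -v "$CLANG" >/dev/null 2>&1; then
    if "$CLANG" --print-targets 2>/dev/null | has_bpf_target; then
        echo "$CLANG lists a bpf target"
    else
        echo "$CLANG does not list a bpf target"
    fi
fi
```

If the default clang on PATH reports no bpf target while the Homebrew one does, that would explain why the alternate LLVM path has to be passed to cmake.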

@hadisinaee
Author

> Yep, but the M series supports x86 emulation through Rosetta. If you had used a Time Machine backup to migrate from an Intel chip to an M chip, you would have ended up with an x86 brew. Been there, done that.

Aha, I see! No, I didn't migrate anything from Time Machine.

> I have just set up a dual-architecture brew, i.e., both an x86 and an arm brew, so I can switch between them when compiling the bpf projects.

Thank you!

> This ./bpf/rel_64_32.bpf.c file is part of the ubpf repository. Does it work?

Yes, it did! It compiled the rel_64_32.bpf.c program without any errors.

@viniciusd

I just made a mistake that might be the same one you made.

I was going over the ubpf compilation again, and I ran:

cmake -S . -B build -DUBPF_ENABLE_TESTS=true -DUBPF_ALTERNATE_LLVM_PATH=$(brew --prefix llvm)/bin
cmake --build build --target test --

And I got output very similar to yours, where the tests didn't run. The mistake is that I forgot to actually compile everything before running the tests:

cmake --build build --config Debug

This should be done before trying to run the tests, as they depend on the project being compiled first.

So the order should be:

cmake -S . -B build -DUBPF_ENABLE_TESTS=true -DUBPF_ALTERNATE_LLVM_PATH=$(brew --prefix llvm)/bin
cmake --build build --config Debug
cmake --build build --target test --
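
The same order can be wrapped in a small script that refuses to run the tests when the build step did not produce the conformance runner. This is only a sketch: `runner_built`, `configure_build_test`, and the `LLVM_BIN` variable are names made up here, and the runner path is the first location CTest reported searching earlier in this thread (multi-config generators may place it in a subdirectory such as bin/Debug instead).

```shell
# Returns 0 if the conformance runner exists under the given build directory.
runner_built() {
    [ -x "$1/external/bpf_conformance/bin/bpf_conformance_runner" ]
}

# Configure, build, then test, stopping at the first step that fails.
# LLVM_BIN is a hypothetical variable holding the alternate LLVM bin directory.
configure_build_test() {
    local build_dir="${1:-build}"
    cmake -S . -B "$build_dir" -DUBPF_ENABLE_TESTS=true \
        -DUBPF_ALTERNATE_LLVM_PATH="$LLVM_BIN" || return 1
    cmake --build "$build_dir" --config Debug || return 1
    if ! runner_built "$build_dir"; then
        echo "bpf_conformance_runner not found; was the project built?" >&2
        return 1
    fi
    cmake --build "$build_dir" --target test --
}
```

From the repository root it could be invoked as: LLVM_BIN="$(brew --prefix llvm)/bin" configure_build_test build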

@hadisinaee
Author

Ah, yes! That was exactly the mistake I made! I built with the Debug config first, and now all the tests pass except the last two. The output of my tests is the following:

673/674 Test #673: rel_64_32_TEST_INTERPRET ..........................***Failed  Required regular expression not found. Regex=[0xe[
]*[
]*$
]  0.00 sec
        Start 674: rel_64_32_TEST_JIT
674/674 Test #674: rel_64_32_TEST_JIT ..........................***Failed  Required regular expression not found. Regex=[0xe[
]*[
]*$
]  0.00 sec

99% tests passed, 2 tests failed out of 674

Total Test time (real) =   7.99 sec

The following tests FAILED:
	673 - rel_64_32_TEST_INTERPRET (Failed)
	674 - rel_64_32_TEST_JIT (Failed)

The tests output log for these two tests contains the following info:

673/674 Testing: rel_64_32_TEST_INTERPRET
673/674 Test: rel_64_32_TEST_INTERPRET
Command: "/path/to/project/ubpf/build/bin/ubpf_test" "--main-function" "main" "/path/to/project/ubpf/build/bpf/rel_64_32.bpf.o"
Directory: /path/to/project/ubpf/build/bpf
"rel_64_32_TEST_INTERPRET" start time: Mar 20 19:14 PDT
Output:
----------------------------------------------------------
uBPF error: number of nested functions calls (11) exceeds max (10) at PC 3
Warning: bad relocation type 10; skipping.
Warning: bad relocation type 10; skipping.
Warning: bad relocation type 10; skipping.
0xffffffffffffffff
<end of output>
Test time =   0.00 sec
----------------------------------------------------------
Test Fail Reason:
Required regular expression not found. Regex=[0xe[
]*[
]*$
]
"rel_64_32_TEST_INTERPRET" end time: Mar 20 19:14 PDT
"rel_64_32_TEST_INTERPRET" time elapsed: 00:00:00
----------------------------------------------------------

674/674 Testing: rel_64_32_TEST_JIT
674/674 Test: rel_64_32_TEST_JIT
Command: "/path/to/project/ubpf/build/bin/ubpf_test" "--main-function" "main" "/path/to/project/ubpf/build/bpf/rel_64_32.bpf.o"
Directory: /path/to/project/ubpf/build/bpf
"rel_64_32_TEST_JIT" start time: Mar 20 19:14 PDT
Output:
----------------------------------------------------------
uBPF error: number of nested functions calls (11) exceeds max (10) at PC 3
Warning: bad relocation type 10; skipping.
Warning: bad relocation type 10; skipping.
Warning: bad relocation type 10; skipping.
0xffffffffffffffff
<end of output>
Test time =   0.00 sec
----------------------------------------------------------
Test Fail Reason:
Required regular expression not found. Regex=[0xe[
]*[
]*$
]
"rel_64_32_TEST_JIT" end time: Mar 20 19:14 PDT
"rel_64_32_TEST_JIT" time elapsed: 00:00:00
----------------------------------------------------------

Have you run into this before, by any chance?

@viniciusd

Yeah, the test target runs the tests, but it assumes the binaries have already been built.

tl;dr

I am getting the same failures as you on the last two tests; I am ignoring them for now.

Longer thoughts

I have three guesses as to why:

  • either it is some architectural difference that ends up producing a different result
  • or some other ubpf bug. I find it particularly odd that it reaches the maximum number of nested calls: if you look at rel_64_32.bpf.c, it has no reason to make more than 10 nested calls. Then again, this could be a problem with the compiler; manually inspecting the bpf bytecode might tell. I tried writing a simple main function in C that returns an integer and that worked just fine, so it isn't as if toolchain+ubpf is completely broken on Mac
  • or maybe this is the expected result anyway and the problem is just that the regex engine behaves differently on Mac than on the Linux distributions used. Maybe, and only maybe, it is just a matter of tweaking the regex so it works on both systems.
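
For anyone who wants to follow up on inspecting the bytecode, a possible starting point is sketched below. It assumes llvm-objdump from the Homebrew LLVM is on PATH (it is keg-only, so it may need $(brew --prefix llvm)/bin prepended) and uses the object path shown in the test log above. The helper `bpf_reloc_name` is made up for illustration; its table follows LLVM's list of BPF ELF relocation types, in which type 10 appears to be R_BPF_64_32, matching both the "bad relocation type 10" warnings and the test's name.

```shell
# Translate a numeric BPF ELF relocation type into LLVM's name for it.
# Values follow LLVM's BPF relocation list; anything else is "unknown".
bpf_reloc_name() {
    case "$1" in
        0)  echo "R_BPF_NONE" ;;
        1)  echo "R_BPF_64_64" ;;
        2)  echo "R_BPF_64_ABS64" ;;
        3)  echo "R_BPF_64_ABS32" ;;
        4)  echo "R_BPF_64_NODYLD32" ;;
        10) echo "R_BPF_64_32" ;;
        *)  echo "unknown" ;;
    esac
}

echo "relocation type 10 is $(bpf_reloc_name 10)"

# Object path as it appears in the test log (relative to the repository root).
OBJ="${OBJ:-build/bpf/rel_64_32.bpf.o}"

# Disassemble with relocations interleaved, if the tool and the file exist.
if command -v llvm-objdump >/dev/null 2>&1 && [ -f "$OBJ" ]; then
    llvm-objdump -d -r "$OBJ"
fi
```

If the disassembly shows call instructions annotated with R_BPF_64_32, the skipped relocations would be a more likely culprit than the regex.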

Testing it on a Linux machine (preferably x86, but potentially arm64 as well?) would be the ideal way to decide which path to follow if one wanted to debug it further. Would a Linux machine get the same 0xffffff result (is it an integer overflow?)? Maybe more importantly, would a Linux machine also exceed the limit of nested function calls?

I tried recompiling ubpf on my Mac with a higher limit on nested function calls, raising it into the hundreds, and it still hit the limit. So maybe there is an infinite loop somewhere?
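
One unverified hypothesis that would fit both the "bad relocation type 10; skipping" warnings and a limit that is hit no matter how high it is set: a BPF-to-BPF call jumps to the instruction after the call plus its immediate, and the assumption here is that clang leaves that immediate at -1 until the relocation is applied. If the loader skips the relocation, the call at PC 3 would then target PC 3 again. The arithmetic, with a made-up helper name `call_target`:

```shell
# Target of a BPF-to-BPF call: the instruction after the call, plus imm.
call_target() {
    local pc="$1" imm="$2"
    echo $(( pc + 1 + imm ))
}

# Assumption: an unrelocated call carries imm = -1, so it targets itself.
echo "call at pc=3, imm=-1 -> pc $(call_target 3 -1)"
```

This would need to be confirmed against the disassembly of rel_64_32.bpf.o and the loader's relocation handling before drawing any conclusion.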

I can't commit to investigating it in the short term, though; feel free to do so, or just ignore them for now as well.

@hadisinaee
Author

I see! That makes sense!

Thank you for your help! On Linux with x86 architecture, I don't have any issues building and passing these tests. But I'll go and see if I can fix this issue on Mac.
