Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable unaligned memory access on big endian PowerPC #1690

Open
wants to merge 2 commits into
base: develop
Choose a base branch
from

Conversation

ccawley2011
Copy link
Contributor

See also PR #1469. This will need testing on actual hardware, but it should work fine.

zbuild.h Outdated Show resolved Hide resolved
Copy link

codecov bot commented Feb 21, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 32.59%. Comparing base (ca0e463) to head (2a72924).
Report is 16 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff              @@
##           develop    #1690       +/-   ##
============================================
- Coverage    83.01%   32.59%   -50.43%     
============================================
  Files          135       66       -69     
  Lines        10365     5513     -4852     
  Branches      2815     1247     -1568     
============================================
- Hits          8605     1797     -6808     
- Misses        1067     3447     +2380     
+ Partials       693      269      -424     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mtl1979
Copy link
Collaborator

mtl1979 commented Feb 22, 2024

I've tested unaligned access "emulation" in qemu and it doesn't cause exception like on real hardware, but it will cause data corruption which our tests should detect.

As far as I know, unaligned memory access in some PowerPC processors is handled transparently in kernel (similar to handling in SBI "firmware" for RISC-V), so user applications and libraries don't even see the exception except as performance loss.

@ccawley2011
Copy link
Contributor Author

I've tested unaligned access "emulation" in qemu and it doesn't cause exception like on real hardware, but it will cause data corruption which our tests should detect.

As far as I know, unaligned memory access in some PowerPC processors is handled transparently in kernel (similar to handling in SBI "firmware" for RISC-V), so user applications and libraries don't even see the exception except as performance loss.

In that case, should I limit these changes to 64-bit PowerPC for now, since unaligned access is already enabled there in little endian builds?

@Dead2 Dead2 added optimization Needs testing Please help test this labels Feb 24, 2024
@mtl1979
Copy link
Collaborator

mtl1979 commented Feb 24, 2024

I've tested unaligned access "emulation" in qemu and it doesn't cause exception like on real hardware, but it will cause data corruption which our tests should detect.
As far as I know, unaligned memory access in some PowerPC processors is handled transparently in kernel (similar to handling in SBI "firmware" for RISC-V), so user applications and libraries don't even see the exception except as performance loss.

In that case, should I limit these changes to 64-bit PowerPC for now, since unaligned access is already enabled there in little endian builds?

I don't have 64-bit PowerPC hardware to test, so I can't say what is the best solution... We should benchmark if there is any speed gain in enabling unaligned access, as usually switching between user and kernel code has small performance penalty.

@KungFuJesus
Copy link
Contributor

I have a g5, I can try it out tomorrow.

@KungFuJesus
Copy link
Contributor

KungFuJesus commented Feb 25, 2024

Code doesn't currently compile - it doesn't seem to like the ALIGNED macro:

In file included from /home/adam/zlib-ng-unaligned/tools/makefixed.c:5:
/home/adam/zlib-ng-unaligned/inflate.h:106:34: error: expected declaration specifiers or ‘...’ before numeric constant
  106 |     struct crc32_fold_s ALIGNED_(16) crc_fold;
      |                                  ^~
make[2]: *** [test/CMakeFiles/makefixed.dir/build.make:76: test/CMakeFiles/makefixed.dir/__/tools/makefixed.c.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:300: test/CMakeFiles/makefixed.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
In file included from /home/adam/zlib-ng-unaligned/functable.h:9,
                 from /home/adam/zlib-ng-unaligned/arch/generic/adler32_c.c:7:
/home/adam/zlib-ng-unaligned/deflate.h:120:17: error: expected identifier or ‘(’ before numeric constant
  120 | struct ALIGNED_(16) internal_state {
      |                 ^~

I think the solution is to add zbuild.h to the inflate and deflate headers, though I'm a little bit confused why this struct isn't throwing off other architectures.

Nope, that doesn't fix it either. What's even weirder is this macro is used explicitly in the VMX code. Your diff from the develop branch is just messing with the unaligned_ok macro, what on earth is causing this?

@KungFuJesus
Copy link
Contributor

KungFuJesus commented Feb 25, 2024

Is CMake just suddenly failing the test for whether or not that compiler builtin exists and is now not defining this macro? That has to be it, somehow.

Wow, yeah, somehow that's it. Adding -DHAVE_ATTRIBUTE_ALIGNED to the cflags manually allows it to compile. That's really weird, the cmakelist is somehow bypassing this test.

@KungFuJesus
Copy link
Contributor

Anyhow, assuming that cmake issue gets fixed:

adam@g5box ~/zlib-ng-unaligned/build $ make test
Running tests...
Test project /home/adam/zlib-ng-unaligned/build
      Start  1: example
 1/70 Test  #1: example ..........................   Passed    0.01 sec
      Start  2: infcover
 2/70 Test  #2: infcover .........................   Passed    0.01 sec
      Start  3: CVE-2002-0059
 3/70 Test  #3: CVE-2002-0059 ....................   Passed    0.03 sec
      Start  4: CVE-2004-0797
 4/70 Test  #4: CVE-2004-0797 ....................   Passed    0.02 sec
      Start  5: CVE-2005-1849
 5/70 Test  #5: CVE-2005-1849 ....................   Passed    0.02 sec
      Start  6: CVE-2005-2096
 6/70 Test  #6: CVE-2005-2096 ....................   Passed    0.03 sec
      Start  7: CVE-2018-25032-fixed-level-6
 7/70 Test  #7: CVE-2018-25032-fixed-level-6 .....   Passed    0.10 sec
      Start  8: CVE-2018-25032-default-level-6
 8/70 Test  #8: CVE-2018-25032-default-level-6 ...   Passed    0.10 sec
      Start  9: CVE-2018-25032-fixed-level-1
 9/70 Test  #9: CVE-2018-25032-fixed-level-1 .....   Passed    0.10 sec
      Start 10: CVE-2018-25032-default-level-1
10/70 Test #10: CVE-2018-25032-default-level-1 ...   Passed    0.10 sec
      Start 11: CVE-2018-25032-fixed-level-2
11/70 Test #11: CVE-2018-25032-fixed-level-2 .....   Passed    0.10 sec
      Start 12: CVE-2018-25032-default-level-2
12/70 Test #12: CVE-2018-25032-default-level-2 ...   Passed    0.10 sec
      Start 13: minigzip-fireworks.jpg-R
13/70 Test #13: minigzip-fireworks.jpg-R .........   Passed    0.22 sec
      Start 14: minigzip-fireworks.jpg-h
14/70 Test #14: minigzip-fireworks.jpg-h .........   Passed    0.22 sec
      Start 15: minigzip-fireworks.jpg-T
15/70 Test #15: minigzip-fireworks.jpg-T .........   Passed    0.10 sec
      Start 16: minigzip-fireworks.jpg-0
16/70 Test #16: minigzip-fireworks.jpg-0 .........   Passed    0.22 sec
      Start 17: minigzip-fireworks.jpg-1
17/70 Test #17: minigzip-fireworks.jpg-1 .........   Passed    0.23 sec
      Start 18: minigzip-fireworks.jpg-2
18/70 Test #18: minigzip-fireworks.jpg-2 .........   Passed    0.23 sec
      Start 19: minigzip-fireworks.jpg-4
19/70 Test #19: minigzip-fireworks.jpg-4 .........   Passed    0.23 sec
      Start 20: minigzip-fireworks.jpg-5
20/70 Test #20: minigzip-fireworks.jpg-5 .........   Passed    0.23 sec
      Start 21: minigzip-fireworks.jpg-F
21/70 Test #21: minigzip-fireworks.jpg-F .........   Passed    0.23 sec
      Start 22: minigzip-fireworks.jpg-6
22/70 Test #22: minigzip-fireworks.jpg-6 .........   Passed    0.23 sec
      Start 23: minigzip-fireworks.jpg-9
23/70 Test #23: minigzip-fireworks.jpg-9 .........   Passed    0.23 sec
      Start 24: minigzip-fireworks.jpg-f
24/70 Test #24: minigzip-fireworks.jpg-f .........   Passed    0.23 sec
      Start 25: minigzip-lcet10.txt-R
25/70 Test #25: minigzip-lcet10.txt-R ............   Passed    0.29 sec
      Start 26: minigzip-lcet10.txt-h
26/70 Test #26: minigzip-lcet10.txt-h ............   Passed    0.28 sec
      Start 27: minigzip-lcet10.txt-T
27/70 Test #27: minigzip-lcet10.txt-T ............   Passed    0.10 sec
      Start 28: minigzip-lcet10.txt-0
28/70 Test #28: minigzip-lcet10.txt-0 ............   Passed    0.27 sec
      Start 29: minigzip-lcet10.txt-1
29/70 Test #29: minigzip-lcet10.txt-1 ............   Passed    0.28 sec
      Start 30: minigzip-lcet10.txt-2
30/70 Test #30: minigzip-lcet10.txt-2 ............   Passed    0.29 sec
      Start 31: minigzip-lcet10.txt-4
31/70 Test #31: minigzip-lcet10.txt-4 ............   Passed    0.29 sec
      Start 32: minigzip-lcet10.txt-5
32/70 Test #32: minigzip-lcet10.txt-5 ............   Passed    0.29 sec
      Start 33: minigzip-lcet10.txt-F
33/70 Test #33: minigzip-lcet10.txt-F ............   Passed    0.30 sec
      Start 34: minigzip-lcet10.txt-6
34/70 Test #34: minigzip-lcet10.txt-6 ............   Passed    0.30 sec
      Start 35: minigzip-lcet10.txt-9
35/70 Test #35: minigzip-lcet10.txt-9 ............   Passed    0.32 sec
      Start 36: minigzip-lcet10.txt-f
36/70 Test #36: minigzip-lcet10.txt-f ............   Passed    0.29 sec
      Start 37: minigzip-paper-100k.pdf-R
37/70 Test #37: minigzip-paper-100k.pdf-R ........   Passed    0.22 sec
      Start 38: minigzip-paper-100k.pdf-h
38/70 Test #38: minigzip-paper-100k.pdf-h ........   Passed    0.22 sec
      Start 39: minigzip-paper-100k.pdf-T
39/70 Test #39: minigzip-paper-100k.pdf-T ........   Passed    0.09 sec
      Start 40: minigzip-paper-100k.pdf-0
40/70 Test #40: minigzip-paper-100k.pdf-0 ........   Passed    0.22 sec
      Start 41: minigzip-paper-100k.pdf-1
41/70 Test #41: minigzip-paper-100k.pdf-1 ........   Passed    0.22 sec
      Start 42: minigzip-paper-100k.pdf-2
42/70 Test #42: minigzip-paper-100k.pdf-2 ........   Passed    0.22 sec
      Start 43: minigzip-paper-100k.pdf-4
43/70 Test #43: minigzip-paper-100k.pdf-4 ........   Passed    0.22 sec
      Start 44: minigzip-paper-100k.pdf-5
44/70 Test #44: minigzip-paper-100k.pdf-5 ........   Passed    0.22 sec
      Start 45: minigzip-paper-100k.pdf-F
45/70 Test #45: minigzip-paper-100k.pdf-F ........   Passed    0.22 sec
      Start 46: minigzip-paper-100k.pdf-6
46/70 Test #46: minigzip-paper-100k.pdf-6 ........   Passed    0.22 sec
      Start 47: minigzip-paper-100k.pdf-9
47/70 Test #47: minigzip-paper-100k.pdf-9 ........   Passed    0.22 sec
      Start 48: minigzip-paper-100k.pdf-f
48/70 Test #48: minigzip-paper-100k.pdf-f ........   Passed    0.22 sec
      Start 49: minigzip-detect-text-A
49/70 Test #49: minigzip-detect-text-A ...........   Passed    0.30 sec
      Start 50: minigzip-detect-binary-A
50/70 Test #50: minigzip-detect-binary-A .........   Passed    0.22 sec
      Start 51: GH-361
51/70 Test #51: GH-361 ...........................   Passed    0.21 sec
      Start 52: GH-364
52/70 Test #52: GH-364 ...........................   Passed    0.21 sec
      Start 53: GH-382
53/70 Test #53: GH-382 ...........................   Passed    0.09 sec
      Start 54: GH-536-segfault
54/70 Test #54: GH-536-segfault ..................   Passed    0.08 sec
      Start 55: GH-536-incomplete-read
55/70 Test #55: GH-536-incomplete-read ...........   Passed    0.09 sec
      Start 56: GH-536-zero-stored-block
56/70 Test #56: GH-536-zero-stored-block .........   Passed    0.08 sec
      Start 57: GH-751
57/70 Test #57: GH-751 ...........................   Passed    0.21 sec
      Start 58: GH-1600-no-window-check
58/70 Test #58: GH-1600-no-window-check ..........   Passed    0.03 sec
      Start 59: GH-1600-no-window-no-check
59/70 Test #59: GH-1600-no-window-no-check .......   Passed    0.03 sec
      Start 60: minigzip-file_compress
60/70 Test #60: minigzip-file_compress ...........   Passed    0.16 sec
      Start 61: minideflate-file_compress
61/70 Test #61: minideflate-file_compress ........   Passed    0.06 sec
      Start 62: minigzip-help
62/70 Test #62: minigzip-help ....................   Passed    0.02 sec
      Start 63: minigzip-invalid
63/70 Test #63: minigzip-invalid .................   Passed    0.02 sec
      Start 64: minideflate-help
64/70 Test #64: minideflate-help .................   Passed    0.02 sec
      Start 65: minideflate-invalid
65/70 Test #65: minideflate-invalid ..............   Passed    0.02 sec
      Start 66: switchlevels-help
66/70 Test #66: switchlevels-help ................   Passed    0.02 sec
      Start 67: makecrct
67/70 Test #67: makecrct .........................   Passed    0.15 sec
      Start 68: makefixed
68/70 Test #68: makefixed ........................   Passed    0.07 sec
      Start 69: maketrees
69/70 Test #69: maketrees ........................   Passed    0.07 sec
      Start 70: gtest_zlib
70/70 Test #70: gtest_zlib .......................   Passed    1.04 sec

@mtl1979
Copy link
Collaborator

mtl1979 commented Feb 26, 2024

Is CMake just suddenly failing the test for whether or not that compiler builtin exists and is now not defining this macro? That has to be it, somehow.

Wow, yeah, somehow that's it. Adding -DHAVE_ATTRIBUTE_ALIGNED to the cflags manually allows it to compile. That's really weird, the cmakelist is somehow bypassing this test.

You can remove the line starting with HAVE_ATTRIBUTE_ALIGNED:INTERNAL= from CMakeCache.txt in the build directory to force rerunning the test... Then verify from the CMake's build logs if the test passes or not. Simple grep -R "HAVE_ATTRIBUTE_ALIGNED" . will tell if that variable is mentioned anywhere in the build tree in addition to the cache file.

@KungFuJesus
Copy link
Contributor

./CMakeFiles/CMakeConfigureLog.yaml:      - "Performing Test HAVE_ATTRIBUTE_ALIGNED"
./CMakeFiles/CMakeConfigureLog.yaml:      variable: "HAVE_ATTRIBUTE_ALIGNED"
./CMakeFiles/CMakeConfigureLog.yaml:        /usr/bin/cc -DHAVE_ATTRIBUTE_ALIGNED  -std=c11 -o CMakeFiles/cmTC_e2feb.dir/src.c.o -c /home/adam/zlib-ng-unaligned/build/CMakeFiles/CMakeScratch/TryCompile-lRaiIX/src.c
./CMakeCache.txt://Test HAVE_ATTRIBUTE_ALIGNED
./CMakeCache.txt:HAVE_ATTRIBUTE_ALIGNED:INTERNAL=

@KungFuJesus
Copy link
Contributor

Something is up with this failing this cmake test, I'm not really sure why it's suddenly failing this test.

@mtl1979
Copy link
Collaborator

mtl1979 commented Mar 15, 2024

./CMakeFiles/CMakeConfigureLog.yaml

You should check all the relevant lines in ./CMakeFiles/CMakeConfigureLog.yaml ... It should give more details why the test fails.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs testing Please help test this optimization
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants