Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimize memory layout #39

Draft
wants to merge 9 commits into
base: dev
Choose a base branch
from

Conversation

michaelsippel
Copy link
Member

This PR implements some memory-layout optimizations which in turn will hopefully improve cache utilization.
The majority of bytes saved come from a custom implementation of reference-counting, where a smart-pointer reference only allocates a single pointer (8byte) instead of the fat 16-byte std::shared_ptr.
Additionally the members of some structs have been reordered to reduce space wasted by alignment.
Another place where some bytes are gained is to implement null-types directly instead of using std::optional (e.g. for EventPtr).

Minimal Task-Config (only Resource-, Graph- & IdProperty)

Type size old size optimized
Task 472 224
TaskProperties 336 208

Extended Task-Config (+ LabelProperty, SchedulingTag)

Type size old size optimized
Task 520 272
TaskProperties 384 264
Type size old size optimized
IDProperty 4 4
ResourceProperty 88 56
GraphProperty 240 152
Event 48 32
EventPtr 32 32
ChunkedList<uint8_t> 48 32
ChunkedList<uint8_t>::Item 6 6
AtomicList<uint8_t> 40 32
AtomicList<uint8_t>::ItemLink 32 40

@michaelsippel michaelsippel changed the title Topic optimize memory optimize memory layout Dec 7, 2023
@michaelsippel michaelsippel marked this pull request as draft December 8, 2023 08:48
@michaelsippel
Copy link
Member Author

There is still some bug that leaks memory at one place that we need to fix in this PR.

But a some quick preliminary benchmarks reveal that we might be on the right track here and we might be indeed limited by memory latency in the case of 64x64 tiles.

image
image
image

@michaelsippel michaelsippel force-pushed the topic-optimize-memory branch 7 times, most recently from f44d778 to e9a6c17 Compare December 13, 2023 23:51
@michaelsippel
Copy link
Member Author

Rebased this branch and with the latest patches for ChunkedList, the memory-leaks previously observed on this branch are gone now.

@michaelsippel michaelsippel force-pushed the topic-optimize-memory branch 2 times, most recently from c93e2f5 to 9aec140 Compare December 14, 2023 09:11
@psychocoderHPC
Copy link
Member

After this PR is in the mainline I will add a clang format file and auto-format the code.

@michaelsippel
Copy link
Member Author

michaelsippel commented Dec 14, 2023

Rebased again, and I'm not able to break it so far, all tests look fine.

Only concern I have now is the use of offsetof in

return (Task*) ( (uintptr_t)this - offsetof(Task, space) );
where I get the warning that this is 'conditionally supported' , but it seems to work fine.
Ideally we would need offsetof for base-classes.

@psychocoderHPC
Copy link
Member

This PR produces many address sanitizer issues.

=================================================================
1: ==46845==ERROR: LeakSanitizer: detected memory leaks
1: 
1: Direct leak of 360 byte(s) in 3 object(s) allocated from:
1:     #0 0x7fbdffab61e7 in operator new(unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:99
1:     #1 0x559086696289 in redGrapes::Context::init(unsigned long, std::shared_ptr<redGrapes::scheduler::IScheduler>) /home/widera/workspace/redGrapes/redGrapes/redGrapes.cpp:132
1:     #2 0x55908669647f in redGrapes::Context::init(unsigned long) /home/widera/workspace/redGrapes/redGrapes/redGrapes.cpp:140
1:     #3 0x5590865d49d3 in redGrapes::init(unsigned long) /home/widera/workspace/redGrapes/redGrapes/redGrapes.hpp:122
1:     #4 0x559086669d59 in test_worker_utilization(unsigned int) /home/widera/workspace/redGrapes/test/scheduler.cpp:20
1:     #5 0x55908666a61b in CATCH2_INTERNAL_TEST_0 /home/widera/workspace/redGrapes/test/scheduler.cpp:52
1:     #6 0x5590867a04da in Catch::TestInvokerAsFunction::invoke() const src/catch2/internal/catch_test_case_registry_impl.cpp:149
1:     #7 0x559086782f9f in Catch::TestCaseHandle::invoke() const (/home/widera/workspace/buildRedGrapes/redGrapes_test+0x22bf9f)
1:     #8 0x559086780717 in Catch::RunContext::invokeActiveTestCase() src/catch2/internal/catch_run_context.cpp:538
1:     #9 0x55908677fed0 in Catch::RunContext::runCurrentTest(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) src/catch2/internal/catch_run_context.cpp:501
1:     #10 0x55908677b7b0 in Catch::RunContext::runTest(Catch::TestCaseHandle const&) src/catch2/internal/catch_run_context.cpp:232
1:     #11 0x5590866befbe in execute src/catch2/catch_session.cpp:110
1:     #12 0x5590866c1f40 in Catch::Session::runInternal() src/catch2/catch_session.cpp:332
1:     #13 0x5590866c1446 in Catch::Session::run() src/catch2/catch_session.cpp:263
1:     #14 0x5590866bcfab in int Catch::Session::run<char>(int, char const* const*) src/catch2/../catch2/catch_session.hpp:41
1:     #15 0x5590866bcd66 in main src/catch2/internal/catch_main.cpp:36
1:     #16 0x7fbdff229d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
1: 
1: Direct leak of 120 byte(s) in 1 object(s) allocated from:
1:     #0 0x7fbdffab61e7 in operator new(unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:99
1:     #1 0x559086696289 in redGrapes::Context::init(unsigned long, std::shared_ptr<redGrapes::scheduler::IScheduler>) /home/widera/workspace/redGrapes/redGrapes/redGrapes.cpp:132
1:     #2 0x55908669647f in redGrapes::Context::init(unsigned long) /home/widera/workspace/redGrapes/redGrapes/redGrapes.cpp:140
1:     #3 0x5590865d49d3 in redGrapes::init(unsigned long) /home/widera/workspace/redGrapes/redGrapes/redGrapes.hpp:122
1:     #4 0x55908663131c in CATCH2_INTERNAL_TEST_0 /home/widera/workspace/redGrapes/test/resource_user.cpp:10
1:     #5 0x5590867a04da in Catch::TestInvokerAsFunction::invoke() const src/catch2/internal/catch_test_case_registry_impl.cpp:149
1:     #6 0x559086782f9f in Catch::TestCaseHandle::invoke() const (/home/widera/workspace/buildRedGrapes/redGrapes_test+0x22bf9f)
1:     #7 0x559086780717 in Catch::RunContext::invokeActiveTestCase() src/catch2/internal/catch_run_context.cpp:538
1:     #8 0x55908677fed0 in Catch::RunContext::runCurrentTest(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) src/catch2/internal/catch_run_context.cpp:501
1:     #9 0x55908677b7b0 in Catch::RunContext::runTest(Catch::TestCaseHandle const&) src/catch2/internal/catch_run_context.cpp:232
1:     #10 0x5590866befbe in execute src/catch2/catch_session.cpp:110
1:     #11 0x5590866c1f40 in Catch::Session::runInternal() src/catch2/catch_session.cpp:332
1:     #12 0x5590866c1446 in Catch::Session::run() src/catch2/catch_session.cpp:263
1:     #13 0x5590866bcfab in int Catch::Session::run<char>(int, char const* const*) src/catch2/../catch2/catch_session.hpp:41
1:     #14 0x5590866bcd66 in main src/catch2/internal/catch_main.cpp:36
1:     #15 0x7fbdff229d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
1: 
1: Direct leak of 120 byte(s) in 1 object(s) allocated from:
1:     #0 0x7fbdffab61e7 in operator new(unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:99
1:     #1 0x559086696289 in redGrapes::Context::init(unsigned long, std::shared_ptr<redGrapes::scheduler::IScheduler>) /home/widera/workspace/redGrapes/redGrapes/redGrapes.cpp:132
1:     #2 0x55908669647f in redGrapes::Context::init(unsigned long) /home/widera/workspace/redGrapes/redGrapes/redGrapes.cpp:140
1:     #3 0x5590865d49d3 in redGrapes::init(unsigned long) /home/widera/workspace/redGrapes/redGrapes/redGrapes.hpp:122
1:     #4 0x5590865cb17c in CATCH2_INTERNAL_TEST_2 /home/widera/workspace/redGrapes/test/resource.cpp:60
1:     #5 0x5590867a04da in Catch::TestInvokerAsFunction::invoke() const src/catch2/internal/catch_test_case_registry_impl.cpp:149
1:     #6 0x559086782f9f in Catch::TestCaseHandle::invoke() const (/home/widera/workspace/buildRedGrapes/redGrapes_test+0x22bf9f)
1:     #7 0x559086780717 in Catch::RunContext::invokeActiveTestCase() src/catch2/internal/catch_run_context.cpp:538
1:     #8 0x55908677fed0 in Catch::RunContext::runCurrentTest(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) src/catch2/internal/catch_run_context.cpp:501
1:     #9 0x55908677b7b0 in Catch::RunContext::runTest(Catch::TestCaseHandle const&) src/catch2/internal/catch_run_context.cpp:232
1:     #10 0x5590866befbe in execute src/catch2/catch_session.cpp:110
1:     #11 0x5590866c1f40 in Catch::Session::runInternal() src/catch2/catch_session.cpp:332
1:     #12 0x5590866c1446 in Catch::Session::run() src/catch2/catch_session.cpp:263
1:     #13 0x5590866bcfab in int Catch::Session::run<char>(int, char const* const*) src/catch2/../catch2/catch_session.hpp:41
1:     #14 0x5590866bcd66 in main src/catch2/internal/catch_main.cpp:36
1:     #15 0x7fbdff229d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
1: 
1: Direct leak of 120 byte(s) in 1 object(s) allocated from:
1:     #0 0x7fbdffab61e7 in operator new(unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:99
1:     #1 0x559086696289 in redGrapes::Context::init(unsigned long, std::shared_ptr<redGrapes::scheduler::IScheduler>) /home/widera/workspace/redGrapes/redGrapes/redGrapes.cpp:132
1:     #2 0x55908669647f in redGrapes::Context::init(unsigned long) /home/widera/workspace/redGrapes/redGrapes/redGrapes.cpp:140
1:     #3 0x5590865d49d3 in redGrapes::init(unsigned long) /home/widera/workspace/redGrapes/redGrapes/redGrapes.hpp:122
1:     #4 0x5590865c8f09 in CATCH2_INTERNAL_TEST_0 /home/widera/workspace/redGrapes/test/resource.cpp:41
1:     #5 0x5590867a04da in Catch::TestInvokerAsFunction::invoke() const src/catch2/internal/catch_test_case_registry_impl.cpp:149
1:     #6 0x559086782f9f in Catch::TestCaseHandle::invoke() const (/home/widera/workspace/buildRedGrapes/redGrapes_test+0x22bf9f)
1:     #7 0x559086780717 in Catch::RunContext::invokeActiveTestCase() src/catch2/internal/catch_run_context.cpp:538
1:     #8 0x55908677fed0 in Catch::RunContext::runCurrentTest(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) src/catch2/internal/catch_run_context.cpp:501
1:     #9 0x55908677b7b0 in Catch::RunContext::runTest(Catch::TestCaseHandle const&) src/catch2/internal/catch_run_context.cpp:232
1:     #10 0x5590866befbe in execute src/catch2/catch_session.cpp:110
1:     #11 0x5590866c1f40 in Catch::Session::runInternal() src/catch2/catch_session.cpp:332
1:     #12 0x5590866c1446 in Catch::Session::run() src/catch2/catch_session.cpp:263
1:     #13 0x5590866bcfab in int Catch::Session::run<char>(int, char const* const*) src/catch2/../catch2/catch_session.hpp:41
1:     #14 0x5590866bcd66 in main src/catch2/internal/catch_main.cpp:36
1:     #15 0x7fbdff229d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
1: 
1: SUMMARY: AddressSanitizer: 720 byte(s) leaked in 6 allocation(s).
1/1 Test #1: unittest .........................***Failed    2.61 sec

@psychocoderHPC
Copy link
Member

Rebased again, and I'm not able to break it so far, all tests look fine.

Only concern I have now is the use of offsetof in

return (Task*) ( (uintptr_t)this - offsetof(Task, space) );

where I get the warning that this is 'conditionally supported' , but it seems to work fine.

If we inherit from memory::Refcounted< TaskSpace, TaskSpaceDeleter >::Guard we should be able to static cast *this to get access to the element.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants