fix(swarm): eliminating protocol cloning when nothing is happening #5026
base: master
Conversation
Great stuff! I have some questions. Also, did you manage to benchmark this somehow?
I'll try a few benchmark strategies and see. This should improve performance, so long as the protocols don't change with each poll.

> since we always insert all of the protocols to the hashset on each poll, it hinders the performance
I am now using a hashmap with booleans to compute the diff, so there is no need to collect the protocols.
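The mark-and-sweep diff described above could look roughly like this. This is a sketch only, not the PR's actual code; the name `diff_protocols` and the use of `String` keys are illustrative assumptions:

```rust
use std::collections::HashMap;

// Hypothetical sketch: instead of collecting every protocol into a fresh
// HashSet on each poll, keep one HashMap whose bool flag marks whether a
// protocol was seen in the current round. Unmarked entries were removed.
fn diff_protocols<'a>(
    seen: &mut HashMap<String, bool>,
    current: impl Iterator<Item = &'a str>,
) -> (Vec<String>, Vec<String>) {
    // Reset all marks to "not seen this round".
    for flag in seen.values_mut() {
        *flag = false;
    }

    let mut added = Vec::new();
    for protocol in current {
        match seen.get_mut(protocol) {
            Some(flag) => *flag = true, // already known, mark as still present
            None => {
                seen.insert(protocol.to_owned(), true);
                added.push(protocol.to_owned());
            }
        }
    }

    // Anything still unmarked disappeared since the last poll.
    let removed: Vec<String> = seen
        .iter()
        .filter(|(_, &flag)| !flag)
        .map(|(name, _)| name.clone())
        .collect();
    for name in &removed {
        seen.remove(name);
    }

    (added, removed)
}

fn main() {
    let mut seen = HashMap::new();
    let (added, removed) =
        diff_protocols(&mut seen, ["/ping/1.0.0", "/identify/1.0.0"].into_iter());
    println!("{added:?} {removed:?}"); // ["/ping/1.0.0", "/identify/1.0.0"] []
    let (added, removed) = diff_protocols(&mut seen, ["/ping/1.0.0"].into_iter());
    println!("{added:?} {removed:?}"); // [] ["/identify/1.0.0"]
}
```

The point is that unchanged polls touch only the existing map entries instead of allocating and collecting the full protocol set every time.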
Finally, these are the benchmark results on the old code:
@thomaseizinger I am curious what you think about the way I benchmarked it.
I realized I am testing with very short protocol names, so here is a small change:
Thank you for the updates! I've left some comments :)
I like the direction this is going in and am looking forward to merging the performance improvements!
Thank you for the updates! I've left some more comments with questions and suggestions :)
@thomaseizinger, hey, did I miss something that still needs to be done?
Sorry for the delay. I am on low availability until mid-Jan. Will give this a review after! :)
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
@@ -1 +1,2 @@
 target
+perf.*
This is not required for the PR, right?
FYI, the new kid on the block of benchmarking doesn't need re-exports: https://nikolaivazquez.com/blog/divan/
Hey @thomaseizinger, sorry for the delay; I was putting this off for a bit too long. Here are my findings with criterion:
This is the result of reverting the optimizations with 10000 protocols on one behaviour (compared to the run with the changes in this PR). In this case, a lot more code is being executed than just the connection handler, which might be why the difference is smaller. Please review the benchmarking code; I am not 100% confident this is a good measurement. I'll also try making Tokio run in single-threaded mode to see if that makes a difference.
Does the memory transport deadlock in single-threaded mode?
Welp, I can't find any reasonable difference now. I guess the protocol drops are not that significant when all the other code runs as well, so I was most likely measuring with perf incorrectly.
Okay, @thomaseizinger, so I had a bug in my benchmark: it did the computation only in the first iteration, which explains why nothing made sense. Here are the results relative to the optimized version, with the code actually running:
The scaling is crazy; I am also not sure whether I still have a bug in there. I will add the many-behaviours, few-protocols case too.
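The bug above is a classic benchmarking pitfall: the measured work runs once and later iterations time a stale result. A minimal std-only sketch of the correct shape (this is illustrative, not the PR's actual criterion benchmark):

```rust
use std::time::Instant;

// Illustrative stand-in for the work under measurement.
fn expensive(n: u64) -> u64 {
    (0..n).fold(0, |acc, x| acc.wrapping_add(x))
}

fn main() {
    const ITERS: u32 = 100;
    let start = Instant::now();
    let mut sink = 0u64;
    for _ in 0..ITERS {
        // The computation must re-run inside the loop body on *every*
        // iteration; `std::hint::black_box` stops the optimizer from
        // hoisting it out or deleting it entirely.
        sink = sink.wrapping_add(std::hint::black_box(expensive(10_000)));
    }
    let per_iter = start.elapsed() / ITERS;
    println!("~{per_iter:?} per iteration (sink = {sink})");
}
```

criterion's `b.iter(...)` closures follow the same rule: anything outside the closure is setup and is timed zero times, not once per iteration.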
Here are the results I measured when backporting the benchmark to the old code (relative to the code in this PR).
I am still suspicious of some of the gigantic performance differences, but this may be due to the optimized version avoiding protocol cloning. Full results:
Awesome work and great macro trickery!
I am afraid whatever you are currently benchmarking isn't what you think you are benchmarking. See the inline comment.
Benchmark results with applied modifications. Numbers are now more reasonable since both
Thanks! These numbers make a bit more sense. I have two more comments that could simplify the benchmarking code a bit; I'd like to make sure we have as little complexity in there as possible. The restarting state irks me a bit; I'd rather defer any form of setup to criterion to make sure we aren't messing up the benchmark setup.
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
I am currently travelling but will look at this in about two weeks' time.
Description
The code keeps the API while eliminating repetitive protocol cloning when the protocols have not changed. Only when the protocols do change are they cloned into a reused buffer, from which they are borrowed for iteration.
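The buffer-reuse idea can be sketched as follows. All names (`ProtocolCache`, `refresh`) and the explicit `changed` flag are hypothetical simplifications, not the PR's actual types:

```rust
// Hypothetical sketch: protocols are cloned into a long-lived buffer only
// when a change is detected; iteration then borrows from that buffer
// instead of cloning on every poll.
struct ProtocolCache {
    buffer: Vec<String>,
}

impl ProtocolCache {
    fn new() -> Self {
        Self { buffer: Vec::new() }
    }

    /// Refresh the cached protocols only when `changed` is true, reusing
    /// the existing allocation, then hand out a borrowed view.
    fn refresh<'a>(
        &mut self,
        changed: bool,
        current: impl Iterator<Item = &'a str>,
    ) -> &[String] {
        if changed {
            self.buffer.clear(); // keeps capacity, drops old contents
            self.buffer.extend(current.map(str::to_owned));
        }
        &self.buffer
    }
}

fn main() {
    let mut cache = ProtocolCache::new();
    // A change: clone the current protocols into the reused buffer.
    let first = cache.refresh(true, ["/ping/1.0.0"].into_iter()).to_vec();
    println!("{first:?}"); // ["/ping/1.0.0"]
    // No change: nothing is cloned; we borrow the cached buffer as-is.
    let second = cache.refresh(false, std::iter::empty::<&str>()).to_vec();
    println!("{second:?}"); // ["/ping/1.0.0"]
}
```

In the common steady-state case (`changed == false`) this does no allocation and no cloning at all, which is where the benchmark wins above come from.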
Notes & open questions
Change checklist