Span<char>-ifying ActorPath and Address parsing #5030

Closed


Aaronontheweb
Member

Inspired by @Arkatufus's excellent work on #5028 thus far, I decided to take a stab at optimizing some of the primitives in Akka.Actor that are used heavily in the hot path of the Akka.Remote deserialization pipeline.

@Aaronontheweb
Member Author

Aaronontheweb commented May 21, 2021

Working on optimizing ActorPath and Address parsing operations

ActorPath Before

BenchmarkDotNet=v0.13.0, OS=Windows 10.0.19041.985 (2004/May2020Update/20H1)
AMD Ryzen 7 1700, 1 CPU, 16 logical and 8 physical cores
.NET SDK=5.0.203
  [Host]     : .NET Core 3.1.15 (CoreCLR 4.700.21.21202, CoreFX 4.700.21.21402), X64 RyuJIT
  DefaultJob : .NET Core 3.1.15 (CoreCLR 4.700.21.21202, CoreFX 4.700.21.21402), X64 RyuJIT

| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---:|---:|---:|---:|---:|---:|---:|
| ActorPath_Parse | 1,938.91 ns | 5.334 ns | 4.454 ns | 0.2670 | - | - | 1,128 B |
| ActorPath_Concat | 59.56 ns | 0.876 ns | 0.819 ns | 0.0421 | - | - | 176 B |
| ActorPath_Equals | 24.97 ns | 0.090 ns | 0.080 ns | - | - | - | - |
| ActorPath_ToString | 125.80 ns | 1.200 ns | 1.122 ns | 0.0210 | - | - | 88 B |

ActorPath After

BenchmarkDotNet=v0.13.0, OS=Windows 10.0.19041.985 (2004/May2020Update/20H1)
AMD Ryzen 7 1700, 1 CPU, 16 logical and 8 physical cores
.NET SDK=5.0.203
  [Host]     : .NET Core 3.1.15 (CoreCLR 4.700.21.21202, CoreFX 4.700.21.21402), X64 RyuJIT
  DefaultJob : .NET Core 3.1.15 (CoreCLR 4.700.21.21202, CoreFX 4.700.21.21402), X64 RyuJIT

| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---:|---:|---:|---:|---:|---:|---:|
| ActorPath_Parse | 1,591.19 ns | 27.894 ns | 26.092 ns | 0.1678 | - | - | 704 B |
| ActorPath_Concat | 60.81 ns | 1.260 ns | 1.451 ns | 0.0421 | - | - | 176 B |
| ActorPath_Equals | 25.14 ns | 0.224 ns | 0.175 ns | - | - | - | - |
| ActorPath_ToString | 125.80 ns | 1.980 ns | 1.755 ns | 0.0210 | - | - | 88 B |

@Aaronontheweb
Member Author

Never mind, they're on the same platform.

@Aaronontheweb
Member Author

ActorCell.SplitNameAndUid Before

BenchmarkDotNet=v0.13.0, OS=Windows 10.0.19041.985 (2004/May2020Update/20H1)
AMD Ryzen 7 1700, 1 CPU, 16 logical and 8 physical cores
.NET SDK=5.0.203
  [Host]     : .NET Core 3.1.15 (CoreCLR 4.700.21.21202, CoreFX 4.700.21.21402), X64 RyuJIT
  DefaultJob : .NET Core 3.1.15 (CoreCLR 4.700.21.21202, CoreFX 4.700.21.21402), X64 RyuJIT

| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---:|---:|---:|---:|---:|---:|---:|
| ActorCell_SplitNameAndUid | 79.37 ns | 1.267 ns | 1.185 ns | 0.0248 | - | - | 104 B |

@Aaronontheweb
Member Author

ActorPath After

Lost most of my performance gains after the bug fix

BenchmarkDotNet=v0.13.0, OS=Windows 10.0.19041.985 (2004/May2020Update/20H1)
AMD Ryzen 7 1700, 1 CPU, 16 logical and 8 physical cores
.NET SDK=5.0.203
  [Host]     : .NET Core 3.1.15 (CoreCLR 4.700.21.21202, CoreFX 4.700.21.21402), X64 RyuJIT
  DefaultJob : .NET Core 3.1.15 (CoreCLR 4.700.21.21202, CoreFX 4.700.21.21402), X64 RyuJIT

| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---:|---:|---:|---:|---:|---:|---:|
| ActorPath_Parse | 1,884.19 ns | 37.562 ns | 35.136 ns | 0.2670 | - | - | 1,120 B |
| ActorPath_Concat | 63.09 ns | 1.302 ns | 3.475 ns | 0.0421 | - | - | 176 B |
| ActorPath_Equals | 25.65 ns | 0.549 ns | 0.821 ns | - | - | - | - |
| ActorPath_ToString | 127.35 ns | 1.727 ns | 1.615 ns | 0.0210 | - | - | 88 B |

@Aaronontheweb
Member Author

Aaronontheweb commented May 21, 2021

Back in business - eliminated List<string> allocation during parsing

BenchmarkDotNet=v0.13.0, OS=Windows 10.0.19041.985 (2004/May2020Update/20H1)
AMD Ryzen 7 1700, 1 CPU, 16 logical and 8 physical cores
.NET SDK=5.0.203
  [Host]     : .NET Core 3.1.15 (CoreCLR 4.700.21.21202, CoreFX 4.700.21.21402), X64 RyuJIT
  DefaultJob : .NET Core 3.1.15 (CoreCLR 4.700.21.21202, CoreFX 4.700.21.21402), X64 RyuJIT

| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---:|---:|---:|---:|---:|---:|---:|
| ActorPath_Parse | 1,646.94 ns | 15.787 ns | 14.767 ns | 0.2365 | - | - | 992 B |
| ActorPath_Concat | 57.60 ns | 0.727 ns | 0.680 ns | 0.0421 | - | - | 176 B |
| ActorPath_Equals | 25.03 ns | 0.106 ns | 0.099 ns | - | - | - | - |
| ActorPath_ToString | 128.14 ns | 2.511 ns | 2.349 ns | 0.0210 | - | - | 88 B |

@Aaronontheweb
Member Author

Looks like I still have a bug in the ActorPath parsing code that I can reproduce in the test suite. Going to work on solving that.

@Aaronontheweb
Member Author

All of the previously failing test cases are now passing - going to take a new benchmark measurement.

@Aaronontheweb
Member Author

Final numbers, complete with bug fixes:

BenchmarkDotNet=v0.13.0, OS=Windows 10.0.19041.985 (2004/May2020Update/20H1)
AMD Ryzen 7 1700, 1 CPU, 16 logical and 8 physical cores
.NET SDK=5.0.203
  [Host]     : .NET Core 3.1.15 (CoreCLR 4.700.21.21202, CoreFX 4.700.21.21402), X64 RyuJIT
  DefaultJob : .NET Core 3.1.15 (CoreCLR 4.700.21.21202, CoreFX 4.700.21.21402), X64 RyuJIT

| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---:|---:|---:|---:|---:|---:|---:|
| ActorPath_Parse | 1,674.83 ns | 19.033 ns | 17.804 ns | 0.2365 | - | - | 992 B |
| ActorPath_Concat | 60.63 ns | 0.496 ns | 0.464 ns | 0.0421 | - | - | 176 B |
| ActorPath_Equals | 25.06 ns | 0.066 ns | 0.051 ns | - | - | - | - |
| ActorPath_ToString | 127.25 ns | 0.822 ns | 0.686 ns | 0.0210 | - | - | 88 B |

Breaking API change - removed `NameAndUid` from the public API and made it into a `readonly struct`.

API wasn't used anywhere except internally anyway - should probably have been a `ValueTuple`.
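
To illustrate the `ValueTuple` idea, here is a minimal sketch of splitting a `name#uid` path element into a tuple using spans. It is not the code from this PR: the class `NameAndUidSketch`, the method `Split`, and the choice of `0` as the "no uid" value are all assumptions made for the example, and it relies on the `int.Parse(ReadOnlySpan<char>, ...)` overload available on .NET Core 2.1 and later.

```csharp
using System;
using System.Globalization;

public static class NameAndUidSketch
{
    // Hypothetical helper, not the Akka.NET implementation.
    // Splits "name#uid" into a ValueTuple without allocating a wrapper object.
    public static (string Name, int Uid) Split(string element)
    {
        ReadOnlySpan<char> span = element.AsSpan();
        int hash = span.IndexOf('#');

        // No '#' present: return the input string as-is.
        // Using 0 for "no uid" is an assumption of this sketch.
        if (hash < 0)
            return (element, 0);

        // The only allocation on this path is the name substring.
        string name = span.Slice(0, hash).ToString();

        // Parse the uid directly from the span, with no intermediate string.
        int uid = int.Parse(span.Slice(hash + 1), NumberStyles.Integer, CultureInfo.InvariantCulture);
        return (name, uid);
    }
}
```

With this sketch, `NameAndUidSketch.Split("worker#42")` yields `("worker", 42)` and `NameAndUidSketch.Split("worker")` yields `("worker", 0)`. Because the result is a value type, returning it does not add a heap allocation on top of the name string.
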
@Aaronontheweb
Member Author

Performance data for ActorPath after removing the old NameAndUid calls and replacing them with a ValueTuple that does the exact same work

BenchmarkDotNet=v0.13.0, OS=Windows 10.0.19041.985 (2004/May2020Update/20H1)
AMD Ryzen 7 1700, 1 CPU, 16 logical and 8 physical cores
.NET SDK=5.0.203
  [Host]     : .NET Core 3.1.15 (CoreCLR 4.700.21.21202, CoreFX 4.700.21.21402), X64 RyuJIT
  DefaultJob : .NET Core 3.1.15 (CoreCLR 4.700.21.21202, CoreFX 4.700.21.21402), X64 RyuJIT

| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---:|---:|---:|---:|---:|---:|---:|
| ActorPath_Parse | 1,672.00 ns | 9.184 ns | 7.669 ns | 0.2136 | - | - | 896 B |
| ActorPath_Concat | 54.91 ns | 1.128 ns | 1.299 ns | 0.0268 | - | - | 112 B |
| ActorPath_Equals | 26.08 ns | 0.095 ns | 0.085 ns | - | - | - | - |
| ActorPath_ToString | 119.24 ns | 0.835 ns | 0.740 ns | 0.0210 | - | - | 88 B |

@Aaronontheweb
Member Author

NameAndUid performance comparison:

BenchmarkDotNet=v0.13.0, OS=Windows 10.0.19041.985 (2004/May2020Update/20H1)
AMD Ryzen 7 1700, 1 CPU, 16 logical and 8 physical cores
.NET SDK=5.0.203
  [Host]     : .NET Core 3.1.15 (CoreCLR 4.700.21.21202, CoreFX 4.700.21.21402), X64 RyuJIT
  DefaultJob : .NET Core 3.1.15 (CoreCLR 4.700.21.21202, CoreFX 4.700.21.21402), X64 RyuJIT

| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---:|---:|---:|---:|---:|---:|---:|
| ActorCell_SplitNameAndUid | 78.66 ns | 1.396 ns | 1.305 ns | 0.0248 | - | - | 104 B |
| ActorCell_GetNameAndUid | 69.51 ns | 1.435 ns | 1.866 ns | 0.0172 | - | - | 72 B |

@Aaronontheweb
Member Author

Performance data using the new SpanHacks class for integer parsing:

BenchmarkDotNet=v0.13.0, OS=Windows 10.0.19041.985 (2004/May2020Update/20H1)
AMD Ryzen 7 1700, 1 CPU, 16 logical and 8 physical cores
.NET SDK=5.0.203
  [Host]     : .NET Core 3.1.15 (CoreCLR 4.700.21.21202, CoreFX 4.700.21.21402), X64 RyuJIT
  DefaultJob : .NET Core 3.1.15 (CoreCLR 4.700.21.21202, CoreFX 4.700.21.21402), X64 RyuJIT

| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---:|---:|---:|---:|---:|---:|---:|
| ActorCell_SplitNameAndUid | 74.66 ns | 1.479 ns | 1.311 ns | 0.0248 | - | - | 104 B |
| ActorCell_GetNameAndUid | 42.12 ns | 0.569 ns | 0.532 ns | 0.0076 | - | - | 32 B |
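
The internals of `SpanHacks` are not shown in this thread, so the following is only a rough sketch of what hand-rolled integer parsing over a `ReadOnlySpan<char>` can look like. The class `SpanIntParsingSketch` and the method `TryParseInt32` are hypothetical names, and the supported format (an optional leading `-` followed by ASCII digits) is an assumption of the example rather than the behavior of the real class.

```csharp
using System;

public static class SpanIntParsingSketch
{
    // Hypothetical helper, not the SpanHacks API.
    // Accepts an optional leading '-' followed by one or more ASCII digits.
    public static bool TryParseInt32(ReadOnlySpan<char> chars, out int value)
    {
        value = 0;
        if (chars.IsEmpty)
            return false;

        int pos = 0;
        bool negative = chars[0] == '-';
        if (negative)
        {
            if (chars.Length == 1)
                return false; // a lone '-' is not a number
            pos = 1;
        }

        // Accumulate the magnitude in a long so overflow can be detected.
        long acc = 0;
        for (; pos < chars.Length; pos++)
        {
            int digit = chars[pos] - '0';
            if (digit < 0 || digit > 9)
                return false; // non-digit character

            acc = acc * 10 + digit;
            if (acc > (long)int.MaxValue + 1)
                return false; // magnitude exceeds even |int.MinValue|
        }

        if (negative)
            acc = -acc;
        if (acc > int.MaxValue)
            return false; // positive value out of range

        value = (int)acc;
        return true;
    }
}
```

For example, `SpanIntParsingSketch.TryParseInt32("12345".AsSpan(), out var uid)` returns `true` with `uid == 12345`, while inputs such as `"12a"` or `"2147483648"` return `false`. The loop reads characters directly from the span, so no substring is created for the numeric part.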

Aaronontheweb marked this pull request as ready for review May 24, 2021 13:59

@Aaronontheweb
Member Author

This PR is ready for review.

@Aaronontheweb
Member Author

Closed in favor of #5039.
