Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Performance][Optimization] Method inline question #1807

Open
nmi-relewise opened this issue Apr 22, 2024 · 3 comments
Open

[Performance][Optimization] Method inline question #1807

nmi-relewise opened this issue Apr 22, 2024 · 3 comments

Comments

@nmi-relewise
Copy link
Contributor

Dear gents, I'm trying to squeeze every ns in de/serialization process - we are operating on 100-GBs scale.

My attention was caught on how primitives are being read/inflated, saying MessagePackReader.Integers.cs

The op (internally) is to read bytes from span (aka SequenceReader) & increment position.
While it (MessagePackReader) is beef-ed with multiple method calls:

  • ReadInt32
  • ThrowInsufficientBufferUnless
  • TryReadBigEndian
  • TryRead

Keeping in mind absolute cost of method call being cheap/non-negative there could be a vision the flow is fine.
However, if we scale/compare to logical operation reader has to do = read from array in memory VS ((push a few args into reg + call method)*3) = costly.

A few methods are marked as subject for inline MethodImpl(MethodImplOptions.AggressiveInlining)].

It seem like a good idea to decorate more methods with inline option on the surface (considering they operate on same args).

The question is - were there any specific reasons why only a few inlined, or it is just codebase lifetime development (as-is)?

Cheers,
Nik

@nmi-relewise nmi-relewise changed the title [Performance][Inline] Question [Performance][Optimization] Method inline question Apr 22, 2024
@AArnott
Copy link
Collaborator

AArnott commented Apr 24, 2024

I don't know why the methods that are attributed were attributed. Inlining already happens by the JIT automatically in cases where it is obvious to the JIT that it would provide a useful improvement. We should only add attributes where the JIT isn't already inlining it yet we've measured perf and found it to make it faster to add the attribute.
If you have that data, I'd take a PR that adds the attributes.

@nmi-relewise
Copy link
Contributor Author

The methods are getting called ~300M times, in green:

image

The timings differ 49 vs 34 sec:
image

It seem to be a possible way go forward; however I'd ask collective brain to look for pitfalls in da approach.

Thanks.

@nmi-relewise
Copy link
Contributor Author

nmi-relewise commented May 6, 2024

The first challenge - how to measure accurately.
There are tier-ed compilations, therefore a point with JIT optimizations makes sense:

image

However, as you've mentioned, runtime to pick how/which calls to be inlined:

image

In case of .NET 7 the ReadInt32 method still gets a jmp (what is weird - with multiple different addresses).

Since we are effectively trying to measure the impact of 'extra' work on top of read int from span = super fast, any observation tools will introduce the noise = influence = explain why numbers are so different with dotTrace, and not that far in real life.

@AArnott

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants