Optimize HpackStaticTable by using a perfect hash function #12713
Motivation:
HpackStaticTable performance can be improved by using a perfect hash function.
Modifications:
Use 2 tables, one for mapping header name -> index and one for mapping header name + value -> index.
Choose the tables and the hash function in such a way that each entry maps to a single hash bucket.
Results:
Benchmark                                       (optimized)  Mode  Cnt   Score    Error  Units
HpackStaticTableBenchmark.lookupHttp                  false  avgt   10  15.998 ±  1.646  ns/op
HpackStaticTableBenchmark.lookupHttp                   true  avgt   10  10.457 ±  0.274  ns/op
HpackStaticTableBenchmark.lookupHttps                 false  avgt   10  20.942 ±  1.365  ns/op
HpackStaticTableBenchmark.lookupHttps                  true  avgt   10  10.618 ±  0.138  ns/op
HpackStaticTableBenchmark.lookupNameOnlyMatch         false  avgt   10  13.710 ±  0.273  ns/op
HpackStaticTableBenchmark.lookupNameOnlyMatch          true  avgt   10   3.156 ±  0.052  ns/op
HpackStaticTableBenchmark.lookupNoNameMatch           false  avgt   10   3.528 ±  0.047  ns/op
HpackStaticTableBenchmark.lookupNoNameMatch            true  avgt   10   3.145 ±  0.031  ns/op
Caveats:
This implementation couples HpackStaticTable implementation to the implementation of AsciiString.hashCode, relying on the values it returns for the static table headers to yield a perfect hash function. If AsciiString.hashCode implementation changes, HpackStaticTable implementation will also need to change. Moreover, if AsciiString.hashCode can return different values on different platforms (maybe due to endianness) or in general in different jvm instances, then it invalidates the approach taken here (or at least makes its implementation much more complex).
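For readers skimming the thread, here is a minimal sketch of the idea in the Modifications section, not the code in this PR. It uses java.lang.String.hashCode instead of AsciiString.hashCode so that it is self-contained, and the class name PerfectHashSketch, the brute-force search over table size and shift, and the -1 "not found" value are all assumptions made for illustration. The point it shows is that once every distinct name owns its bucket, a lookup is one hash, one array read and one equality check.

```java
import java.util.List;

/** Hypothetical sketch: a collision-free name -> index table over a fixed key set. */
final class PerfectHashSketch {
    private final String[] keys;   // keys[b] is the only name stored in bucket b, or null
    private final int[] indexes;   // indexes[b] is the 1-based index of the first entry with that name
    private final int shift;       // how far the hash is shifted before masking

    private PerfectHashSketch(String[] keys, int[] indexes, int shift) {
        this.keys = keys;
        this.indexes = indexes;
        this.shift = shift;
    }

    private static int bucket(String name, int shift, int size) {
        // size is a power of two, so (size - 1) works as a bit mask.
        return (name.hashCode() >>> shift) & (size - 1);
    }

    /** Searches power-of-two sizes and shifts until every distinct name has its own bucket. */
    static PerfectHashSketch build(List<String> names) {
        for (int size = 1; size <= 1 << 16; size <<= 1) {
            for (int shift = 0; shift < 32; shift++) {
                PerfectHashSketch table = tryBuild(names, size, shift);
                if (table != null) {
                    return table;
                }
            }
        }
        throw new IllegalStateException("no collision-free layout found");
    }

    private static PerfectHashSketch tryBuild(List<String> names, int size, int shift) {
        String[] keys = new String[size];
        int[] indexes = new int[size];
        for (int i = 0; i < names.size(); i++) {
            String name = names.get(i);
            int b = bucket(name, shift, size);
            if (keys[b] == null) {
                keys[b] = name;
                indexes[b] = i + 1;        // static table indexes start at 1
            } else if (!keys[b].equals(name)) {
                return null;               // two different names collide: reject this layout
            }
            // A repeated name keeps the index of its first occurrence.
        }
        return new PerfectHashSketch(keys, indexes, shift);
    }

    /** Returns the 1-based index of the first entry with this name, or -1 if absent. */
    int getIndex(String name) {
        int b = bucket(name, shift, keys.length);
        String candidate = keys[b];
        return candidate != null && candidate.equals(name) ? indexes[b] : -1;
    }
}
```

Building it from the first seven static table names (":authority", ":method", ":method", ":path", ":path", ":scheme", ":scheme") and calling getIndex(":path") returns 4, while an unknown name such as "x-custom" returns -1. The name + value table described above would follow the same pattern with a key that combines both strings.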
…in API compatible.
…er generated synthetic methods.
Analyzing HpackEncoder code, we cannot reach the relevant HpackStaticTable methods with null name or value.
I don't think
@chrisvest if you execute
The reason is that the unsafe implementation I'm sure @Scottmitch can explain further, as he implemented that. BTW, are there any CI builds running on big endian machines?
@amirhadadi Did you run that on a BE system? You can't just change But no, I don't think we have any CI jobs that run BE.
@chrisvest originally I didn't test it on a BE system, but after you asked I managed to use the approach outlined here to test it on an emulated s390x architecture.
Regarding
You are 100% right that just flipping this flag will not emulate a big endian system, but I wanted to check something very specific. I only wanted to check what values
So it at least appears to seem correct to just flip
@chrisvest so you are happy with that?
@ejona86 can you check as well?
One nit
codec-http2/src/main/java/io/netty/handler/codec/http2/HpackStaticTable.java
Co-authored-by: Norman Maurer <norman_maurer@apple.com>
@normanmaurer yeah, I'm happy with the reply. I'll make a note to fix the hashCode so BE computes the same as LE.
I would love to have @ejona86 at least double-check as well.
Let's merge this.
Co-authored-by: ahadadi <ahadadi@outbrain.com>
Co-authored-by: Norman Maurer <norman_maurer@apple.com>
@amirhadadi thanks for all the hard work on this one!
Motivation:
HpackStaticTable performance can be improved by using a perfect hash function.
Modifications:
Use 2 tables, one for mapping header name -> index and one for mapping header name + value -> index.
Choose the table sizes and the hash function in such a way that each hash bucket contains a single entry.
Results:
Caveats:
This implementation couples HpackStaticTable to the implementation of AsciiString.hashCode: it relies on the values hashCode returns for the static table headers to yield a perfect hash function. If the AsciiString.hashCode implementation changes, HpackStaticTable will also need to change. Moreover, if AsciiString.hashCode can return different values on different JVM instances (for reasons other than endianness), then it invalidates the approach taken here (or at least makes its implementation much more complex).
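One way to make this coupling fail loudly rather than silently is a guard that re-checks the "one entry per bucket" property whenever the hash function or the table changes. The sketch below is an assumption about how such a guard could look, not something taken from this change: HashCouplingCheck and verifyPerfect are hypothetical names, and it uses java.lang.String.hashCode with a plain power-of-two mask so that it runs without Netty on the classpath.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical guard: fails fast if two different keys land in the same bucket. */
final class HashCouplingCheck {
    private HashCouplingCheck() { }

    static void verifyPerfect(List<String> keys, int tableSize) {
        if (Integer.bitCount(tableSize) != 1) {
            throw new IllegalArgumentException("tableSize must be a power of two: " + tableSize);
        }
        Map<Integer, String> seen = new HashMap<>();
        for (String key : keys) {
            int bucket = key.hashCode() & (tableSize - 1);
            // putIfAbsent returns the key already stored in this bucket, if any.
            String previous = seen.putIfAbsent(bucket, key);
            if (previous != null && !previous.equals(key)) {
                throw new IllegalStateException(
                        "hash collision in bucket " + bucket + ": " + previous + " vs " + key);
            }
        }
    }
}
```

Run from a unit test or a static initializer, a check like this turns a change in hashCode into an immediate failure instead of a wrong static table index at runtime.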