Faster query string decoder #13473
base: 4.1
@vietj too |
These are the numbers with these changes:
compared to before:
This is with
and
given that the most costly operation there is the component decoding, the improvements are ~5-30%: not bad, and I can improve the rest in a follow-up PR.
63da020 to cd1100d
@viet I've added a new interface (bad name, to be improved), which could save users which have to perform wdyt?
@chrisvest I'm open to ideas for a decent API here: implementation-wise I've preferred to do some "dirty trick"
Perf-wise, this is what we get with the new API:
That removes the map allocation from the equation (which is unfair: users will likely have their own bespoke map there!)
@chrisvest when the API and the PR are stable I'll write new tests or modify the existing ones to use it.
I will perform some assembly analysis because some of the numbers are really weird and I suspect something is going on with inlining.
codec-http/src/main/java/io/netty/handler/codec/http/QueryStringDecoder.java
The same results on my Intel machine ( This pr:
4.1:
that are similar. |
3e6605c to a821f4e
97ad30f to 1107a07
@chrisvest I haven't overloaded the static methods to perform parameter decoding yet: do you think it would be a nice addition? Test-wise, I've added an additional
I have another improvement coming from the decodeComponent part, hold on! |
It would be great if we could have a static way of populating existing data structures, like:

public interface Populator<T> {
    void setQueryParam(String name, String value, T map);
}

public class MapPopulator implements Populator<Map<String, String>> {
    @Override
    public void setQueryParam(String name, String value, Map<String, String> map) {
        map.put(name, value);
    }
}
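To make the suggestion above concrete, here is a minimal sketch of how a static entry point could drive such a `Populator`. The `decode` method, the `PopulatorSketch` class and the use of `URLDecoder` are assumptions made for illustration only; they are not part of Netty or of this PR, and the real decoder has its own percent-decoding logic.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: only Populator and MapPopulator come from the comment above.
final class PopulatorSketch {

    interface Populator<T> {
        void setQueryParam(String name, String value, T target);
    }

    static final class MapPopulator implements Populator<Map<String, String>> {
        @Override
        public void setQueryParam(String name, String value, Map<String, String> map) {
            map.put(name, value);
        }
    }

    // Splits a raw query string (without the leading '?') and reports every
    // name/value pair to the populator, which writes into the caller's target.
    static <T> T decode(String rawQuery, Populator<T> populator, T target) {
        int pos = 0;
        int len = rawQuery.length();
        while (pos < len) {
            int amp = rawQuery.indexOf('&', pos);
            int end = amp < 0 ? len : amp;
            if (end > pos) { // skip empty segments such as "&&"
                int eq = rawQuery.indexOf('=', pos);
                boolean hasValue = eq >= 0 && eq < end;
                String name = rawQuery.substring(pos, hasValue ? eq : end);
                String value = hasValue ? rawQuery.substring(eq + 1, end) : "";
                // URLDecoder.decode(String, Charset) requires Java 10 or later.
                populator.setQueryParam(
                        URLDecoder.decode(name, StandardCharsets.UTF_8),
                        URLDecoder.decode(value, StandardCharsets.UTF_8),
                        target);
            }
            pos = end + 1;
        }
        return target;
    }

    public static void main(String[] args) {
        Map<String, String> params =
                decode("a=1&b=hello%20world", new MapPopulator(), new HashMap<>());
        System.out.println(params);
    }
}
```

Because the target is passed in by the caller, the decoder itself never has to allocate a map, which is the point of the proposal.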
thanks @vietj for the feedback, it's a great idea indeed! |
1107a07 to b3b27c3
After reading the different versions of the assembly produced by the benchmark, I believe it is worth analyzing what happens if the input sequence makes the JIT choose a different layout and ordering of the evaluated conditions (for the decoder state machine). Hence, I've added a parameter to "confuse" the JIT, and these are the numbers...
In short: the improvement is almost there, in many/most of the cases. I'll now implement the changes suggested by @vietj and we're ready to go.
a7a971f to 7a6842f
303b2b2f2c9dacafb1044e0dd7a25e34a0fa0e3b has delivered another interesting speedup in almost all cases but
this seems worthwhile, especially when the JIT branch profile is polluted
@vietj @chrisvest the only downside of unifying the map/callback (with context) approaches is that it always allocates an empty linked map when no parameters are found, while on
@franz1981 can you lazily create the collector and add a new method on the parameter collector to create it?

interface ParameterCollector<C> {
    C newCollector();
    ...
}

Otherwise use similar to
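A rough sketch of the lazy-creation idea follows, assuming a simplified decoder loop. Only `ParameterCollector`, `newCollector()` and `accept(...)` appear in this thread; `LazyCollectorSketch`, `ListMapCollector`, the `created` counter and the `decode` helper are hypothetical, and percent-decoding is left out to keep the example short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of lazily creating the accumulator on the first parameter.
final class LazyCollectorSketch {

    interface ParameterCollector<C> {
        C newCollector();
        void accept(String name, String value, C accumulator);
    }

    static final class ListMapCollector implements ParameterCollector<Map<String, List<String>>> {
        int created; // counts allocations, for illustration only

        @Override
        public Map<String, List<String>> newCollector() {
            created++;
            return new LinkedHashMap<>();
        }

        @Override
        public void accept(String name, String value, Map<String, List<String>> accumulator) {
            accumulator.computeIfAbsent(name, k -> new ArrayList<>(1)).add(value);
        }
    }

    // Returns null when the query holds no parameters, so nothing is allocated
    // and the caller can substitute a shared empty value.
    static <C> C decode(String rawQuery, ParameterCollector<C> collector) {
        C accumulator = null;
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            if (accumulator == null) {
                accumulator = collector.newCollector(); // first parameter seen
            }
            collector.accept(name, value, accumulator);
        }
        return accumulator;
    }

    public static void main(String[] args) {
        ListMapCollector collector = new ListMapCollector();
        Map<String, List<String>> params = decode("", collector);
        if (params == null) {
            params = Collections.emptyMap();
        }
        System.out.println(params + " allocations=" + collector.created);
    }
}
```

With this shape, the empty-query path costs no allocation, which addresses the downside raised in the previous comment.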
@@ -191,16 +187,61 @@ public String path() {
        return path;
    }

    @UnstableApi
I think you can remove this annotation as it's package-private anyway
    /**
     * Decodes {@link #uri()} parameters, reporting them in the provided {@code collector}'s map.
     */
    public void decodeParameters(Map<String, List<String>> parameters) {
I think renaming the param makes things clearer
-    public void decodeParameters(Map<String, List<String>> parameters) {
+    public void decodeParameters(Map<String, List<String>> out) {
     * and just store/report/filter them assuming random ordering, instead.
     */
    public interface ParameterCollector<C> {
        void accept(String name, String value, C accumulator);
-        void accept(String name, String value, C accumulator);
+        void collect(String name, String value, C accumulator);
    }

    private static int getFirstEscaped(String s, int from, int toExcluded, boolean isPath) {
        int cutOff = !isPath? '+' : '%';
Let's make this easier to read
-        int cutOff = !isPath? '+' : '%';
+        int cutOff = isPath ? '%' : '+';
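The thread does not show the body of `getFirstEscaped`, so the following is only a guess at how such a cut-off could be used: since `'%'` is 0x25 and `'+'` is 0x2B, any character above the cut-off cannot be an escape marker and can be rejected with one comparison. The class name and the loop body are assumptions; only the signature, the `cutOff` expression and the `-1` return value come from the diff.

```java
// Hypothetical sketch of a cut-off based scan for the first escaped character.
final class FirstEscapedSketch {

    static int getFirstEscaped(String s, int from, int toExcluded, boolean isPath) {
        // In a path only '%' needs decoding; in a query '+' (space) does too.
        int cutOff = isPath ? '%' : '+';
        for (int i = from; i < toExcluded; i++) {
            int c = s.charAt(i);
            if (c > cutOff) {
                continue; // letters, digits and most punctuation end up here
            }
            // When isPath is true, '+' is above the cut-off and never reaches this test.
            if (c == '%' || c == '+') {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(getFirstEscaped("a+b%20", 0, 6, false));
        System.out.println(getFirstEscaped("a+b%20", 0, 6, true));
    }
}
```

The common case, a character that needs no decoding, takes a single well-predicted branch, which fits the stated goal of separating predictable checks from unpredictable ones.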
        public void accept(String name, String value, Map<String, List<String>> accumulator) {
            List<String> values = accumulator.get(name);
            if (values == null) {
                values = new ArrayList<String>(1);
Is 1 a good default? Maybe use 4 so we don't need to expand the list right away.
I have no idea. I would expect query params to have a single value each most of the time, rather than several, but I don't have much experience with this.
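For reference, the accumulation pattern under discussion can be completed as below. The `get`/null-check/`new ArrayList<String>(1)` lines are from the diff; the `put`, the `add`, and the wrapper class are assumed completions. With an initial capacity of 1, the first value fits without resizing and the backing array only grows when a second value arrives for the same name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical completion of the accept(...) fragment shown in the diff.
final class SingleValueAccumulatorSketch {

    static void accept(String name, String value, Map<String, List<String>> accumulator) {
        List<String> values = accumulator.get(name);
        if (values == null) {
            // Capacity 1 assumes most parameters carry exactly one value.
            values = new ArrayList<String>(1);
            accumulator.put(name, values);
        }
        values.add(value);
    }

    public static void main(String[] args) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        accept("id", "42", params);
        accept("tag", "a", params);
        accept("tag", "b", params); // second value triggers one resize of the list
        System.out.println(params);
    }
}
```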
        return -1;
    }

    private static int skipIf(String s, int from, int to, int ch) {
-    private static int skipIf(String s, int from, int to, int ch) {
+    private static int skipIf(CharSequence s, int from, int to, int ch) {
    }

    private static boolean addParam(String s, int nameStart, int valueStart, int valueEnd,
                                    Map<String, List<String>> params, Charset charset) {
    private static int indexOf(String s, int from, int to, int ch) {
-    private static int indexOf(String s, int from, int to, int ch) {
+    private static int indexOf(CharSequence s, int from, int to, int ch) {
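As an illustration of what the `CharSequence` suggestion buys, here is a sketch of a bounded `indexOf` with the suggested signature. The body is an assumption, since the diff shows only the declaration; the point is that the same helper then works for `String`, `StringBuilder`, or any other `CharSequence` without copying.

```java
// Hypothetical body for the bounded indexOf helper; only the signature is from the review.
final class CharSequenceScanSketch {

    // Returns the index of the first occurrence of ch in [from, to), or -1 if absent.
    static int indexOf(CharSequence s, int from, int to, int ch) {
        for (int i = from; i < to; i++) {
            if (s.charAt(i) == ch) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String asString = "name=value&x=y";
        StringBuilder asBuilder = new StringBuilder(asString);
        // The same helper serves both types.
        System.out.println(indexOf(asString, 0, asString.length(), '='));
        System.out.println(indexOf(asBuilder, 5, asBuilder.length(), '='));
    }
}
```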
        return strBuf.toString();
    }

    private static int findPathEndIndex(String uri) {
-    private static int findPathEndIndex(String uri) {
+    private static int findPathEndIndex(CharSequence uri) {
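The body of `findPathEndIndex` is not shown in this thread. A plausible sketch, assuming the path ends at the first `'?'` or `'#'` and otherwise spans the whole URI, could look like this; the wrapper class and the body are illustrative assumptions.

```java
// Hypothetical body for findPathEndIndex; only the signature is from the review.
final class PathEndSketch {

    // Returns the index where the path part of the URI ends: the first '?' or '#',
    // or the URI length if neither is present.
    static int findPathEndIndex(CharSequence uri) {
        int len = uri.length();
        for (int i = 0; i < len; i++) {
            char c = uri.charAt(i);
            if (c == '?' || c == '#') {
                return i;
            }
        }
        return len;
    }

    public static void main(String[] args) {
        String uri = "/a/b?x=1";
        System.out.println(uri.substring(0, findPathEndIndex(uri)));
    }
}
```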
Motivation: decodeParams does not account for common decoding scenarios, making it highly unpredictable. Modifications: Arrange control flow to isolate predictable from unpredictable checks to improve CPU usage (stalled frontend cycles). Result: Faster query decoding
5275a10 to 329d1d5
@franz1981 what is the status of this PR?
Sorry @normanmaurer, I parked this because other work required more attention; I will change it again to unify some of the existing code paths.
Motivation:
decodeParams does not account for common decoding scenarios,
making it highly unpredictable.
Modifications:
Arrange control flow to isolate predictable from unpredictable
checks to improve CPU usage (stalled frontend cycles)
Result:
Faster query decoding