[UNDERTOW-2303] Introduced a new, faster utility for routing path templates #1566

Open: wants to merge 2 commits into base: master

Conversation

@dirkroets dirkroets commented Mar 8, 2024

Objective

Improve the performance of routing path templates to target handlers.

I started working on this code after I had seen this comment by Stuart Douglas on io.undertow.util.PathTemplateMatcher:

TODO: we can probably do this faster using a trie type structure, but I think the current impl should perform ok most of the time

The new path template router does incorporate a tree, but also numerous other performance enhancements. There are rudimentary performance benchmarks in io.undertow.util.PathTemplateRouterTest. To run the comparison, the @Test annotation on public void comparePerformance() must be uncommented, in which case the results are written to /tmp/path-template-router-performance.txt in CSV format. The attached chart is based on numerous runs of comparePerformance and, even though the benchmarks are admittedly rudimentary, they show a very significant improvement in performance. The new utility routes approximately 3x as many requests in the same amount of time as the previous utility, i.e. a 200% increase in throughput. Complexity should be O(log n) in the number of path templates added to the router:

performance-chart
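
For readers unfamiliar with the idea behind the TODO quoted above, here is a minimal sketch of a trie keyed on path segments. It is not the code in this PR: the class name SegmentTrie, its methods and the simplified {param} handling are assumptions made purely for illustration, and wildcards are left out entirely.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of a segment trie; not part of this PR. */
final class SegmentTrie<T> {

    private static final class Node<T> {
        final Map<String, Node<T>> statics = new HashMap<>(); // children for literal segments
        Node<T> param;     // single child shared by all "{name}" segments at this position
        String paramName;  // simplification: the first parameter name registered here wins
        T target;          // non-null if a template ends at this node
    }

    private final Node<T> root = new Node<>();

    void add(String template, T target) {
        Node<T> node = root;
        for (String seg : split(template)) {
            if (seg.startsWith("{") && seg.endsWith("}")) {
                if (node.param == null) {
                    node.param = new Node<>();
                    node.paramName = seg.substring(1, seg.length() - 1);
                }
                node = node.param;
            } else {
                node = node.statics.computeIfAbsent(seg, k -> new Node<>());
            }
        }
        node.target = target;
    }

    /** Returns the target for the path, or null. Matched parameters are written to params. */
    T route(String path, Map<String, String> params) {
        return match(root, split(path), 0, params);
    }

    private T match(Node<T> node, String[] segs, int i, Map<String, String> params) {
        if (i == segs.length) {
            return node.target;
        }
        // Literal segments take priority over parameters.
        Node<T> next = node.statics.get(segs[i]);
        if (next != null) {
            T found = match(next, segs, i + 1, params);
            if (found != null) {
                return found;
            }
        }
        if (node.param != null) {
            params.put(node.paramName, segs[i]);
            T found = match(node.param, segs, i + 1, params);
            if (found != null) {
                return found;
            }
            params.remove(node.paramName); // backtrack
        }
        return null;
    }

    private static String[] split(String path) {
        String trimmed = path.startsWith("/") ? path.substring(1) : path;
        return trimmed.isEmpty() ? new String[0] : trimmed.split("/");
    }
}
```

In this sketch a lookup descends one node per request segment instead of testing every registered template in turn, which is the general reason a tree helps; the router in the PR layers further optimisations on top of that.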

Jira

UNDERTOW-2303

Tests

  1. io.undertow.util.PathTemplateRouterTest contains tests to verify that requests are routed correctly based on different combinations of path templates, including wildcards. Some of these tests were specifically written during the development of io.undertow.util.PathTemplateRouter, but many of the assertions have been copied from io.undertow.util.PathTemplateTestCase to verify consistency against the previous implementation.
  2. io.undertow.server.handlers.PathTemplateHandlerTestCase was left untouched and verifies that io.undertow.server.handlers.PathTemplateHandler (which now uses the new router under the hood) routes requests in a way that is consistent with the previous implementation.
  3. io.undertow.server.handlers.RoutingHandlerTestCase was left untouched and verifies that io.undertow.server.RoutingHandler (which now uses the new router under the hood) routes requests in a way that is consistent with the previous implementation.

…plates. Updated existing path template based routing handlers to use the new utility.
@fl4via fl4via added the enhancement Enhances existing behaviour or code label Mar 13, 2024
@baranowb baranowb added under verification Currently being verified (running tests, reviewing) before posting a review to contributor failed CI Introduced new regession(s) during CI check waiting peer review PRs that edit core classes might require an extra review labels Mar 14, 2024
@baranowb
Contributor

@dirkroets it's failing with:
Error: Medium: Dead store to pathTemplate in io.undertow.server.handlers.PathTemplateRouterHandler.handleRequest(HttpServerExchange) [io.undertow.server.handlers.PathTemplateRouterHandler] At PathTemplateRouterHandler.java:[line 62] DLS_DEAD_LOCAL_STORE
Error: Medium: Unread field: io.undertow.util.PathTemplateRouter$RouterFactory.currentSegment [io.undertow.util.PathTemplateRouter$RouterFactory] At PathTemplateRouter.java:[line 1439] URF_UNREAD_FIELD
Error: High: Non-virtual method call in io.undertow.util.PathTemplateRouter$SimpleBuilder.newBuilder() passes null for non-null parameter of newBuilder(Object) [io.undertow.util.PathTemplateRouter$SimpleBuilder] At PathTemplateRouter.java:[line 2087] NP_NULL_PARAM_DEREF_NONVIRTUAL

This means that spotbugs has a problem with it. Enable it locally with "-Dfindbugs".

Contributor

@baranowb baranowb left a comment

I will look at the rest of the code once I get some free time.

Objects.requireNonNull(uriTemplate);
Objects.requireNonNull(handler);

// Router builders are not thread-safe, so we need to synchronize.
Contributor

Just a quick glance:
1. Why would they need to be?
2. In which cases would this be needed?
3. I'm not sure such a builder design is optimal. AFAIR, other builders are context one-offs.
4. AFAIR, in general, if there is no need to sync, we don't do it; we rather rely on a local var/single failure.

Author

The short answer is for backwards compatibility. The original PathTemplateHandler class (also PathTemplateRouter) provided synchronised mutation of the underlying routes.

The longer answer:

  1. The underlying router itself is immutable and therefore thread-safe. The builder that creates the router is not thread-safe, as it is intended to be used from a single thread to instantiate a router. For most use cases that I can think of, the routes (path templates) should be discovered/added during start-up of a service. Once discovered, a router can be built and then used to serve many requests. In most services, instantiating a router is therefore a one-off invocation versus probably millions of invocations of the route method, so I believe that overall performance is served well by a slightly more expensive (computationally) instantiation process in exchange for a very cheap routing process. Yet, this does not mean that calling the builder must be synchronised. In my opinion, most use cases should instantiate a router and pass it to an immutable handler such as the new PathTemplateRouterHandler class, which I provided as an example. We can offer builders for handlers, for convenience's sake, that do the same thing as PathTemplateHandler and PathTemplateRouter but create immutable instances of those handlers, instead of the current implementations that support mutating the underlying routers. This would, however, be a breaking change for developers who use and mutate the current implementations, so my recommendation would be to leave the current implementations as they are for backwards compatibility, to deprecate them, and to offer the new pattern as new, immutable handler classes that developers can switch to.
  2. I believe there are very few use cases that require synchronised mutation of the routing handlers. For those use cases, developers should implement their own synchronised mutation of the underlying handlers. As I have mentioned before, I only implemented the synchronised mutation to be consistent with the previous implementations, in order to avoid introducing breaking changes.
  3. I agree that this is not optimal, but I am not sure if we are prepared to introduce this as a breaking change?
  4. Agreed. In this case there should be no need to sync for 99%+ of use cases.

As a last note, in case it is not obvious from the code and in case we want to keep the backwards compatibility: the synchronisation only happens when the underlying routers are modified, i.e. when new path templates are added. It only synchronises the mutations amongst multiple threads trying to mutate the routes concurrently. The actual routing of requests is never synchronised, not even when other threads are busy mutating the routers. For this reason I believe that it is okay to leave the synchronisation in place in order to maintain backwards compatibility, and then perhaps to deprecate these classes and offer alternatives in a next PR?
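
The pattern described in this reply (mutations take a lock, routing never does) can be sketched roughly as follows. This is a simplified stand-in, not the handler code from the PR: MutableRoutes is a hypothetical name, a plain map with exact-match lookup plays the role of the router, and a second map plays the role of the non-thread-safe builder.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a mutable routing handler; not a class from this PR. */
final class MutableRoutes {

    private final Object lock = new Object();

    // Stands in for the builder: not thread-safe, only touched while holding the lock.
    private final Map<String, String> builder = new HashMap<>();

    // Stands in for the immutable router: replaced wholesale, never modified in place.
    private volatile Map<String, String> router = Collections.emptyMap();

    /** Mutation path: synchronised so that concurrent writers do not corrupt the builder. */
    void add(String template, String target) {
        synchronized (lock) {
            builder.put(template, target);
            // Publish a fresh immutable snapshot; readers see either the old or the new one.
            router = Collections.unmodifiableMap(new HashMap<>(builder));
        }
    }

    /** Routing path: a single volatile read, no locking, even while add() is running. */
    String route(String path) {
        return router.get(path);
    }
}
```

A request thread calling route("/a") while another thread is inside add(...) simply uses whichever snapshot was published last, which is why the read path needs no lock in this sketch.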

if (paths.size() == 1) {
return "path-template( " + paths.toArray()[0] + " )";
final List<PathTemplateRouter.PatternEqualsAdapter<PathTemplateRouter.Template<Supplier<HttpHandler>>>> templates;
synchronized (lock) {
Contributor

toString impeding normal operation is not ideal.

Author

Agreed. I have updated this. I believe that this toString method is probably only used for debugging purposes, given the size of the string produced (consistent with the original implementation). It may now produce results that aren't entirely consistent when called concurrently with routes being mutated, but that is also consistent with how the original implementation worked.

@@ -0,0 +1,72 @@
/*
* JBoss, Home of Professional Open Source.
* Copyright 2014 Red Hat, Inc., and individual contributors
Contributor

Hmm, 2014, looks like we are traveling back in time?

Author

Thanks, I just copied and pasted the header from another class and missed the instruction to update the dates. Will keep this in mind in future. Will be fixed when I update the PR

* A handler that matches URI templates.
*
* @author Dirk Roets dirkroets@gmail.com
* @since 2023-07-20
Contributor

Not sure if we are supposed to be so pedantic as to require a bump on this as well. @fl4via will know.

Member

@dirkroets @baranowb because it will be merged in 2024, I think we should either bump it or just omit it, keeping in mind that we are going to have 2024 in the copyright header above.

Author

@fl4via @baranowb, I will remove it from the new classes. The since tags are only there because my IDE adds them for some other projects that I'm working on.

@fl4via
Member

fl4via commented Mar 15, 2024

@dirkroets Just FYI, to fully verify CI on your machine, run:
mvn clean package -Pproxy -Dmaven.test.failure.ignore=true -DfailIfNoTests=false -fae > output.txt
You will notice that it is running the spotbugs plugin, and you will also be able to see any errors that can arise in the different test modes.

@fl4via fl4via added the waiting PR update Awaiting PR update(s) from contributor before merging label Mar 15, 2024
@fl4via
Member

fl4via commented Mar 15, 2024

@dirkroets Just FYI, to fully verify CI on your machine, run: mvn clean package -Pproxy -Dmaven.test.failure.ignore=true -DfailIfNoTests=false -fae > output.txt You will notice that it is running the spotbugs plugin, and you will also be able to see any errors that can arise in the different test modes.

I stand corrected: to really fully verify CI on your machine you will also need to run the tests with the IPv6 option. However, you need to do this only if your fix edited something related to IPv6, which is usually not the case:
mvn clean package -Pproxy -Dmaven.test.failure.ignore=true -DfailIfNoTests=false -fae -Dtest.ipv6=true > outputIpv6.txt

…eaders to 2024. Removed synchronisation for toString(). Fixed Spotbugs issues.
@dirkroets
Author

@fl4via @baranowb I have added a second commit to this PR that fixes the CI issues and includes some documentation improvements. According to the documentation one can edit the previous commit, provided that the previous commit was not a big commit. I am not sure what qualifies as a big commit, so I played it safe and added a second commit.

@baranowb baranowb removed the failed CI Introduced new regession(s) during CI check label Apr 3, 2024
@baranowb
Contributor

baranowb commented Apr 3, 2024

@dirkroets Cool. I will try to reserve some time to crunch through this change.

@dirkroets
Author

Thanks, @baranowb . Let me know if you have any questions or if there is anything I can assist with.

@baranowb
Contributor

I am slowly making progress with this. I will most likely have a bigger window to go through the bulk of it in ~2 weeks.
@dirkroets ^^

@baranowb
Contributor

baranowb commented May 7, 2024

Hey. Sorry for the delay, I wasn't able to log in last week. Anyway, pretty neat design. I'm going to add a general comment here and some inline ones with the review. I did not copy/paste comments when the point was either made above or is part of the general review.
So:

  1. Code formatting and XML editor stuff: https://github.com/wildfly/wildfly-core/tree/main/ide-configs/eclipse
  2. router.apply: is there a hidden meaning behind the method name? Why not "match"? (Or is it just backward compat/sentiment?) I'm fairly sure the XML meta-description should be purged?

Maintainability:
2. TODO: Javadoc and some rudimentary description. For example, what is a template in RoutingHandler.add? Especially given the warning at the start of PathTemplateRouter.

2a) Example values; concepts need to be explained, e.g. for PathTemplateRouter.Template, plus usage/expected results.

  3. Add descriptions to map/stream ops. Usually those are "well known" during design, but later on it becomes a hassle to maintain them without a hint. Example: createAllMethodsRouter.

  4. To add to the above: the difference in parametrization and in the invocation of parseTemplate between getOrAddMethodRoutingMatchBuilder and createAddTemplateIfAbsentConsumer. It seems like the Supplier is just used as a non-null ref when the actual path/routing has not been defined.

  5. Typos :) "Leave" -> "Leaf"?

  6. Generally, by the time the "Leave" typo comes into play, a hefty man page would be a good idea.

  7. PathTemplateRouter should possibly be split into package-level classes if it's supposed to be hermetic?

@baranowb baranowb self-requested a review May 7, 2024 05:42
* was specifically to provide something that is very fast even at the expense of maintainable code. The router
* does a very simple thing and there should arguably be no need to constantly work on this code.
*/
public class PathTemplateRouter {
Contributor

Confusing declaration: there is an interface called Router, yet PathTemplateRouter does not implement it?

//<editor-fold defaultstate="collapsed" desc="PatternElement inner class">
/**
* Interface for elements that represent a URL path pattern. The objective of this interface is to provide
* a contract for comparing path patterns. For example:
Contributor

How is this contract defined?

*
* @param <T> Type of pattern elements.
*/
public static final class PatternEqualsAdapter<T extends PatternElement> implements
Contributor

Is there a need for parametrization?
Also, AFAIR, <T> is used in most of the classes here, classes whose parameters (I think) don't relate; this is a tad confusing?

*
* Extensions of this class must be immutable.
*/
private abstract static class TemplateSegment implements PatternElement {
Contributor

  1. Examples, relations.
  2. TemplateSegment vs PatternElement: most likely some explanation would suffice? The naming scheme just does not fall into place?

/**
* Index for the segment inside of the template.
*/
protected final int segmentIdx;
Contributor

What is it? Why must it be greater than 0?


/**
* A simple router that routes paths containing static segments or parameters. Specifically this router has
* an optimisation - based on the segment counts of requests - that do not support wild cards.
Contributor

Would be good to explain said optimization?
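
One plausible reading of the optimisation mentioned in the Javadoc above, offered as an assumption rather than a description of the PR's SimpleRouter: if no template contains a wildcard, every template matches requests with exactly one segment count, so templates can be bucketed by that count and a request only needs to be compared against its own bucket. The class SegmentCountRouter and its methods below are hypothetical, and the sketch ignores priorities between static and parameter segments.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of routing by segment count; not the SimpleRouter from this PR. */
final class SegmentCountRouter<T> {

    private static final class Entry<T> {
        final String[] segments;
        final T target;

        Entry(String[] segments, T target) {
            this.segments = segments;
            this.target = target;
        }
    }

    // Bucket of templates per segment count.
    private final Map<Integer, List<Entry<T>>> byCount = new HashMap<>();

    void add(String template, T target) {
        String[] segs = split(template);
        byCount.computeIfAbsent(segs.length, k -> new ArrayList<>()).add(new Entry<>(segs, target));
    }

    T route(String path) {
        String[] segs = split(path);
        List<Entry<T>> candidates = byCount.get(segs.length);
        if (candidates == null) {
            return null; // no template has this many segments
        }
        for (Entry<T> entry : candidates) {
            if (matches(entry.segments, segs)) {
                return entry.target;
            }
        }
        return null;
    }

    // Both arrays have the same length here because they come from the same bucket.
    private static boolean matches(String[] template, String[] path) {
        for (int i = 0; i < template.length; i++) {
            boolean isParam = template[i].startsWith("{") && template[i].endsWith("}");
            if (!isParam && !template[i].equals(path[i])) {
                return false;
            }
        }
        return true;
    }

    private static String[] split(String path) {
        String trimmed = path.startsWith("/") ? path.substring(1) : path;
        return trimmed.isEmpty() ? new String[0] : trimmed.split("/");
    }
}
```

Under this reading, wildcards are excluded because a template such as /files/* would match requests of many different segment counts and so could not live in a single bucket.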


@Override
public String toString() {
return "SimpleRouter{" + '}';
Contributor

?

*
* @param <T> Target type.
*/
private static class MatcherLeaveArterfacts<T> {
Contributor

  1. 'Leaf' ?
  2. Explanation/visualization? This, like a few other definitions, is a key inner-working class; it has to be described in a way that will allow maintenance.

private Matcher[] matchers;
private Matcher[] wildCardMatchers;

private RouterFactory(
Contributor

same here and most likely any other code below.

*
* @return Reference to the builder.
*/
public Builder<S, T> removeTemplate(final String pathTemplate) {
Contributor

Not ideal to parse again. Possibly add method to remove by reference?
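
A rough sketch of what the suggestion could look like: the string overload keeps working, while callers that already hold a parsed template can skip the second parse. The names ParsedTemplate, TemplateBuilder and addTemplate are hypothetical and do not correspond to the Builder or Template classes in the PR.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical parsed form of a path template; equality is based on the parsed segments. */
final class ParsedTemplate {
    private final String[] segments;

    private ParsedTemplate(String[] segments) {
        this.segments = segments;
    }

    static ParsedTemplate parse(String pathTemplate) {
        String trimmed = pathTemplate.startsWith("/") ? pathTemplate.substring(1) : pathTemplate;
        return new ParsedTemplate(trimmed.isEmpty() ? new String[0] : trimmed.split("/"));
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof ParsedTemplate
                && Arrays.equals(segments, ((ParsedTemplate) other).segments);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(segments);
    }
}

/** Hypothetical builder offering removal by string and by reference. */
final class TemplateBuilder<T> {
    private final Map<ParsedTemplate, T> templates = new HashMap<>();

    /** Returns the parsed template so the caller can keep it for later removal. */
    ParsedTemplate addTemplate(String pathTemplate, T target) {
        ParsedTemplate parsed = ParsedTemplate.parse(pathTemplate);
        templates.put(parsed, target);
        return parsed;
    }

    /** Convenience overload: parses the string, then delegates. */
    TemplateBuilder<T> removeTemplate(String pathTemplate) {
        return removeTemplate(ParsedTemplate.parse(pathTemplate));
    }

    /** Removal by reference: no parsing needed. */
    TemplateBuilder<T> removeTemplate(ParsedTemplate template) {
        templates.remove(template);
        return this;
    }

    int size() {
        return templates.size();
    }
}
```

A caller would keep the value returned by addTemplate and pass it back to removeTemplate later, paying the parsing cost only once.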

@dirkroets
Author

Hi @baranowb

A general question from my side... what would be the best way to respond to requests for clarification? I.e.

  1. Clarify by submitting responses to comments here in the PR; or
  2. Clarify by adding additional comments in the code and updating the PR; or
  3. Clarify by creating and updating a separate man page. If this is the preferred option, then I would appreciate some advice on the best format and where to put it.
  4. Combination of the above

Thanks for all your effort with this PR!

Labels
enhancement Enhances existing behaviour or code under verification Currently being verified (running tests, reviewing) before posting a review to contributor waiting peer review PRs that edit core classes might require an extra review waiting PR update Awaiting PR update(s) from contributor before merging
Projects
None yet
3 participants