Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP client: Expose maximum time to keep alive idle connection with persistent connections #488

Open
pbrisson opened this issue Feb 9, 2023 · 5 comments

Comments

@pbrisson
Copy link

pbrisson commented Feb 9, 2023

Hi there!

Good job with taking care of #468. Having the ability to use persistent connections is definitely a game changer for some use cases!

I was wondering if it would be a good idea to expose the $maximumTimeToKeepAliveIdleConnection variable (https://github.com/reactphp/http/blob/1.x/src/Io/ClientConnectionManager.php line 39) in order to support ongoing communication with a webserver, especially with REST API and architectures such as OpenResty.

I'm aware it represents some work as a lot of objects have to be exposed (client, transaction, sender, etc.), but I was wondering if it would be worth it for that particular use case. I noticed an improvement of 100ms per chained request on average by changing the timeout from 1ms to 10s.

Is there a reason why connections are considered idle only under 1ms? I think it might be too low (which is the reason why I'm flagging this as an issue and not a suggestion).

Thanks a lot for ReactPHP!

@pbrisson pbrisson added the bug label Feb 9, 2023
@WyriHaximus
Copy link
Member

Hey @pbrisson!

So I fully agree this should be a configuration option. However, there are a few things at play why we didn't go for it out of the box. First of all, as @clue mentioned in #486, we first want to gather some feedback before enabling this by default. The other side of this is we need to come up with a decent and future proof way of passing in configuration.

While reviewing this I also flagged this with @clue that for now and he agreed we need to do something about this. But for now to first get it in and gather production feedback on this feature before tweaking it. If you lower the number to 1s or 0.5s. At which point do you gain/lose that 100ms improvement?

Another thing is that if we raise it to a certain level the loop stays active and we probably should add a close method to the Browser to instruct it to close those idle connections.

@SimonFrings
Copy link
Member

So I fully agree this should be a configuration option.

I agree what @WyriHaximus said here, it makes a lot of sense to introduce a configuration for the $maximumTimeToKeepAliveIdleConnection. I worked together with @clue on the HTTP keep-alive feature and we were already aware of this being the next logical addition after filing the pull request.

Is there a reason why connections are considered idle only under 1ms? I think it might be too low (which is the reason why I'm flagging this as an issue and not a suggestion).

@clue mentioned a reason for this in #486, saying:

Additionally, idle connections will currently only be kept alive for a maximum of 1ms between requests to make sure consumers of this library will not be affected negatively. By using a very short timeout period, the loop does not keep running noticeably longer than without this feature.

Using a small time of 1ms made sense as a first step in order to get the feature out without bloating the whole ticket, because raising the currently set 1ms would involve a lot more test and edge cases that need to be covered.

I think we'll keep the short timeout as a default in the future, while allowing to adjust the timer to your needs, so to conclude, this is definitely on our roadmap for HTTP, also mentioned by @clue in his pull request:

In the future, we should give more control over this timeout to allow reusing connections also for common use cases of periodically polling hosts using a longer interval.

I hope this answers your questions and shows some of our plans for further development of HTTP keep-alive 👍

@frosty00
Copy link

I also agree that 1ms is too short a default and the value should at least configurable.

@pbrisson
Copy link
Author

pbrisson commented Feb 21, 2023

@WyriHaximus @SimonFrings Thanks for your answers.

I think users of the library can only benefit of larger timeouts as multiple requests being made to the same URL are just reusing the existing connection then (whereas the 100ms improvement I mentioned).

This said, it totally makes sense that you wanted to minimize the impact of that feature while it's being tested! I'm happy to help testing that feature when it's implemented. I will refrain from submitting a PR as I said, I'm aware it represents a lot of refactoring to expose the variable that will make the timeout configurable. For now I switched to a pure React\Socket implementation, which is more flexible for my use case.

@clue
Copy link
Member

clue commented Feb 25, 2023

@pbrisson Good input and I completely agree that the maximum time to keep alive idle connections should be configurable 👍

I have indeed implemented the HTTP keep alive feature with a 1ms default in #486 as a starting point only and to add more options in follow-up PRs. The 1ms default was specifically chosen as a reasonable default to get more feedback how people plan to use this feature and because the 1ms default already significantly improves common use cases like following redirects and handling queued requests as described in https://github.com/reactphp/http#concurrency without exhibiting too many side-effects.

Of course, there are plenty use cases where a higher maximum time to keep alive idle connections can be useful. For example, if you constantly poll a remote API every 3s, you're probably okay with reusing the same connection and would want to avoid the overhead of recreating the underlying connection for each request.

However, higher timeout values also introduce new problems that are next to non-existent with the current 1ms default. Idle HTTP/1.x connections may be closed by either side without notice which means there's has an inherent race condition that might happen if the server closes the connection exactly when the client sends the next request. Both sides may specify the HTTP Keep-Alive: timeout=120 header (https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Keep-Alive), but this isn't frequently used in practice and also doesn't really fix this race condition (but makes it less likely to occur because we at least have an idea when a connection is expected to be closed approximately).

On top of this, ReactPHP's event loop will keep running as long as there is something to do. This means that if we would keep alive idle connections for 60s, the following example would keep running for 1s + 1 request roundtrip + 60s.

$browser = new React\Http\Browser();

$response = React\Async\await($browser->get('https://example.com/'));
assert($response instanceof Psr\Http\Message\ResponseInterface);
var_dump($response->getStatusCode());

Loop::addTimer(1, function () use ($browser) {
    $response = React\Async\await($browser->get('https://example.com/'));
    assert($response instanceof Psr\Http\Message\ResponseInterface);
    var_dump($response->getStatusCode());
});

For long running applications such as server daemons this might not be a problem, but this clearly demonstrates why the current default of 1ms is less problematic (see clue/reactphp-redis#130 and others for similar issues). From the HTTP client perspective, this situation is rather hard to detect, as there's currently no reasonable way to know no further request will (or can) be sent.

That said, I agree that the maximum time to keep alive idle connections should be configurable to give consumers of this project more control for more specific requirements. I've looked into adding a new withKeepaliveTimeout() method (or similar) and believe this wouldn't be too much work. As pointed out above, I think this should also take the HTTP Keep-Alive: timeout=120 header into account for both the client and server side. On top of this, we may want to look into the existing class hierarchy and may potentially use class destructors to detect when no further requests will (or can) be sent (possibly using WeakReference available as of PHP 7.4+).

Long story short: This is definitely going to happen, but requires significant work. For HTTP keep-alive as implemented in #486 and referenced tickets, you're already looking at several weeks(!) of work, so I think this might easily be one of the largest feature additions in ReactPHP lately. We now have a solid base for this feature, so I'd like to thank our sponsors for making this possible!

If you'd like to support this development, please reach out or consider sponsoring ReactPHP! ❤️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants