Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HttpClient] Dns cache issue in long running process #43929

Closed
rmikalkenas opened this issue Nov 4, 2021 · 8 comments
Closed

[HttpClient] Dns cache issue in long running process #43929

rmikalkenas opened this issue Nov 4, 2021 · 8 comments

Comments

@rmikalkenas
Copy link
Contributor

rmikalkenas commented Nov 4, 2021

Symfony version(s) affected

5.3.10

Description

Recently came up with very interesting bug or I would say a side affect of symfony http client using in a long running process:

I have a long running process which consumes async envelopes and does http requests to constant host (process might run up to couple of hours). At some point of time the host which we are requesting changed its IP address and therefore my consumer process started to fail to send http requests. After a brief investigation I have found out that symfony's http client has an dns cache in its base clients (CurlHttpClient, NativeHttpClient) and it only resets on class destruction.

I do understand why this dns cache is implemented on http client, but looks like it might be a bottleneck for a long running process.

I could help with a possible fix and PR preparation, but need more insights on this issues

How to reproduce

My environment:

  • Php version: 7.4.21

  • Curl version: 7.64.0

  • Symfony version: 5.3.10

  • http_client.yaml configuration:

framework:
    http_client:
        default_options:
            retry_failed:
                max_retries: 5
  • under the hood using CurlHttpClient

  • all the services injecting http client by using Symfony\Contracts\HttpClient\HttpClientInterface typehint

  • have a long running process which does http requests to same host using symfony http client

  • DNS changes host ip

  • the same long running process's requests start to fail (due to using old ip)

Possible Solution

  • one of the solutions would be to reduce consumer lifetime so that it would restart resulting in refreshing app state, but from my perspective its just a workaround and not a future proof solution.
  • another solution would be to add a kernel.reset tag for CurlHttpClient and make sure that with every handled envelope the ServicesResetter would be called. I even tried to do it locally, but without luck.. What I tried:
    • On vendor/symfony/framework-bundle/Resources/config/http_client.php http_client definition added ->tag('kernel.reset', ['method' => 'reset']) with a hope that it would be included to ServicesResetter, but as I'm having a retry configuration (see on reproduce tab) the http_client service at some point is removed from container and replaced with decorated http_client.retryable (see screenshot in additional context) service and this http_client.retryable gets kernel.rest tag, but unfortunately it does not work as it doesnt implement ResetInterface. If I remove retry configuration, then it works fine.

Additional Context

image

@nicolas-grekas
Copy link
Member

Symfony 5.4 will provide your 2nd solution: #41163

We could also remove the entry from the DNS cache when a transport error happens at connection time. Would you like to give it a try?

@stof
Copy link
Member

stof commented Nov 4, 2021

@nicolas-grekas Symfony 5.4 will not provide that second solution, as the http_client service will not reset its DNS cache when resetting services.

And the reason why the attempt at doing the second solution does not work is because the kernel.reset tag is transferred to the decorator service.
@nicolas-grekas given that resetting the service generally relies on calling a method that is not part of the main interface of the service, but rather an internal implementation detail of that implementation, I think it the kernel.reset tag may need the special treatment too in

// container.service_locator and container.service_subscriber have special logic and they must not be transferred out to decorators
foreach (['container.service_locator', 'container.service_subscriber'] as $containerTag) {

@nicolas-grekas
Copy link
Member

Won't #43333 still forward the call to reset?
Anyway, I agree we should add kernel.reset to that list.
It also looks like NativeHttpClient is missing a way to reset its dnscache.
PR welcome to fix both issues.

@rmikalkenas
Copy link
Contributor Author

@nicolas-grekas I can help with PR, which branch should I target?

@nicolas-grekas
Copy link
Member

I'd say 5.4 please

@nicolas-grekas
Copy link
Member

Actually, NativeHttpClient might no need a reset: its dnscache is emptied as soon as all requests completed already.
We could still remove the entry from the DNS cache when a transport error happens, for all clients.

@rmikalkenas
Copy link
Contributor Author

rmikalkenas commented Nov 6, 2021

As I mentioned in my first message, I'm using a CurlHttpClient and not NativeHttpClient. After spending some time to understand how CurlHttpClient works I came up with few insights:

  1. if you are not passing $options['resolve'] then there is no way for $this->multi->dnsCache to be populated in CurlHttpClient. That means that my issue with wrongly resolved IP address has nothing to do with CurlHttpClient, instead it might be some kind of webserver dns cache (hardly to believe, I'm using kubernetes infrastructure and there are no dns cache in it..) or some kind of cache which is used by php curl extension or curl itself (even though I see that curl option CURLOPT_DNS_USE_GLOBAL_CACHE is set with false value, meaning to disable dns cache).
  2. I dont even think that wiping dns cache on transport exception will help.. As my situation is quite interesting: the service changed its IP address, but the old IP address remained valid with all requests to respond with 500 status code. Which means that http client thrown server exception and not transport. You may ask how do I know if the IP is wrong? I looked at http calls log and found out that some of thems primary_ip info field contains old IP address (I guess this field is populated by php curl extension)
  3. Its quite complex to reproduce such behaviour, but my guessing is that on long running processes after every request CurlClientState::$handle has to be closed and reinit

Totally different problem which I found is that kernel.reset and ResetInterface is not working on services which are decorated. What I have found out and tried:

  • added kernel.reset to Symfony\Component\DependencyInjection\Compiler\DecoratorServicePass to not pass this tag to decorator service. But after recompiling container and checking ServicesResetter noticed that resetable service is not in a $resettableServices iterator.
  • also noticed that DecoratorServicePass is renaming a service which is going to be decorated and adding an alias with its original name. But here comes the problem: if original service is private, then the alias is also going to be private as well which means that during container compilation RemovePrivateAliasesPass is going to remove such alias.
  • after some digging found out, that ResettableServicePass is added to compiler with a type TYPE_BEFORE_OPTIMIZATION, while DecoratorServicePass is added as TYPE_OPTIMIZE. Taking into consideration my previous bullet, we have a service with a kernel.reset tag added to ServicesResetter with a reference that has service id which is not in a container anymore, leading to invalid reference.
  • I also tried to make this alias as public, but without luck. Even though container has this alias, but ServicesResetter $resettableServices iterator is missing such service.

cc @nicolas-grekas maybe you will have more insights or corrections if I'm wrong at some point

@rmikalkenas
Copy link
Contributor Author

@nicolas-grekas I managed to reproduce an issue with wrongly resolved primary_ip: https://github.com/rmikalkenas/dns-cache-issue

Take a look at https://github.com/rmikalkenas/dns-cache-issue/blob/master/src/Command/HttpClientCommand.php
The second request should fail and primary_ip should be the one added to hosts file, but it remains the same.

Tried to add \CURLOPT_DNS_CACHE_TIMEOUT => 1, but it does not solve an issue. The only way that helped is to add

curl_multi_close($this->handle);
$this->handle = curl_multi_init();

to reset method and explicitly call it after first request

nicolas-grekas added a commit that referenced this issue Nov 8, 2021
…dle on reset (rmikalkenas)

This PR was submitted for the 5.4 branch but it was merged into the 4.4 branch instead.

Discussion
----------

[HttpClient] Curl http client has to reinit curl multi handle on reset

| Q             | A
| ------------- | ---
| Branch?       | 4.4
| Bug fix?      | yes
| New feature?  | no
| Deprecations? | no
| Tickets       | #43929
| License       | MIT

All the details are described in #43929

Commits
-------

dc4380f [HttpClient] Curl http client has to reinit curl multi handle on reset
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants