Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

max_connections is not respected and connections pool grows until connections are closed by idle timeout #648

Open
DmitryKakurin opened this issue Aug 14, 2020 · 18 comments
Assignees

Comments

@DmitryKakurin
Copy link

DmitryKakurin commented Aug 14, 2020

Versions:

Hackney: 1.16.0
HttPoison: 1.7.0
Elixir/Erlang:

Erlang/OTP 21 [erts-10.3] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1]
Elixir 1.10.3 (compiled with Erlang/OTP 21)

Simplest Repro:

Set max pool size to 1 and hit 2 different servers.

Interactive Elixir (1.10.3) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> :hackney_pool.start_pool(:default, max_connections: 1)
:ok

iex(2)> HTTPoison.get("example.com")
{:ok,  %HTTPoison.Response{iex(3)> HTTPoison.get("example.net")
{:ok,  %HTTPoison.Response{iex(4)> :hackney_pool.get_stats(:default)
[name: :default, max: 1, in_use_count: 0, free_count: 2, queue_count: 0]

free_countis is 2, but it should not be greater than max, which is 1.

Suspected bug location

I suspect the bug was introduced in this checkin: ec5b90c
Note how the check if PoolSize =< MaxConn was not carried over from handle_call({checkin,... into deliver_socket function for the empty case.

The new check dict:size(Clients) >= MaxConn in handle_call({checkout... ensures that no more than MaxConn clients will be waiting for connection at the same time, but is not sufficient to limit the total pool size.

@benoitc
Copy link
Owner

benoitc commented Sep 11, 2020

Does the requests return to the pool? What dis the Response thing returned ? Does it contains the body?

@DmitryKakurin
Copy link
Author

@benoitc the code repro above is literally runnable (example.com and .net are real existing domains). Both requests return 200.
Given free_count: 2 I assume yes, connections are returned to the pool.

@jdppettit
Copy link

I'm running into this problem as well - any updates on where things are on this?

@benoitc
Copy link
Owner

benoitc commented Oct 22, 2020 via email

@sorliem
Copy link

sorliem commented Dec 16, 2020

I was bit pretty severely by this bug a couple days ago. Had to disable pooling entirely. Is there a release coming coming out that contains the fix?

@benoitc
Copy link
Owner

benoitc commented Dec 16, 2020

I was bit pretty severely by this bug a couple days ago. Had to disable pooling entirely. Is there a release coming coming out that contains the fix?

What happened. How did you reproduce the issue?

There is a relase that will land today that is changing the way connections are handled, but the problem above shouldn't trigger anything severe if connections are released correctly to the pool.

@sorliem
Copy link

sorliem commented Dec 17, 2020

I have code in my system using hackney that calls 3rd party services to get data. I saw the :checkout_timeout error explode, added some metrics to monitor the in_use value returned from :hackney_pool.get_stats(:default) and redeployed.

The following screenshot is a new k8s deploy of 10 pods each with 1 default hackney pool running:
2020-12-11-hackney_pool_3
The lines going steadily up is each pod reporting the in_use connections and all are rising to the 200 max connections I configured. Shortly after this time the checkout_timeout errors started showing up.

@benoitc
Copy link
Owner

benoitc commented Dec 17, 2020

I have code in my system using hackney that calls 3rd party services to get data. I saw the :checkout_timeout error explode, added some metrics to monitor the in_use value returned from :hackney_pool.get_stats(:default) and redeployed.

The following screenshot is that new k8s deploy of 10 pods each with 1 default hackney pool running:
2020-12-11-hackney_pool_3
The lines going steadily up is each pod reporting the in_use connections and all are rising to the 200 max connections I configured. Shortly after this time the connect_timeout errors started showing up.

does your code always ensure to read the body? That said I am working on replacing the pool. The code will be ready for consumption tomorrow. Took more time than expected...

@dbhobbs
Copy link

dbhobbs commented Dec 17, 2020

Is reading the body required? I don't see that mentioned in any of the documentation. 🤔

@benoitc
Copy link
Owner

benoitc commented Dec 17, 2020

Is reading the body required? I don't see that mentioned in any of the documentation. 🤔

Yes to release the socket you need either read the body (which is always done when using the with_body option in the request, or skip it using the skip_body function or similar. Closing the request does also work. If a response is not read completely and the process is still active hackney is actually consider it as an active session for now.

@benoitc
Copy link
Owner

benoitc commented Dec 19, 2020

that should indeed be mentionned by the doc ...

@sorliem
Copy link

sorliem commented Dec 19, 2020

Oh, ok. We may have some locations where we are not reading the body. Would you be able to point me to the right location in the documentation that says that?

@benoitc
Copy link
Owner

benoitc commented Dec 26, 2020

Oh, ok. We may have some locations where we are not reading the body. Would you be able to point me to the right location in the documentation that says that?

that's not written explicitly in the doc i think. A connection is removed from the pool automatically if the process is closed before releasing it or if the whole body has been read, otherwise (logically I would say) there is no way to know if the connection has to be released.

@alexgleason
Copy link

https://github.com/benoitc/hackney/blob/master/NEWS.md

1.17.0

fix memory leak in connection pool

Is this fixed now?

We downgraded hackney to 1.15.2 and wondering if it's safe to upgrade again: https://git.pleroma.social/pleroma/pleroma/-/issues/2101

@benoitc
Copy link
Owner

benoitc commented Apr 30, 2021 via email

@kuznetskikh
Copy link

kuznetskikh commented Nov 20, 2021

Hello @benoitc!
Any update on this issue?

I'm facing with it on httpoison 1.8.0 and hackney 1.18.0.
Hackney still doesn't respect max_connections as in @DmitryKakurin's example.

@benoitc
Copy link
Owner

benoitc commented Nov 22, 2021

@kuznetskikh does httpoison read the body? Body need to be read or skipped to release the connection.

@kuznetskikh
Copy link

Hello @benoitc. I'm sorry for late response, it's working fine for me now.
I just incorrectly configured hackney pool on application start. Now all is fine, max_connections is taken into play using:

:hackney_pool.child_spec(:main_api_pool, timeout: 15_000, max_connections: 1)

And only 1 connection is available.

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants