
Handling of idle timeout (aws nat gateway fixed idle timeout of 350 seconds) #3184

Open

ecerulm opened this issue Nov 8, 2023 · 3 comments · May be fixed by #3275
Comments

@ecerulm
Contributor

ecerulm commented Nov 8, 2023

Context

I'm using a connection pool to a given host where the connections go through an AWS NAT Gateway, which has a fixed idle timeout of 350 seconds.

The issue is that when a new request is performed after 350 seconds of inactivity, the connection fails with "Connection reset by peer" (the NAT gateway sends an explicit RST when you try to send on a connection past the idle timeout). urllib3 then retries, but the next connection in the pool fails with the same "reset by peer" error (it was idle too).

It would be nice to have an idle_timeout=349 parameter for PoolManager/ConnectionPool, so that urllib3 checks whether a connection has been idle for longer than that and discards it before using it.

I think today the behavior is as follows (from an earlier comment):

When a connection is pulled from the pool it's checked to see if the underlying socket is still open and if not the connection is reopened.

The problem with that is that the underlying socket is still open; you only get the RST from the AWS NAT Gateway after you try to use it.

So my proposal is: in addition to checking whether the underlying socket is still open, also check whether the idle_timeout has passed, and if it has, reopen the connection.
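To illustrate, here is a minimal sketch of the idea layered over today's pool. This is not a proposed API: idle_timeout and the _idle_since stamp are made-up names, and _get_conn/_put_conn are private urllib3 internals, so treat it as illustrative only.

```python
import time

from urllib3 import HTTPConnectionPool


class IdleAwareConnectionPool(HTTPConnectionPool):
    """Sketch: discard connections that sat idle longer than
    ``idle_timeout`` seconds instead of waiting for the gateway's RST."""

    def __init__(self, *args, idle_timeout=349.0, **kwargs):
        self.idle_timeout = idle_timeout  # hypothetical parameter
        super().__init__(*args, **kwargs)

    def _put_conn(self, conn):
        if conn is not None:
            # Stamp the connection when it is returned to the pool.
            conn._idle_since = time.monotonic()
        super()._put_conn(conn)

    def _get_conn(self, timeout=None):
        conn = super()._get_conn(timeout=timeout)
        idle_since = getattr(conn, "_idle_since", None)
        if idle_since is not None and time.monotonic() - idle_since > self.idle_timeout:
            # Too old to trust: close it so the next request opens a
            # fresh socket instead of running into the gateway's RST.
            conn.close()
        return conn
```

Closing the connection should be enough, because (per the quoted comment above) the pool already reopens a closed connection before use; the only new ingredient is the timestamp check.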

Is it something you currently cannot do?
Well, I guess I can catch the exception (after all the retries have failed) and drop the whole PoolManager/ConnectionPool.

Is this related to an existing issue/problem?

Yes. Connections via a NAT gateway have an implicit idle timeout and are not explicitly reset by the gateway; they are only reset when you try to use them. That plays badly with the current urllib3 behavior, because it retries, but all pooled connections for that host are "tainted", so it should not try to use them.

Alternatives

Can you achieve the same result doing it in an alternative way?
Is the alternative considerable?

I guess that would require some error handling: detect the failure, then drop the PoolManager/ConnectionPool and create a new one. A rough sketch is below.
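Something like the following could work as a caller-side workaround under current urllib3 (a sketch, not a recommendation; request_dropping_stale_pools is a made-up helper, and which exception actually surfaces depends on your retry configuration):

```python
import urllib3
from urllib3.exceptions import MaxRetryError, ProtocolError

http = urllib3.PoolManager()


def request_dropping_stale_pools(method, url, **kwargs):
    """Sketch: if the pooled connections were reset by the NAT gateway,
    drop the tainted pools and retry once on fresh connections."""
    global http
    try:
        return http.request(method, url, **kwargs)
    except (ProtocolError, MaxRetryError):
        http.clear()  # close every pooled connection
        http = urllib3.PoolManager()  # start over with an empty pool
        return http.request(method, url, **kwargs)
```

The downside, compared to the proposed idle_timeout, is that the first request after an idle period still pays for the failed attempt and the retries.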

Duplicate

Has the feature been requested before?
If so, please provide a link to the issue.

#2498 was dismissed on the grounds that it would require a background thread, but I think there is no need for that. The check can be done when the connection is pulled from the pool, reopening it if the idle_timeout has passed. I guess #2498 was asking for explicitly closing the connection after the idle_timeout, whereas I'm asking for "consider an idle connection dead after x seconds".

Contribution

Would you be willing to submit a PR?
(Help can be provided if you need assistance submitting a PR)

@sethmlarson
Member

sethmlarson commented Jan 6, 2024

I could see a feature like this being useful, especially as you describe it: checking a "last known good" time against the idle_timeout on the connection pool. It would cut down on the number of unnecessary retries as well.

What do other maintainers think about this feature? cc @pquentin @illia-v If we think it's useful we can add a bounty to this one.

@illia-v
Member

illia-v commented Jan 7, 2024

I agree that this can be useful.

@ecerulm
Contributor Author

ecerulm commented Jan 8, 2024

I think I can create a PR for this feature. It would be my first contribution to this project, so I'm not sure about the process; can you assign the issue to me?

@ecerulm ecerulm linked a pull request Jan 10, 2024 that will close this issue