Fix for case where db_pool connections can be lost. #527
Codecov Report
Additional details and impacted files

@@           Coverage Diff           @@
##           master    #527   +/-   ##
=====================================
  Coverage      53%     53%
=====================================
  Files          88      88
  Lines        9881    9880     -1
  Branches     1852    1853     +1
=====================================
  Hits         5335    5335
+ Misses       4156    4155     -1
  Partials      390     390

Flags with carried forward coverage won't be shown.
@erickj00001
It works fine for me in Python 3. In any case, it re-raises the exception immediately after decrementing a counter; what matters is that the counter gets decremented there. As I said, the base class version of get() uses "except:". Why do you suppose it's correct there, but not in the derived class version?
I said nothing about
Please help me understand here. Are you saying that there isn't a problem with the code as is? I believe my test script adequately demonstrates that there is a problem, but if you disagree then please let me know your reasoning. Or are you just saying that you don't like the syntax of the proposed solution? We can change it to use finally: instead if you prefer, something like this:

    try:
        conn = self.create()
    finally:
        if conn is None:
            # unconditionally increase the free pool because
            # even if there are waiters, doing a full put
            # would incur a greenlib switch and thus lose the
            # exception stack
            self.current_size -= 1

You mention that you want the test script checked in as part of the patch. Do I understand that to mean you are not able to try it as presented, and you will only run it if I integrate it into the unit test framework? Again, this is not to complain -- I just wish to understand the requirements.
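For readers following along, here is a minimal, self-contained sketch of how the finally: variant keeps the size counter correct. It is a toy model, not eventlet's code: `ToyPoolFinally`, `FakeTimeout`, `failing_create`, and `working_create` are hypothetical names, and the sketch assumes conn is already None when the try block is entered.

```python
class FakeTimeout(BaseException):
    """Hypothetical stand-in for an exception that does not inherit from Exception."""


def failing_create():
    # Simulates create() being interrupted by a timeout.
    raise FakeTimeout("connect timed out")


def working_create():
    # Simulates create() returning a usable connection object.
    return object()


class ToyPoolFinally:
    """Toy model of the size accounting only (assumption, not eventlet's class)."""

    def __init__(self, create):
        self.current_size = 1  # slot reserved for the replacement connection
        self._create = create

    def get_replacement(self):
        conn = None
        try:
            conn = self._create()
        finally:
            # Runs for every exception type, so the reserved slot is
            # released even when create() raises a BaseException subclass.
            if conn is None:
                self.current_size -= 1
        return conn
```

One consequence of this shape is that the slot is also released if create() ever returns None without raising, which may or may not be desirable in the real pool.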
Clearly there are problems with the current code, arguably the biggest one is
Sorry for the ambiguity. My points are:
The problem occurs if:
(1) A greenthread was waiting for a connection from the pool
(2) Another thread calls put() to return a connection to the pool, but it has expired, so the first thread will be cleared to create a new connection
(3) The create() method raises an exception that doesn't inherit from Exception, such as eventlet.Timeout
Note that eventlet.pools.Pool.get uses "except:" instead of "except Exception:" to make sure all exceptions from create() are accounted for, but eventlet.db_pool's BaseConnectionPool.get() incorrectly uses "except Exception:" in the case where a returned connection was found to be unusable.
c3498dc to b069731
Ok, I've added my script as a unit test (as part of the existing db_pool_test.py). I reduced the sleep delays to the minimum, so as not to slow down the testing unnecessarily. I've also updated the proposed fix so it uses finally: instead of except:.
I've encountered a case where eventlet.db_pool can lose a connection even though it was returned using put(), and after that, all greenthreads will wait indefinitely for a connection (assuming max_size=1, or if it happens enough times).
The problem occurs when:
(1) A greenthread was waiting for a connection from the pool
(2) Another thread calls put() to return a connection to the pool, but it has expired, so the first thread will be allowed to create a new connection
(3) The create() method raises an exception that doesn't inherit from Exception, such as eventlet.Timeout
Note that Pool.get uses "except:" instead of "except Exception:" to make sure all exceptions from create() are accounted for, but BaseConnectionPool.get incorrectly uses "except Exception:" in the case where a returned connection was found to be unusable.
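The difference between the two except forms can be illustrated with a small stand-alone sketch. This is a simplified model of the counter bookkeeping only, not eventlet's implementation: `ToyPool`, `FakeTimeout`, and `size_after_failure` are hypothetical names, and `FakeTimeout` is modeled as a BaseException subclass on the assumption that it behaves like eventlet.Timeout in not inheriting from Exception.

```python
class FakeTimeout(BaseException):
    """Hypothetical stand-in for eventlet.Timeout (does not inherit from Exception)."""


class ToyPool:
    """Toy model of the size accounting only (assumption, not eventlet's class)."""

    def __init__(self, catch_all):
        self.current_size = 1  # slot already counted for the waiting greenthread
        self.catch_all = catch_all

    def create(self):
        # Simulates create() being interrupted by a timeout.
        raise FakeTimeout("connect timed out")

    def get_replacement(self):
        if self.catch_all:
            try:
                return self.create()
            except:  # noqa: E722 - bare except also sees BaseException subclasses
                self.current_size -= 1
                raise
        try:
            return self.create()
        except Exception:
            # Never reached for FakeTimeout, so the slot is never released.
            self.current_size -= 1
            raise


def size_after_failure(catch_all):
    """Return the pool's size counter after one failed create()."""
    pool = ToyPool(catch_all)
    try:
        pool.get_replacement()
    except FakeTimeout:
        pass
    return pool.current_size


print(size_after_failure(catch_all=False))  # 1: the slot stays counted (leak)
print(size_after_failure(catch_all=True))   # 0: the slot is released
```

With max_size=1, a counter stuck at 1 with no live connection is what would leave every later caller waiting indefinitely.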
Here's a standalone test that demonstrates the problem, with comments that explain in more detail:
https://github.com/erickj00001/test-scripts/blob/c1f041b4428b01cf62c6fb68718d5d30f95224ed/db_pool_test.py#L1-L108
Output:
After changing the "except Exception:" to "except:" in BaseConnectionPool.get, the test passes: