Cancel before rollback may cause wrong transaction to be canceled with pgbouncer #430
Comments
This is probably a pgbouncer bug. Ruby-pg sends the cancel request on a second connection, but it waits until the request has been processed by the PostgreSQL server. Only then is the ROLLBACK command executed. From reading the code, I would guess that pgbouncer doesn't process the cancel request before it indicates its completion. Therefore the next commands are already processed before the cancel request has been dispatched. I'll try to reproduce this issue with a docker image for pgbouncer.
Thank you for the quick response. There is a response on the pgbouncer issue as well with a possible explanation: pgbouncer/pgbouncer#684 (comment)
This avoids sending a cancel request if there is no active query running. In case of a failing SQL statement, the transaction_status is PQTRANS_INERROR. The previous code sent a cancel request in this case although the query is known to be aborted. In case of Ruby code that raised an error, the transaction_status is PQTRANS_INTRANS. In this case, too, there is no use in sending a cancel request. The cancellation of queries in case of exceptions was introduced by ged#391. Now we cancel more conservatively, only when a query is actually running. The cancellation can cause issues with pgbouncer, which releases a connection after a SQL error was raised. It then dispatches the cancel to the next SQL command. This change should solve this incompatibility. Fixes ged#430
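The rule described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from #431: `FakeConnection` and `rollback_after_error` are hypothetical names, and the constants are local copies of libpq's transaction status values so that the sketch runs without the pg gem or a server. With the gem loaded, the real `PG::Connection#transaction_status`, `PG::Connection#cancel` and `PG::PQTRANS_ACTIVE` would take their place.

```ruby
# Local copies of libpq's PGTransactionStatusType values (assumption: used here
# only so the sketch runs standalone; the pg gem exposes them as PG::PQTRANS_*).
PQTRANS_IDLE    = 0
PQTRANS_ACTIVE  = 1
PQTRANS_INTRANS = 2
PQTRANS_INERROR = 3

# Hypothetical stand-in for PG::Connection that only records what was called.
class FakeConnection
  attr_reader :cancel_count, :statements

  def initialize(status)
    @status = status
    @cancel_count = 0
    @statements = []
  end

  def transaction_status
    @status
  end

  def cancel
    @cancel_count += 1
  end

  def exec(sql)
    @statements << sql
  end
end

# Hypothetical helper: cancel only when a query is still running, then roll back.
def rollback_after_error(conn)
  conn.cancel if conn.transaction_status == PQTRANS_ACTIVE
  conn.exec("ROLLBACK")
end

# Only the connection with a running query receives a cancel request.
[PQTRANS_ACTIVE, PQTRANS_INTRANS, PQTRANS_INERROR].each do |status|
  conn = FakeConnection.new(status)
  rollback_after_error(conn)
  puts "status=#{status} cancels=#{conn.cancel_count} statements=#{conn.statements.inspect}"
end
```

In the two cases named above (PQTRANS_INERROR after a failed statement, PQTRANS_INTRANS after a Ruby exception) no cancel is sent, so there is no cancel request for pgbouncer to forward to a later command.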
I think it makes sense to cancel a query only in case it is active. This is implemented in #431.
Wow, that was fast. I’ll test with my repro shortly.
One question, is there still a very small chance that the transaction aborts immediately after the check? Maybe that’s a small enough window to not be a problem, but any action based on a query is theoretically subject to race conditions.
I can confirm that I can no longer reproduce the issue I was seeing with #431.
In case of a single exception within the query, as described in this issue, I don't think so. Pgbouncer probably releases the connection and forwards the error to the client application. But now that #431 is merged, the client application doesn't send a cancel after it has received the message. So there is no issue. However, if there are two simultaneous errors, one in the client and one in the query, then such a race can happen, I think. If the client cancels the query after the connection has already been released by pgbouncer, but before the error message has been sent to the client, the cancel can possibly be passed on to another query.
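The misrouted cancel discussed in this thread can be illustrated with a toy model. This is only a sketch of the behaviour as described here, not pgbouncer's actual logic: `ToyPool` and its methods are hypothetical, and the model assumes a single pooled server connection where a cancel hits whatever query is currently running on it.

```ruby
# Toy model only: one pooled server connection shared by several clients.
class ToyPool
  attr_reader :running, :cancelled

  def initialize
    @running = nil   # client whose query currently occupies the server connection
    @cancelled = []  # clients whose queries ended up being cancelled
  end

  def start_query(client)
    @running = client
  end

  # Per the description in this thread, the connection is released after an SQL error.
  def query_failed
    @running = nil
  end

  # The cancel is applied to the server connection, whoever is using it now.
  def cancel
    @cancelled << @running if @running
  end
end

pool = ToyPool.new
pool.start_query(:client_a)
pool.query_failed            # client A's statement fails, the connection is released
pool.start_query(:client_b)  # client B's query gets the same server connection
pool.cancel                  # client A's late cancel arrives
p pool.cancelled             # the cancelled query belongs to client B, not client A
```

If the cancel arrives while no query is running, the toy model drops it, which corresponds to the harmless case.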
Thank you again @larskanis. The only thing I can think of for the extant race condition would be to allow disabling the cancel. It’s such a narrow case that I’m not worried about it personally, though.
Re: d2451b1 and #390
If a cancel happens due to an error, it is possible that the cancel will take effect on the next transaction that is run, rather than on the current one.
See: pgbouncer/pgbouncer#684
It's unclear to me if this is a pgbouncer bug or not. I cannot reproduce it when connecting directly to pg, so I would imagine so, but since this is a recent change to pg I wanted to report it here for visibility.
I don't have a shareable repro narrowed yet, but this is how I can reproduce it locally:
One of the MessageStore::Postgres::Get's will occasionally fail with: