Rails 6 RC2 regression: (postgres) prepared statement improperly parameterized #36763
Comments
@eileencodes any idea of what could be causing this? It seems the query cache is now using the locked thread, so maybe that is the cause?
Hrm, I'm not sure. Maybe we can't have the query cache in system tests 😢 cc/ @engwan as the original author of the commit that changed behavior. Do you have any ideas what could be going on here?
Hmm, I'm not sure how the query cache could cause this problem with bind params. @97jaz Is it possible for you to give more information regarding the ActiveRecord query you're making here? Also, to isolate the problem, can you try disabling the query cache for that controller and see if the problem is still there? I see that the query cache uses a lock to synchronize access to the cache, so I think it should be fine even if multiple threads use the same cache. What web server for Capybara are you using here? Puma? Can you try setting the number of Puma threads to 1 to see if the problem still persists?
Those are questions for @97jaz.
@engwan The query is dead simple:
  member = Member.find_by!(id: params[:member_id])
It should have two parameters, as a prepared statement, since it will translate into something like:
  SELECT members.* FROM members WHERE id = ? LIMIT ?
On the other hand:
Yes, Puma. |
Oops -- my mistake. Capybara was configured to use webrick. When I changed it to use Puma and specified max 1 thread, I could not reproduce the problem. If I bumped that up to max 5 threads, I could. (I was previously messing with the settings in |
Huh, okay, I monkey-patched some code in When I get the error:
That query I mentioned above (
The SQL doesn't contain any parameters, but parameter values are being passed anyway. And when I get the reverse error:
the query is being called via
I had previously thought that these error messages indicated a race between two different queries, but they don't. It's just one query, and AR is confused about whether it has parameters or not.
I'm now nearly certain of what's causing the problem. The system test in question goes through a code path that makes explicit use of Happily, though, the reason we're explicitly using However, I think this is still a bug. |
Is the same connection being shared between threads? We usually don't allow this, but I guess it changed for system tests.
Even before the query cache fix, system tests were already sharing the same connection. That's the main mechanism allowing us to use transactions with system tests. What was happening then was that all system test threads were sharing a connection, but the query cache was enabled on a different connection. So the fix changes this so that the query cache is enabled for the correct connection. I'm still not sure why enabling the query cache would affect the logic around prepared statements. I think the statement cache is separate from the query cache? I'd have to take a look at the code again and check.
I believe that's the intended effect of the whole And I think I celebrated too early (regarding my ability to work around this problem). It turns out that the fix for #33702 didn't really work for cached queries, leading to #35286, which solves the problem by... using Then again, I can just configure tests to use a single server thread. So still, not difficult to work around. |
@engwan Yes, I still don't understand exactly how your commit affects this. When I wrote "However, I think this is still a bug," I meant the interaction between sharing a connection between threads and using |
Yes, that's definitely a bug. I'm curious to know why it only shows up with query cache enabled.
I now have a minimal app that reproduces the problem: |
After looking at the code involved, I think that the only way commit 853f568 is involved is that it gives the race condition more opportunity to manifest, but as far as I can see, in principle, the race exists with or without it. (Though, still, I cannot reproduce the issue before that commit, so maybe I'm wrong.)
Three of these methods:
... call |
Considering that In this case doing something like:
  def unprepared_statement
    old_prepared_statements, @prepared_statements = @prepared_statements, false
    uncached { yield }
  ensure
    @prepared_statements = old_prepared_statements
  end
I'll give that a try in |
It is my understanding (might be wrong), that cache is created with Removing Here is the relevant line in which cache is created with
  cache_sql(sql, name, binds) { super(sql, name, binds, preparable: preparable) }
@khasinski Your patch to It looks like there are plenty of opportunities for the value of |
In system tests, a single database connection is shared among all the server threads. A call to unprepared_statement temporarily modifies an instance variable on the connection object, which is then visible to other concurrently running threads. This leads to a situation where prepared statements may end up with the wrong binds. Addresses rails#36763
In system tests, a single database connection is shared among all the server threads. Since connections have per-instance mutable state, this leads to race conditions, as in rails#36763. This patch changes the connection pool to wrap shared connections so that all access to them is synchronized on the connection's monitor lock.
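The synchronization idea in the commit message above can be illustrated with a small sketch. This is not the patch from the pull request: `SynchronizedProxy` and `FakeConnection` are hypothetical names invented for illustration, and the lock is passed in explicitly rather than taken from a real connection object. The sketch only shows the general technique of routing every call on a shared object through one monitor.

```ruby
require "monitor"

# Hypothetical wrapper (not Rails code): forwards every call to the wrapped
# object while holding a monitor, so calls from different threads cannot
# interleave. Keyword arguments are omitted for brevity.
class SynchronizedProxy
  def initialize(target, lock)
    @target = target
    @lock = lock
  end

  def method_missing(name, *args, &block)
    return super unless @target.respond_to?(name)
    @lock.synchronize { @target.public_send(name, *args, &block) }
  end

  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name, include_private) || super
  end
end

# Hypothetical stand-in for a connection with per-instance mutable state.
class FakeConnection
  attr_reader :queries_run

  def initialize
    @queries_run = 0
  end

  # Deliberately non-atomic read-modify-write.
  def execute(_sql)
    current = @queries_run
    Thread.pass # invite a context switch in the middle of the update
    @queries_run = current + 1
  end
end

shared = SynchronizedProxy.new(FakeConnection.new, Monitor.new)

threads = 5.times.map do
  Thread.new { 100.times { shared.execute("SELECT 1") } }
end
threads.each(&:join)

total = shared.queries_run
puts "queries run: #{total}"
```

Because `Monitor` is reentrant, a thread that already holds the lock can make further calls through the proxy without deadlocking. Note that this sketch only makes individual calls atomic; state that must stay consistent across a whole block would need the lock held for the duration of that block.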
Anyone have an opinion on the solution offered in #36871?
This fixes a race condition in system tests where prepared statements can be incorrectly parameterized when multiple threads observe the mutation of the @prepared_statements instance variable on the connection. Fixes rails#36763
Per rails#36949 we introduce a race condition fix for rails#36763. This refines the fix to avoid using Concurrent::ThreadLocalVar. The implementation in the concurrent lib is rather expensive, culminating in a finalizer per object that spins off a thread to do cleanup work. None of this expense is needed, as we can simply implement the desired behavior using Ruby primitives. Additionally, this moves to a Fiber-bound implementation vs. a thread-bound implementation, something that is not desired for this particular usage.
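As a rough illustration of what "using Ruby primitives" could look like, here is a minimal sketch. It is an assumption-laden stand-in, not the code from either pull request: `FakeConnection`, the `:fake_prepared_statements_disabled` key, and the `disabled_connections` helper are all hypothetical. It relies on the fact that `Thread.current[]` storage in Ruby is local to the current fiber, so the "disabled" marker set by one thread is not visible to another thread sharing the same connection object.

```ruby
require "set"

# Hypothetical stand-in for a connection; not ActiveRecord code.
class FakeConnection
  def initialize
    @prepared_statements = true
  end

  # Prepared statements are considered on unless the *current* fiber has
  # disabled them for this particular connection.
  def prepared_statements
    @prepared_statements && !disabled_connections.include?(object_id)
  end

  def unprepared_statement
    # Set#add? returns nil if the id was already present (nested call),
    # in which case the outer call is responsible for cleanup.
    newly_added = disabled_connections.add?(object_id)
    yield
  ensure
    disabled_connections.delete(object_id) if newly_added
  end

  private

  # Thread.current[] is fiber-local storage in Ruby.
  def disabled_connections
    Thread.current[:fake_prepared_statements_disabled] ||= Set.new
  end
end

connection   = FakeConnection.new
inside_block = Queue.new
observed     = Queue.new

thread_a = Thread.new do
  connection.unprepared_statement do
    inside_block << true # thread A is now inside the block
    observed.pop         # wait until thread B has read the flag
    connection.prepared_statements
  end
end

thread_b = Thread.new do
  inside_block.pop
  seen = connection.prepared_statements
  observed << true
  seen
end

flag_seen_by_a = thread_a.value
flag_seen_by_b = thread_b.value
flag_afterwards = connection.prepared_statements

puts "thread A, inside its own block: #{flag_seen_by_a}"
puts "thread B, while A is in the block: #{flag_seen_by_b}"
```

In this sketch the shared instance variable is never mutated inside `unprepared_statement`, so a concurrent thread keeps seeing prepared statements as enabled while thread A sees them as disabled.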
Steps to reproduce
Update: I was initially unable to create a minimal example that demonstrates the problem, but I have now put together a simple app that does: https://github.com/careport/racy
At root, the problem is that, in system tests, a single database connection is shared among all server threads. This interacts badly with uses of unprepared_statement, since the latter modifies the @prepared_statements instance variable on the connection. That change is visible to other threads sharing the connection. As a consequence, some queries wind up incorrectly parameterized.
I do not yet understand why this problem doesn't manifest before commit 853f568. I'll continue to look into it, but it might be more obvious to someone with more knowledge of AR internals.
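The visibility problem described here can be reproduced outside of Rails with a small, deterministic sketch. `FakeConnection` below is a hypothetical stand-in, not ActiveRecord code; it only mimics the pattern of temporarily flipping an instance variable on a shared object. Two queues force the interleaving so that the second thread reads the flag while the first is still inside its block.

```ruby
# Hypothetical stand-in for a shared database connection; not ActiveRecord code.
class FakeConnection
  attr_reader :prepared_statements

  def initialize
    @prepared_statements = true
  end

  # Flip a shared instance variable for the duration of the block,
  # then restore the previous value.
  def unprepared_statement
    old, @prepared_statements = @prepared_statements, false
    yield
  ensure
    @prepared_statements = old
  end
end

connection   = FakeConnection.new
inside_block = Queue.new
observed     = Queue.new

thread_a = Thread.new do
  connection.unprepared_statement do
    inside_block << true # the flag is now false on the shared object
    observed.pop         # hold the block open until thread B has looked
  end
end

thread_b = Thread.new do
  inside_block.pop
  seen = connection.prepared_statements # B never asked for unprepared mode
  observed << true
  seen
end

flag_seen_by_b = thread_b.value
thread_a.join
flag_afterwards = connection.prepared_statements

puts "flag seen by thread B during A's block: #{flag_seen_by_b}"
puts "flag after A's block finished: #{flag_afterwards}"
```

Thread B observes the flag as disabled even though it never called `unprepared_statement` itself. In a real server the interleaving is not forced, which would be consistent with the error appearing often but not always.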
Old description below:
When running a particular system test in our test suite, we often (though not always) get the error:
or, less frequently, the reverse:
These errors occur at the same point in the program execution. It's an ActiveRecord query that does not use any Arel methods. (I mention this because previous bug reports with similar PG driver errors were caused by bad interaction between ActiveRecord and Arel. That is not what is going on here.)
It appears to be a race between two requests. The problem starts with commit 853f568, which merged #36618.
I should mention that we're using rspec-rails 4.0.0.beta2, so it could be bad interaction between rspec and rails. But the error really does not happen prior to that commit.
Expected behavior
The database error described above should not occur.
Actual behavior
The database error does occur.
System configuration
Rails version: 6.0.0.rc2, but really anything on 6-0-stable starting with 853f568
Ruby version:
ruby 2.6.3p62 (2019-04-16 revision 67580) [x86_64-darwin17]
(Though this also happens on our CI server, which runs linux, not OS X.)