
buffer: Use a double-buffering scheme to prevent data races #943

Merged: 6 commits into go-sql-driver:master, Apr 4, 2019

Conversation

@vmg (Contributor) commented Apr 1, 2019

Description

This is a potential fix for #902, #903. Some context:

The Go 1.10 release changed the API contract for driver.Rows.Next in database/sql/driver. Previously, we could modify the backing storage behind the []Value slice passed to our driver's driver.Rows.Next function when the user called the sql.Rows.Close method from the public interface. This is no longer the case: we can no longer modify this backing storage outside of the driver.Rows.Next call, or this will lead to a data race.

Although in theory sql.Rows.Scan and sql.Rows.Close are always called sequentially from the user's code, there is a corner case with context cancellation: if you've performed a query with QueryContext and your Context is cancelled or times out while reading the Rows result of that query, sql.Rows.Close can be called simultaneously with sql.Rows.Scan, which is where this data race happens.
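
To make the failure mode concrete, here is a minimal sketch of the usage pattern that can trigger the race; the DSN, table, and timeout below are hypothetical, and the PR's actual stress test is TestContextCancelQueryWhileScan:

    package main

    import (
        "context"
        "database/sql"
        "log"
        "time"

        _ "github.com/go-sql-driver/mysql"
    )

    func main() {
        db, err := sql.Open("mysql", "user:pass@tcp(127.0.0.1:3306)/test") // hypothetical DSN
        if err != nil {
            log.Fatal(err)
        }
        defer db.Close()

        // A deadline short enough to expire while rows are still being read.
        ctx, cancel := context.WithTimeout(context.Background(), 5*time.Millisecond)
        defer cancel()

        rows, err := db.QueryContext(ctx, "SELECT data FROM t") // hypothetical query
        if err != nil {
            log.Fatal(err)
        }
        defer rows.Close()

        var raw sql.RawBytes
        for rows.Next() {
            // If ctx expires around here, database/sql calls driver.Rows.Close
            // on another goroutine; if closing drains the connection into the
            // same buffer that backs raw, Scan sees corrupted data.
            if err := rows.Scan(&raw); err != nil {
                break
            }
        }
    }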

Now, why are we modifying the backing storage we return from driver.Rows.Next? This is a complex issue arising from a design constraint in MySQL's protocol. When the Go sql package cancels an in-flight query, before this query's connection can be re-used to perform any more queries, we must "drain" it, i.e. read all the remaining query results up to the EOF packet that MySQL may have already sent. This draining is a complex operation: although most of the drained data is discarded, we still need to read packet headers to figure out the size of each incoming packet, and we still have to read the EOF (and any potential ERR) packets whole, so we can keep track of our connection status.
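
To illustrate why draining still has to parse the stream, here is a sketch under stated assumptions: the conn interface, readPacket, and drainResults below are simplified stand-ins, not the driver's real API.

    package example

    // conn abstracts the one capability draining needs: reading the next
    // MySQL packet (4-byte length/sequence header, then the payload) into
    // the connection buffer.
    type conn interface {
        readPacket() ([]byte, error)
    }

    // drainResults discards leftover row packets after a cancelled query.
    // Row packets are thrown away, but EOF and ERR packets must be read
    // whole so the connection status stays consistent.
    func drainResults(c conn) error {
        for {
            data, err := c.readPacket()
            if err != nil {
                return err
            }
            switch data[0] {
            case 0xfe: // EOF: the result set is fully drained
                return nil
            case 0xff: // ERR: would be decoded into a MySQL error here
                return nil
            default:
                // Row packet: contents are discarded, but reading it has
                // already written into the shared connection buffer.
            }
        }
    }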

This means that draining a connection after a query necessarily writes data to our connection buffer. When the database/sql package is performing Scan on the fields we've returned from driver.Rows.Next, these fields are often backed by our connection buffer, so any write to it will corrupt the Scan values. This is a serious vulnerability that needs to be addressed as soon as possible.

I have spent some time evaluating all possible fixes for this issue, and I believe the fix proposed here has the best ratio between complexity and extra memory usage. Let us review the other options I've discarded:

  • Truncate & allocate a new buffer during the Close call: this was implemented by @methane in "Truncate internal buffer when rows.Close() is called" (#904) and rejected by @julienschmidt. I agree with Julien that allocating a new buffer per Close is too expensive: it can cause significant memory churn in the library.

  • Delay the discard operation on the connection: since we do not know how long after a Rows.Close call it is safe to start reading from the connection and writing to our buffer to "drain" the remaining rows, we could in theory simply mark the connection with a "needs draining" flag on Close, and then perform the draining on the next Query. I don't think this is a good idea, because it could add arbitrary latency from a previous query to the following one, and because it's a bit scary to leave MySQL hanging with unread packets; it could lead to connection timeouts.

  • Discard the remaining packets without writing to the connection buffer: I have sketched an implementation of this here. I am wary of this approach because discarding packets straight from the wire introduces a lot of extra complexity: it is not possible to perform the discarding without using some scratch storage to read EOF and ERR packets, so we need to allocate a small discard buffer for each connection (64 bytes). The code to switch between the discard and packet-read buffers is complex. Although the resulting performance is very good (better than before the fix, because we're discarding large packets directly instead of allocating and returning them), and the fix does not allocate extra memory, I believe the complexity trade-off is not worth it.

  • Use a double-buffering scheme: this is the solution proposed in this PR, sketched right after this list. It's far from a novel idea: double-buffering has been used for decades in graphics to, well, avoid data races! It consists of allocating two buffers and switching between them once "data" has been returned to the user (here, "data" is the contents of the Rows to scan; in graphics, "data" is a rendered frame). There is an obvious upfront memory cost, namely that each connection requires two connection buffers (an overhead of 4kb per connection), but once we've allocated the foreground and background buffers on a new connection, the performance is great, there are no further memory allocations, and the code to swap between the buffers is minimal.
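
For illustration, a minimal sketch of the double-buffering idea; the names (buffer, dbuf, flipcnt, flip, fill) are simplified stand-ins for the actual buffer.go changes, and fill's body is elided:

    package mysql

    import "net"

    const defaultBufSize = 4096

    // buffer keeps two fixed-size backing slices; flipcnt selects which
    // one the next fill writes into.
    type buffer struct {
        buf     []byte    // active slice; aliases dbuf[flipcnt&1]
        nc      net.Conn
        dbuf    [2][]byte // the two backing storages
        flipcnt uint
    }

    func newBuffer(nc net.Conn) buffer {
        fg := make([]byte, defaultBufSize)
        bg := make([]byte, defaultBufSize) // the upfront 2 x 4kb cost per connection
        return buffer{buf: fg, nc: nc, dbuf: [2][]byte{fg, bg}}
    }

    // flip switches the active storage. Rows data already handed to the
    // user lives in the old slice and stays intact while draining fills
    // the other one.
    func (b *buffer) flip() {
        b.flipcnt += 1
    }

    // fill reads from the connection into whichever slice is active.
    func (b *buffer) fill(need int) error {
        dest := b.dbuf[b.flipcnt&1]
        // ... move any unread tail of b.buf into dest, then read from b.nc ...
        b.buf = dest
        return nil
    }

The swap itself is just a counter bump; the only real cost is the second allocation in newBuffer, which the review below ends up removing.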

@julienschmidt: I know you're wary of increasing memory usage in the library, but I'd urge you to review this PR. The included stress test (TestContextCancelQueryWhileScan) proves there's a very real race here that can lead to data corruption, and the overhead of this fix is amortized throughout the lifetime of each connection. As explained, the discard-from-the-wire branch sketched above performs the draining while allocating significantly less memory, but I don't think the complexity trade-off is worth it.

@methane: Please do review this PR. I hope you'll agree this is a good compromise for a simple but relatively efficient fix for this serious issue.

Checklist

  • Code compiles correctly
  • Created tests which fail without the change (if possible)
  • All tests passing
  • Extended the README / documentation, if necessary
  • Added myself / the copyright holder to the AUTHORS file

@methane (Member) commented Apr 2, 2019

> Although in theory, Rows.Next and Rows.Close are always called sequentially from user's code, there is a corner case with context cancellation: if you've performed a query with QueryContext and your Context is cancelled or timeouts while reading the Rows result of that query, Rows.Close can be called simultaneously with Rows.Next, which is where this data race happens.

You may not have understood golang/go@651ddbd correctly.

driver.Rows.Next() and driver.Rows.Close() won't be called simultaneously. That is not the issue.

Rows.Scan() can return a []byte that refers to our receive buffer. The user must parse or copy it before calling Rows.Next() or Rows.Close(). Once Rows.Next() or Rows.Close() was called, we were able to reuse the receive buffer.

But since Go 1.10, database/sql calls driver.Rows.Close() for context cancellation. It can be called before the user calls Rows.Next() or Rows.Close(). So we must not reuse the buffer when driver.Rows.Close() is called.
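
A sketch of the safe user-side pattern this contract implies (copyAll is a hypothetical helper, not part of the driver):

    package example

    import "database/sql"

    // copyAll scans each row into sql.RawBytes and copies it immediately:
    // raw aliases the driver's receive buffer and is only valid until the
    // next rows.Next() or rows.Close() call.
    func copyAll(rows *sql.Rows) ([][]byte, error) {
        defer rows.Close()
        var out [][]byte
        var raw sql.RawBytes
        for rows.Next() {
            if err := rows.Scan(&raw); err != nil {
                return nil, err
            }
            out = append(out, append([]byte(nil), raw...))
        }
        return out, rows.Err()
    }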

@vmg (Contributor, Author) commented Apr 2, 2019

@methane: Apologies for the confusion, I wrote Rows.Next when I should have written Rows.Scan in my issue description. I have fixed the description and differentiated between driver.Rows and sql.Rows, which should make it clearer.

Let me know if this makes more sense to you.

@vmg (Contributor, Author) commented Apr 2, 2019

@methane: I've updated the test as you requested. Could we now please discuss the proposed fix in this PR?

@vmg (Contributor, Author) commented Apr 3, 2019

@methane: I've reduced the size of the buffers in the test and updated the comments. 😄

@vmg (Contributor, Author) commented Apr 3, 2019

@methane: I added maxCachedBufSize and ran go fmt on your changes, which were failing the build.

@renthraysk commented:

Can this be more selective about when flip() is called? mysqlRows.Close() knows when it has to discard (and thereby corrupt) the buffer. Then it seems we would not have to allocate the bg buffer in newBuffer(), as fill() will allocate one if needed?

@vmg (Contributor, Author) commented Apr 3, 2019

@renthraysk: We are already quite selective about when flip() is called!

Because of the way the database/sql package works, we will not call (*buffer).flip() on the happy path of a Query operation, e.g.:

		rows, err := db.Query("...")
		if err != nil {
			panic(err)
		}
		for rows.Next() {
			err := rows.Scan(...)
			// ...
		}
		rows.Close()

In this case, when we call rows.Next() until it returns false, that final rows.Next() call marks the rows as "done", so calling rows.Close() afterwards is a no-op. We will not call (*buffer).flip() here because rows.mc will be nil by then.

Now, besides that, it is true that we don't have to allocate the background buffer when we create the buffer object, and in that case it's very likely we'll never allocate one at all, as long as db.Query calls never fail. Let me try that and see how/if it affects the benchmarks.

@vmg (Contributor, Author) commented Apr 3, 2019

@methane: As @renthraysk points out, we don't need to allocate the background buffer upfront. I've removed the allocation, so in the usual case of all Query calls resolving cleanly we will never have to allocate a bg buffer for a connection, and there is no memory overhead compared to master.
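
Continuing the earlier sketch under the same stand-in names, the lazy variant might look like this, replacing newBuffer and fill from before: newBuffer allocates only the foreground slice, and fill allocates the background slice the first time a flip actually lands on it:

    func newBuffer(nc net.Conn) buffer {
        fg := make([]byte, defaultBufSize)
        return buffer{buf: fg, nc: nc, dbuf: [2][]byte{fg, nil}}
    }

    func (b *buffer) fill(need int) error {
        dest := b.dbuf[b.flipcnt&1]
        if dest == nil {
            // Paid only after the first flip, i.e. the first query that
            // is cancelled mid-scan on this connection.
            dest = make([]byte, defaultBufSize)
            b.dbuf[b.flipcnt&1] = dest
        }
        // ... move any unread tail of b.buf into dest, then read from b.nc ...
        b.buf = dest
        return nil
    }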

@renthraysk commented:
@vmg Ah, I was just looking at the diff of Close() with only the flip() call showing, without the early exits. Doh!

@methane (Member) commented Apr 4, 2019

@vmg Please don't rebase so often; it's difficult to see what has changed since I last looked.
We use a "Squash and merge" flow, so there is no need to keep the commit log tidy.

@vmg (Contributor, Author) commented Apr 4, 2019

@methane Understood. Sorry about that. I'll push new commits instead of rebasing.

@vmg (Contributor, Author) commented Apr 4, 2019

@methane: I've added your benchmark to this PR and increased the maximum buffer size. With a max buffer of 256k, and the optimization where we don't allocate the background buffer unless needed, no extra memory is allocated during the benchmark:

benchmark                                 old allocs     new allocs     delta
BenchmarkQueryRawBytes/size=100-16        223            223            +0.00%
BenchmarkQueryRawBytes/size=1000-16       223            223            +0.00%
BenchmarkQueryRawBytes/size=2000-16       223            223            +0.00%
BenchmarkQueryRawBytes/size=4000-16       223            223            +0.00%
BenchmarkQueryRawBytes/size=8000-16       223            223            +0.00%
BenchmarkQueryRawBytes/size=12000-16      223            223            +0.00%
BenchmarkQueryRawBytes/size=16000-16      223            223            +0.00%
BenchmarkQueryRawBytes/size=32000-16      223            223            +0.00%
BenchmarkQueryRawBytes/size=64000-16      223            223            +0.00%
BenchmarkQueryRawBytes/size=256000-16     223            223            +0.00%

benchmark                                 old bytes     new bytes     delta
BenchmarkQueryRawBytes/size=100-16        7083          7082          -0.01%
BenchmarkQueryRawBytes/size=1000-16       7082          7080          -0.03%
BenchmarkQueryRawBytes/size=2000-16       7086          7080          -0.08%
BenchmarkQueryRawBytes/size=4000-16       7080          7088          +0.11%
BenchmarkQueryRawBytes/size=8000-16       7104          7103          -0.01%
BenchmarkQueryRawBytes/size=12000-16      7111          7111          +0.00%
BenchmarkQueryRawBytes/size=16000-16      7120          7119          -0.01%
BenchmarkQueryRawBytes/size=32000-16      7208          7152          -0.78%
BenchmarkQueryRawBytes/size=64000-16      7315          7315          +0.00%
BenchmarkQueryRawBytes/size=256000-16     9785          9787          +0.02%

I'm not sure what the ideal value for maxBufferSize would be. Maybe 256k is too large for the average application; I think 32k or 64k would be better.
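
For context, a benchmark in the spirit of the numbers above might be structured as below; the helper name and query are hypothetical, while the actual BenchmarkQueryRawBytes lives in the driver's test suite:

    package example

    import (
        "database/sql"
        "fmt"
        "testing"
    )

    // benchmarkQueryRawBytes queries a payload of the given size and scans
    // it into sql.RawBytes, so allocations reflect only the driver's
    // buffering, not per-row copies.
    func benchmarkQueryRawBytes(b *testing.B, db *sql.DB, size int) {
        query := fmt.Sprintf("SELECT REPEAT('x', %d)", size)
        var raw sql.RawBytes
        b.ReportAllocs()
        b.ResetTimer()
        for i := 0; i < b.N; i++ {
            rows, err := db.Query(query)
            if err != nil {
                b.Fatal(err)
            }
            for rows.Next() {
                if err := rows.Scan(&raw); err != nil {
                    b.Fatal(err)
                }
            }
            rows.Close()
        }
    }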

@vmg (Contributor, Author) commented Apr 4, 2019

Awesome. I think this is a very good, simple fix and we've proven it doesn't regress performance.

Who else needs to approve this PR? 😄

@methane merged commit df597a2 into go-sql-driver:master on Apr 4, 2019
@methane (Member) commented Apr 4, 2019

> Who else needs to approve this PR? 😄

In our workflow, only one review from a core member is required. When a core member creates a PR, they are not allowed to merge it on the basis of their own approval alone.

@vmg (Contributor, Author) commented Apr 4, 2019

🙇🙇🙇

@julienschmidt (Member) commented:

@vmg Thanks a lot for this pull request!
