Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Ordering on WordLock (tsan detected) #292

Merged
merged 2 commits into from Aug 9, 2021

Conversation

thomcc
Copy link
Contributor

@thomcc thomcc commented Aug 7, 2021

I was writing some hand-rolled locks for a tutorial, and did some lightweight fuzzing of the locks under thread-sanitizer to check correctness. I used parking_lot::RawMutex as my "control group" to sanity-check my test suite, but it ended up detecting data races in WordLock.

Taking a look, the Orderings seem a bit too loose — lock_slow used Release in the cmpxchg_weak but needs at least Acquire semantics to acquire the lock (although just using Acquire did not appease TSan), and unlock_slow had a similar problem but reversed (although that code more obviously needs at least Acquire, and so makes sense that it needs AcqRel).

That said, I'm definitely not sure these are actually minimal orderings, but I wasn't able to reduce any individual ordering further without upsetting TSan. That said, there may also be a cleverer approach to this.


This is the code that detected the issue. I haven't minimized it, although its still pretty small. It uses cobb to perform the thread fuzzing, but cobb is all safe code (aside from a single function that isn't called in this test, and just does volatile reads/writes that are clearly sound) https://gist.github.com/thomcc/6ca2bef4b7526caeda06f4cceaabfe70

@@ -154,7 +154,7 @@ impl WordLock {
if let Err(x) = self.state.compare_exchange_weak(
state,
state.with_queue_head(thread_data),
Ordering::Release,
Ordering::AcqRel,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! Thinking about it for a bit, an acquire is indeed needed here so that the call to park happens after the thread is inserted into the queue.

@@ -189,7 +189,7 @@ impl WordLock {
match self.state.compare_exchange_weak(
state,
state | QUEUE_LOCKED_BIT,
Ordering::Acquire,
Ordering::AcqRel,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (and the changes below) doesn't make sense: there's nothing to release. The lock itself was already released in unlock with release ordering.

However I am noticing that the state load at the start of this function should be Acquire since it needs to happen before reading the state entries. Perhaps fixing this makes tsan happy?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's nothing to release. The lock itself was already released in unlock with release ordering.

Ah, my bad, yeah I see, I didn't look that closely, and thought this was still trying to unlock not to wake up threads 😅

However I am noticing that the state load at the start of this function should be Acquire since it needs to happen before reading the state entries.

So, this doesn't make sense to me, since it enters a loop that can only be exited by either returning (in which case the data in behind the ptr inside state is not read) or by passing the cmpxchg, which gives the Acquire.

... And in fact, yeah, I was mistaken that this needed AcqRel — this one was actually fine as Acquire. (Sorry...)

Actually, I think unlock_slow might be entirely fine, and lock_slow is the only problem. It looks like the other two changes in that are probably caused by TSan not understanding how fences work in all cases (I had heard this was fixed, but it seems it is didn't get merged into LLVM yet, and has stalled. Although, in June there was some progress, , so maybe someday soon).

Regardless, if I apply the normal workaround for acquire fences under TSan (where you replace a fence with a discarded acquire load), then L252/L233 can both use Release.

(Just to make extra-sure, I verified that the change on L157 is needed with or without the tsan fence fix).

I'll have an updated patch shortly.

@thomcc
Copy link
Contributor Author

thomcc commented Aug 8, 2021

I've added a build.rs that detects tsan. An alternative way would be using a cargo feature, but this would be annoying to turn on (for example, if you only have a transitive dep that uses parking_lot, and only use nightly for builds under sanitizers or something).

If you don't like this way and prefer it be another feature, I can do that too.

@Amanieu Amanieu merged commit 6bbf522 into Amanieu:master Aug 9, 2021
@Amanieu
Copy link
Owner

Amanieu commented Aug 9, 2021

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants