Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to reliably detect reentrancy #65

Open
pitdicker opened this issue Oct 21, 2019 · 8 comments
Open

How to reliably detect reentrancy #65

pitdicker opened this issue Oct 21, 2019 · 8 comments

Comments

@pitdicker
Copy link
Contributor

Your RFC states "It is an error to reentrantly initialize the cell from f. Doing so results in a panic." Currently sync:OnceCell does not give that guarantee. I tried to see how a solution would look like, but haven't found something easy yet.

Option 1 was to use a ReentrantMutex, and an atomic. Suppose the atomic says an initializer is running. If we are able to lock the mutex, this must be a case of reentrant initialization. Otherwise the mutex will block, which is what we want in case another thread is busy initializing. See my not working experiment in https://github.com/pitdicker/once_cell/commit/bd4a940c94b41ee2b6f595854ea2565e0a76a73e. parking_lot does not expose a const {Raw}ReentrantMutex::new.

Option 2 is using some sort of thread id, which is stored in the OnceCell when the initializer starts to run. When another initializer sees the OnceCell is in the process of being initialized, it can compare the stored thread id to its own, and choose between blocking or panicking. Interestingly parking_lot::ReentrantMutex also needs some sort of thread id to function. I don't yet have working code though.

The disadvantage of both options is it would add quite some complexity, and 4 or 8 bytes to the size of the OnceCell. Is that worth the trouble?

Also getting a thread id is not ideal: std::process::id is not supported on all platforms (like CloudABI). And std::thread::Thread::id is an opaque type, so it can't (officially) be stored in an atomic, but needs a Cell and manual synchronisation.

@matklad
Copy link
Owner

matklad commented Oct 21, 2019

Hm, yeah, I think that's a poor wording in the RFC text. For cell case, we guarantee a panic. For sync case, we say "this is a bug, potential outcomes are a panic or a deadlock".

@matklad
Copy link
Owner

matklad commented Oct 21, 2019

BTW, I've added you to my fork of RFCs repo: you seem to be genuinely interested in the topic, so feel free to edit RFC directly/send PRs :)

@pitdicker
Copy link
Contributor Author

Thank you! I'll think about it a bit first, I am unreliable... 😄

@matklad
Copy link
Owner

matklad commented Oct 21, 2019

As to how to actually detect a deadlock, I think there were some vague plans to add deadlock detection to parking-lot proper via a flag, and that seems like the right approach here, except for the fact that feature flags obviously don't work with std.

@pitdicker
Copy link
Contributor Author

pitdicker commented Oct 28, 2019

Just to write it down for now: this a way to detect reentrancy without increasing the size of OnceCell using the std method.

Currently it uses a linked list of waiting threads. When ready the managing (initializing) thread wakes up all the other threads, which may then free their Waiter node. The only thread that can safely walk the linked list is the managing thread, because it is the only one who knows when they are all still waiting.

Suppose this is changed to a scheme where not the managing thread wakes up everyone, but every thread wakes up the next one. Now it is safe for every thread to always walk the list up to the first node.

If the first node is created by the thread that started to run the initialization function, it can store some information in that node, like the thread id. A thread that wants to wait on initialization adds itself to the linked list, walks the queue to the first node, and only parks itself if its thread id doesn't match that of the first node.

But I am not sure what effect it has to make every thread responsible for waking up the next one. I guess it will perform different from one thread waking up all others, but wouldn't know if it's worse.

@pitdicker
Copy link
Contributor Author

The break_it test that I added in #71 turns out to be a good stress test to time the performance of parking/unparking under contention.
Rough results:

x86_64 aarch64
std, one thread wakes the others 475ms 215ms
std, every thread wakes the next 400ms 200ms
parking lot mutex 1.10s 125ms

The test used 32 threads. My two test systems are AMD Ryzen 1950X (16-core) and Hisilicon Kirin 970 (4-core I think?).
The results on aarch64 are from my branch in #71 that uses consume ordering.

One funny result is that my phone runs this test faster than a good computer. But I think the explanation is that with just 4 threads, there is less contention and less threads waiting, that have to be managed. The lower the number of threads I use in the test, the faster it runs.

I'm not sure why using parking_lot is so much slower in this test on x86_64, it consistently takes 2-3x longer.

The more interesting result is that changing to a method where every thread wakes up the next one seems to be faster. So we can have our cake and eat it too (reentrancy detection and faster unparking under contention).

@nbdd0121
Copy link

I believe just keeping a std::thread::Thread.id would be sufficient for reentrance detection.

@nbdd0121
Copy link

I'll have a look next week to see if I can add reliable reentrance detection. Hopefully it'll provide a piece of code that can be shared among both OnceCell and Once.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants