Fix code that could lead to a possible deadlock. #1380

DusterTheFirst · 2022-03-19T17:52:46Z

This pull request addresses some accidental double read locks on RwLocks in egui. If a thread takes two read locks on the same RwLock and another thread attempts to acquire a write lock between the two, the second read lock will deadlock. This happens because the RwLock refuses any new read locks while a write lock is pending, a policy that in most cases leads to better fairness.
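To make the interleaving concrete, here is a toy model of the admission rules described above. It is only a sketch: `FairLockModel` and its methods are hypothetical names that track a reader count and whether a writer is queued, and they are not how parking_lot or egui implement their locks.

```rust
/// Toy model (an assumption for illustration) of a read-write lock that
/// refuses new readers while a writer is queued.
#[derive(Default)]
struct FairLockModel {
    readers: usize,
    writer_waiting: bool,
}

impl FairLockModel {
    /// A new reader is admitted only if no writer is queued.
    fn try_read(&mut self) -> bool {
        if self.writer_waiting {
            return false;
        }
        self.readers += 1;
        true
    }

    /// A writer is admitted only when there are no readers; otherwise it queues.
    fn try_write(&mut self) -> bool {
        self.writer_waiting = self.readers > 0;
        !self.writer_waiting
    }

    /// A reader releases its lock.
    fn release_read(&mut self) {
        self.readers -= 1;
    }
}

/// Thread A reads, thread B asks to write, thread A reads again.
/// Returns whether each of the three requests was granted.
fn double_read_interleaving() -> (bool, bool, bool) {
    let mut lock = FairLockModel::default();
    let first_read = lock.try_read(); // A: granted
    let write = lock.try_write(); // B: must wait for A's first read
    let second_read = lock.try_read(); // A: refused because B is queued
    (first_read, write, second_read)
}

/// Thread A reads once and releases before B retries.
fn single_read_interleaving() -> bool {
    let mut lock = FairLockModel::default();
    lock.try_read(); // A: granted
    lock.try_write(); // B: queued
    lock.release_read(); // A: done with its only guard
    lock.try_write() // B: granted now
}
```

In the double-read case, A cannot release its first guard until its second read is granted, and B cannot proceed until A releases, so neither makes progress. With a single guard the writer simply waits its turn.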

Each commit has a detailed description of the deadlock and how the changes prevent it.

Rather than taking one lock and reusing it, it would be possible to use something like parking_lot's read_recursive, which would keep the second read lock from blocking when a write lock is queued. In my opinion, though, that would be less clear, and maintainers would need to understand the difference between read() and read_recursive(). Using the already commonly understood scoping/drop system of Rust seems like the better choice.

Note: Each of these deadlocks was found by accident. I did not do an extensive comb through the egui code for these situations, so others are bound to exist in the codebase that I have not encountered. These deadlocks were encountered in my code, where a second thread was calling ctx.request_repaint very frequently.

Note 2: There is no way that I know of to prove that the deadlocks are gone, but the deadlocks have gone away in my application after these changes.

Drop implementations of temporaries are not called until the end of the enclosing statement. The statement changed in this commit therefore took 4 read locks on a RwLock, which can lead to problems if a write lock is requested between any of these read locks. The code looks like it would only hold one lock at a time, but it does not drop any of the locks until after the arithmetic operations complete, which leads to this situation. See https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=996079046184329f3a9df1cd19c87da8 to see this in action. The fix is to take one lock and share it between the calls to num_presses, letting it drop at the end of the scope.
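The drop timing can be observed without risking a real deadlock by probing the lock with `try_write`, which never blocks. The sketch below uses `std::sync::RwLock` and a hypothetical `Input` struct as stand-ins; it is not the egui code touched by this commit.

```rust
use std::sync::RwLock;

/// Hypothetical stand-in for egui's input state.
struct Input {
    presses: [u32; 2],
}

/// The guard returned by `read()` is a temporary, so it stays alive until the
/// end of the whole `let` statement. The `try_write` probe evaluated later in
/// the same statement therefore still sees the lock as read-held.
fn temporary_outlives_its_use(lock: &RwLock<Input>) -> (u32, bool) {
    let (first, writer_could_enter) = (
        lock.read().unwrap().presses[0],
        lock.try_write().is_ok(),
    );
    (first, writer_could_enter)
}

/// One named guard in an explicit scope: every read goes through the same
/// guard, and it is released at the closing brace.
fn scoped_guard(lock: &RwLock<Input>) -> (u32, bool) {
    let total = {
        let input = lock.read().unwrap();
        input.presses[0] + input.presses[1]
    }; // `input` is dropped here
    let writer_could_enter = lock.try_write().is_ok();
    (total, writer_could_enter)
}
```

Calling both functions on `RwLock::new(Input { presses: [2, 3] })` shows the difference: the first reports that a writer could not enter mid-statement, the second that the lock is free again once the scope has closed.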

The issue here is related to that in 9673b8f in that the lock is not dropped when it is expected. Since the `RwLockReadGuard` produced by `ctx.input()` has a reference taken from it (one into `pointer`), the lock cannot be dropped until that reference is no longer valid, which is the end of the scope (in this case this function). The following `ctx.input()` then attempts to acquire a second lock on the `RwLock`, creating the same situation that was found in the referenced commit. This has been resolved by holding one lock on the input for the whole function.
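A small sketch of this second pattern, again using `std::sync::RwLock` with non-blocking `try_write` probes. The `Input` and `Pointer` structs and their fields are hypothetical stand-ins, not egui's real types.

```rust
use std::sync::RwLock;

/// Hypothetical stand-ins for egui's input and pointer state.
struct Pointer {
    pressed: bool,
}

struct Input {
    pointer: Pointer,
    scroll: u32,
}

/// Borrowing a field out of the temporary guard extends the guard's lifetime
/// to the end of the enclosing block, so the lock stays read-held for the
/// rest of the function.
fn borrowed_field_keeps_guard_alive(lock: &RwLock<Input>) -> bool {
    let pointer = &lock.read().unwrap().pointer;
    let writer_could_enter = lock.try_write().is_ok();
    let _pressed = pointer.pressed; // the borrow is still in use here
    writer_could_enter
}

/// Holding one named guard for the whole function makes the lock's extent
/// explicit, and every later read reuses it instead of locking again.
fn one_guard_reused(lock: &RwLock<Input>) -> (bool, u32) {
    let input = lock.read().unwrap();
    let pointer = &input.pointer;
    (pointer.pressed, input.scroll)
}
```

The first function returns `false` because the hidden guard is still alive when the probe runs. The second takes the same lock exactly once, which matches the approach described in this commit.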

emilk

Good catch!

egui/src/menu.rs

egui/src/widgets/slider.rs

emilk · 2022-03-21T15:44:11Z

Another solution is to use a standard Mutex instead of a RwLock for the ContextImpl.

emilk · 2022-03-21T15:44:27Z

This problem would be mitigated by

Put repaint callback and tex manager behind separate mutexes #1389

https://github.com/asny/three-d recently merged a PR adding `glow` support: asny/three-d#210

This means it is a prime candidate for embedding 3D painting inside an eframe app.

There are currently a few kinks that need to be figured out:

### Black screen

When reusing the same three_d context over time (as one should), we only get one frame of egui together with three_d, and then after that a black screen with just the three_d painting on top. I need to fix that before merging this PR.

### `Shape: Send + Sync`

`Shape` is `Send + Sync` and `three_d::Context` is not. This means we cannot store a three_d context and send it to the `Shape::Callback`. So we either need to recreate the three_d context each frame (obviously a bad idea), or access it through a `thread_local` hack. This PR adds both as examples, with a checkbox to switch.

We could consider making `Shape: !Send + !Sync`, but that would mean `egui::Context` could not be `Send+Sync` either (because the egui context stores shapes). This could actually be fine. `egui::Context` should only be used from a background thread for calling request_repaint and allocating textures. These could be made separate parts of the egui Context, so that one would do:

``` rust
let repaint_signal = egui_ctx.repaint_signal();
let tex_mngr = egui_ctx.tex_mngr();
std::thread::spawn(move || {
    // We can use repaint_signal and tex_mngr here,
    // but NOT `egui_ctx`.
});
```

Related:
* #1379
* #1380
* #1389

DusterTheFirst added 4 commits March 19, 2022 18:44

Reference this PR from comments in the code for future maintainers

11ed9f2

Add the change to the changelog

7a0ccfd

DusterTheFirst marked this pull request as ready for review March 19, 2022 18:59

emilk approved these changes Mar 20, 2022

emilk added 2 commits March 20, 2022 10:03

Use full link to PR

7d393c1

Use full link to PR

7e820d9

emilk merged commit 8bb381d into emilk:master Mar 20, 2022

DusterTheFirst deleted the remove-deadlock branch March 20, 2022 19:35

This was referenced Mar 21, 2022

Remove the single_threaded/multi_threaded feature flags #1390

Merged

Put repaint callback and tex manager behind separate mutexes #1389

Closed

This was referenced Mar 22, 2022

Add example using three-d #1398

Closed

Consider making egui::Context !Send+!Sync #1399

Open

MaximOsipenko mentioned this pull request Aug 27, 2022

Fix bit more code that could lead to a possible deadlock. #1968

Merged
