Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AsyncSemaphore fix for deadlock in narrow cancellation race condition #1001

Merged
merged 1 commit into from Mar 25, 2022

Conversation

AArnott
Copy link
Member

@AArnott AArnott commented Mar 24, 2022

This avoids calling CancellationTokenRegistration.Dispose while holding a private lock in EnterAsync. There is one remaining case where we may hold a lock while calling it, but that's in the inlined case of EnterAsync and CTR will have its default value, and thus could not have anything to block on.

The problematic case that we're fixing is when the token is canceled while EnterAsync is executing between where the CTR is assigned and where we called info.Cleanup(). In such a case, we would deadlock because Cleanup would wait for handlers to complete but the handler is waiting for the lock.

There is no unit test change as part of this because the race condition is so very narrow that we couldn't write a unit test that would reliably hit it. However, I did temporarily make a code change that would throw when we held a lock in the evil case and our stress test did hit it before the fix and did not hit it after the fix, so I feel with that manual testing that we likely resolved the issue.

Fixes #1000

This avoids calling `CancellationTokenRegistration.Dispose` while holding a private lock in EnterAsync. There is one remaining case where we may hold a lock while calling it, but that's in the inlined case of EnterAsync and CTR will have its default value, and thus could not have anything to block on.

The problematic case that we're fixing is when the token is canceled while EnterAsync is executing between where the CTR is assigned and where we called `info.Cleanup()`. In such a case, we would deadlock because Cleanup would wait for handlers to complete but the handler is waiting for the lock.

Fixes microsoft#1000
@AArnott AArnott added this to the v17.2 milestone Mar 24, 2022
Copy link
Member

@jasonmalinowski jasonmalinowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable to me, given this is the standard "dispose outside of the lock" pattern. I won't pretend to understand the precise data structures involved here but nothing looks amiss.

Copy link
Member

@lifengl lifengl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the change looks good to me.

@AArnott AArnott merged commit a66c986 into microsoft:main Mar 25, 2022
@AArnott AArnott deleted the fix1000 branch March 25, 2022 13:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

AsyncSemaphore deadlock: EnterAsync holds lock while blocking on cancellation handler which needs lock
4 participants