Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] RLock called twice in GetClosestN and cause the deadlock #23

Closed
namco1992 opened this issue Nov 9, 2022 · 1 comment
Closed
Assignees
Labels
bug Something isn't working

Comments

@namco1992
Copy link

We recently encountered a deadlock with the library. After some digging, we found out that it tries to acquire the read lock in getClosetN, and then in GetPartitionOwner.

consistent/consistent.go

Lines 302 to 312 in f0660af

func (c *Consistent) getClosestN(partID, count int) ([]Member, error) {
c.mu.RLock()
defer c.mu.RUnlock()
res := []Member{}
if count > len(c.members) {
return res, ErrInsufficientMemberCount
}
var ownerKey uint64
owner := c.GetPartitionOwner(partID)

consistent/consistent.go

Lines 284 to 287 in f0660af

func (c *Consistent) GetPartitionOwner(partID int) Member {
c.mu.RLock()
defer c.mu.RUnlock()

It's not safe to call RLock and RUnlock twice in one goroutine while having another goroutine call Lock simultaneously. The discussion here and here explain it better than me. In addition, to quote from the RWMutex documentation:

If a goroutine holds a RWMutex for reading and another goroutine might call Lock, no goroutine should expect to be able to acquire a read lock until the initial read lock is released. In particular, this prohibits recursive read locking.

You can also go to this Go playground and replicate it yourself: https://go.dev/play/p/pEhnPbsnDG

The probability is low since it requires a Lock happens immediately after the first RLock, but it would happen when you have a large amount of reads/writes and run for a sufficiently long time.

@buraksezer buraksezer self-assigned this Nov 9, 2022
@buraksezer buraksezer added the bug Something isn't working label Nov 9, 2022
buraksezer added a commit that referenced this issue Nov 9, 2022
fix: eliminate the deadlock issue mentioned in #23
@buraksezer
Copy link
Owner

Hello,

Thank you for reporting this. Please upgrade to v0.10.0. If you have any further problems, please don't hesitate to open an issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants