Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deadlock: acquire multiple locks in the same order in clientconn #3047

Closed
skinowski opened this issue Sep 27, 2019 · 1 comment · Fixed by #3051
Closed

deadlock: acquire multiple locks in the same order in clientconn #3047

skinowski opened this issue Sep 27, 2019 · 1 comment · Fixed by #3051

Comments

@skinowski
Copy link

skinowski commented Sep 27, 2019

What version of gRPC are you using?

v1.24.0

What version of Go are you using (go version)?

go version go1.13 linux/amd64

What operating system (Linux, Windows, …) and version?

arch linux - kernel 5.3.1

What did you do?

We are using experimental ResetConnectBackoff() and we kick off gRPC requests
using FailFast(false) for a pool of connections. In some connections, ResetConnectBackoff() blocks exhibiting deadlock behavior. Quick look at the gRPC code reveals multiple mutex locking
order that is not recommended.

For example:

func (cc *ClientConn) ResetConnectBackoff() {
	cc.mu.Lock()
	defer cc.mu.Unlock()
	for ac := range cc.conns {
		ac.resetConnectBackoff()
                // where ac.resetConnectBackoff() calls ac.mu.Lock() for each ac.
	}
}
func (ac *addrConn) resetTransport() {
	for i := 0; ; i++ {

		ac.mu.Lock()

               // deleted code for clarity

		ac.updateConnectivityState(connectivity.Connecting)
                // ac.updateConnectivityState() calls, ac.cc.handleSubConnStateChange(), which
                // calls ac.cc.mu.Lock()

		ac.mu.Unlock()

       // deleted code for clarity

The pattern above exhibits a text book deadlock pattern as follows:

// ResetConnectBackoff() behavior:
m1.lock()
m2.lock()
m2.unlock()
m1.unlock()

// resetTransport() behavior:
m2.lock()
m1.lock()
m1.unlock()
m2.unlock()

What did you expect to see?

multiple locks must be acquired in the same order to avoid deadlock.

What did you see instead?

deadlock

@dfawley
Copy link
Member

dfawley commented Sep 27, 2019

Thanks for filing this issue! It should be fixed by #3051.

@lock lock bot locked as resolved and limited conversation to collaborators Mar 27, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants