
deadlocks in Channel.call(...) #253

Open · jxsl13 opened this issue Mar 12, 2024 · 2 comments
Labels: bug (Something isn't working)

jxsl13 commented Mar 12, 2024

Describe the bug

Hi,
I'm (still) developing a wrapper for this library.
I'm trying to properly implement flow control and context handling, and to test as much as possible, simulating connection loss and other failure modes.

For one of my tests I start a RabbitMQ instance that is out of memory on startup, which triggers the connection-blocked state.

This state seems to cause deadlocks in this library, making the following select statement block "forever":

amqp091-go/channel.go, lines 181 to 205 at a2fcd5b:

```go
select {
case e, ok := <-ch.errors:
	if ok {
		return e
	}
	return ErrClosed

case msg := <-ch.rpc:
	if msg != nil {
		for _, try := range res {
			if reflect.TypeOf(msg) == reflect.TypeOf(try) {
				// *res = *msg
				vres := reflect.ValueOf(try).Elem()
				vmsg := reflect.ValueOf(msg).Elem()
				vres.Set(vmsg)
				return nil
			}
		}
		return ErrCommandInvalid
	}
	// RPC channel has been closed without an error, likely due to a hard
	// error on the Connection. This indicates we have already been
	// shutdown and if were waiting, will have returned from the errors chan.
	return ErrClosed
}
```

I have seen Channel.Close() and Channel.QueueUnbind(...) block "forever".
The blocking of Channel.QueueUnbind(...) is reproduced in the test below.

Might be related to #225 (the "turn off the internet" scenario might be reproducible with toxiproxy, the tool I use for my tests).
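
My reading (not confirmed by the maintainers): while the memory alarm is active, the broker sends connection.blocked and eventually stops reading from the connection, so the reply to any synchronous method never arrives and the `<-ch.rpc` case above never fires. A minimal sketch of how a caller can at least observe the blocked state via Connection.NotifyBlocked (connection URL is made up):

```go
package main

import (
	"log"

	amqp "github.com/rabbitmq/amqp091-go"
)

func main() {
	conn, err := amqp.Dial("amqp://guest:guest@localhost:5672/")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// The broker sends connection.blocked / connection.unblocked when a
	// resource alarm starts or clears; while blocked, synchronous channel
	// methods may never receive a reply and therefore never return.
	blockings := conn.NotifyBlocked(make(chan amqp.Blocking, 1))
	go func() {
		for b := range blockings {
			if b.Active {
				log.Printf("connection blocked: %s", b.Reason)
			} else {
				log.Println("connection unblocked")
			}
		}
	}()

	// ... open a channel and issue operations, ideally only while unblocked.
}
```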

Reproduction steps

Here is the log output of a test that reproduces the problem:

```
level=info, msg=creating connection,
level=info, msg=registering flow control notification channel,
level=info, msg=creating channel,
level=info, msg=registering error notification channel,
level=info, msg=registering confirms notification channel,
level=info, msg=registering flow control notification channel,
level=info, msg=registering returned message notification channel,
level=info, msg=declaring exchange,
level=info, msg=declaring queue,
level=info, msg=binding queue,
level=info, msg=publishing message,
level=info, msg=unbinding queue,  (blocks here forever)
```
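
The test itself is not reproduced here, but the logged sequence corresponds roughly to the following sketch (not my actual test; the exchange/queue names are placeholders, and the broker is assumed to already be under a memory alarm so the connection gets blocked):

```go
package main

import (
	"context"
	"log"
	"time"

	amqp "github.com/rabbitmq/amqp091-go"
)

func main() {
	log.Println("creating connection")
	conn, err := amqp.Dial("amqp://guest:guest@localhost:5672/")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	log.Println("creating channel")
	ch, err := conn.Channel()
	if err != nil {
		log.Fatal(err)
	}

	log.Println("declaring exchange")
	if err := ch.ExchangeDeclare("test-exchange", "topic", true, false, false, false, nil); err != nil {
		log.Fatal(err)
	}

	log.Println("declaring queue")
	if _, err := ch.QueueDeclare("test-queue", true, false, false, false, nil); err != nil {
		log.Fatal(err)
	}

	log.Println("binding queue")
	if err := ch.QueueBind("test-queue", "#", "test-exchange", false, nil); err != nil {
		log.Fatal(err)
	}

	log.Println("publishing message")
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	// Publishing is asynchronous and returns without error, but with the
	// memory alarm active the broker stops reading from this connection.
	if err := ch.PublishWithContext(ctx, "test-exchange", "key", false, false, amqp.Publishing{Body: []byte("hello")}); err != nil {
		log.Fatal(err)
	}

	log.Println("unbinding queue")
	// The queue.unbind-ok reply never arrives, so this call blocks forever
	// in the Channel.call select shown above.
	if err := ch.QueueUnbind("test-queue", "#", "test-exchange", nil); err != nil {
		log.Fatal(err)
	}
	log.Println("unbound") // never reached while the alarm is active
}
```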

Expected behavior

QueueBind worked, so I guess QueueUnbind should also work.
I think this behavior can be triggered for nearly every synchronous method of Channel.

Additional context

Probably not relevant, but just in case:
darwin/arm64
macOS 14.3.1

@jxsl13 jxsl13 added the bug Something isn't working label Mar 12, 2024
@lukebakken lukebakken self-assigned this Mar 12, 2024
lukebakken (Contributor) commented:

Thanks for the report and the steps to reproduce this issue. I can reproduce it; as you noted, it requires a blocked RabbitMQ.

@jxsl13 jxsl13 changed the title from "random deadlocks in Channel.call(...)" to "deadlocks in Channel.call(...)" Mar 12, 2024

amotzte commented Apr 2, 2024

I'm pretty sure I hit something similar on QueueBind: calling QueueBind with noWait=false gets stuck forever. Would it make sense to add a timeout for this operation?
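
For what it's worth, a caller-side workaround could wrap the call and give up after a deadline. This is a hypothetical helper, not part of amqp091-go, and it leaks the goroutine that stays stuck in QueueBind until the connection is torn down:

```go
package amqputil

import (
	"errors"
	"time"

	amqp "github.com/rabbitmq/amqp091-go"
)

// QueueBindWithTimeout runs the blocking QueueBind in a goroutine and gives
// up after the given deadline. The stuck goroutine is only released once the
// channel or connection is closed.
func QueueBindWithTimeout(ch *amqp.Channel, timeout time.Duration, name, key, exchange string) error {
	done := make(chan error, 1)
	go func() {
		done <- ch.QueueBind(name, key, exchange, false, nil)
	}()
	select {
	case err := <-done:
		return err
	case <-time.After(timeout):
		return errors.New("queue bind timed out; the connection may be blocked by the broker")
	}
}
```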
