
AttributeError: 'NoneType' object has no attribute 'close' #1345

Closed
rolette opened this issue May 15, 2020 · 12 comments



rolette commented May 15, 2020

Unfortunately, the fix for #732 doesn't correctly handle master node failover for multi-threaded code.

Traceback (most recent call last):
  # snip
  File "/usr/local/lib/python3.5/site-packages/redis/client.py", line 1870, in blpop
    return self.execute_command('BLPOP', *keys)
  File "/usr/local/lib/python3.5/site-packages/redis/client.py", line 884, in execute_command
    conn.send_command(*args)
  File "/usr/local/lib/python3.5/site-packages/redis/connection.py", line 721, in send_command
    check_health=kwargs.get('check_health', True))
  File "/usr/local/lib/python3.5/site-packages/redis/connection.py", line 692, in send_packed_command
    self.connect()
  File "/usr/local/lib/python3.5/site-packages/redis/sentinel.py", line 44, in connect
    self.connect_to(self.connection_pool.get_master_address())
  File "/usr/local/lib/python3.5/site-packages/redis/sentinel.py", line 107, in get_master_address
    self.disconnect()
  File "/usr/local/lib/python3.5/site-packages/redis/connection.py", line 1241, in disconnect
    connection.disconnect()
  File "/usr/local/lib/python3.5/site-packages/redis/connection.py", line 669, in disconnect
    self._sock.close()
AttributeError: 'NoneType' object has no attribute 'close'

Scenario is this:

  • Multi-threaded process running, with each thread using a shared SentinelConnectionPool to get its own connection to the Redis master
  • Redis master goes down, Sentinel does its job and elects a new master
  • Subsequent redis-py call ends up calling SentinelConnectionPool.get_master_address()
  • get_master_address() notices that there is a new master and calls self.disconnect() to flush the pool
  • ConnectionPool.disconnect() calls disconnect() on each of the connections in the pool.
  • Connection.disconnect() sees a matching PID and calls shutdown() on the socket, then sets it back to None
  • "Fun" ensues...

There is thread-locking to protect the pool's management of its members, but ConnectionPool.disconnect() is ripping sockets out from under other threads in the middle of their operations.

The actual stack trace and error you'll get will vary depending on timing.
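To make the race concrete, here is a minimal, Redis-free sketch of the check-then-act pattern the traceback shows. FakeSock and FakeConnection are illustrative stand-ins, not redis-py classes, and the window is narrow enough that a given run may not trip it:

```python
# Sketch: Connection.disconnect()-style code checks self._sock, but another
# thread can clear it between the check and the close, producing
# "AttributeError: 'NoneType' object has no attribute 'close'".
import threading

class FakeSock:
    def close(self):
        pass

class FakeConnection:
    def __init__(self):
        self._sock = FakeSock()

    def disconnect(self):
        if self._sock is None:   # check ...
            return
        self._sock.close()       # ... then act: _sock may be None by now
        self._sock = None

conn = FakeConnection()
errors = []

def flush():
    # Each thread repeatedly "reconnects" and "disconnects" the shared
    # connection, mimicking pool.disconnect() racing with connection users.
    try:
        for _ in range(200_000):
            conn._sock = FakeSock()
            conn.disconnect()
    except AttributeError as exc:
        errors.append(exc)

threads = [threading.Thread(target=flush) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(errors or "race not hit this run")
```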

To fix it in my product code, I'm re-integrating the deferred disconnect using a generation id from PR #784.

Version: redis-py v3.4.1, redis/sentinel v3.2.13
Platform: Python 3.5.2 on Linux


andymccurdy commented May 19, 2020

I've been digging into this for a bit this afternoon. I didn't write redis-py's sentinel implementation so this is a good exercise for me to become more familiar with what's happening.

First, a question: Are you calling Sentinel.master_for() each time you need a connection to the master? Or do you call it once when your app starts and then use that object through the life of the application?

Here are some observations after reading the code.

  1. If the Sentinels elect a new master, it seems likely that the clients connected to the old master are going to have various connection problems. Even if they are still connected to the old master (e.g., a network partition occurred where clients can still see the old master but the sentinels cannot), allowing them to write to the old master would likely result in data loss. Given that, it seems like calling SentinelConnectionPool.disconnect() might be the correct thing to do, though perhaps we could do this more gracefully.

  2. However, it seems like the SentinelConnectionPool doesn't work well when a new master is elected. SentinelConnectionPool.master_address is only assigned in SentinelConnectionPool.reset() and SentinelConnectionPool.get_master_address(). reset() only gets called from SentinelConnectionPool.__init__(), and get_master_address() only assigns a value to master_address if the current value is None. So after a new master is elected, every subsequent call to SentinelConnectionPool.get_master_address() is going to fail the master_address != self.master_address check and subsequently disconnect all the connections in the pool. This seems terrible, but easily fixable by simply assigning the new master_address in this step (sketched below).

If we fix item 2, there will still be a hiccup when the new master is elected, but then all the clients should get connected to the new master and work correctly.
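For reference, a rough sketch of the shape of that fix, assuming a SentinelConnectionPool-like class; the attribute and method names mirror redis-py, but this illustrates the described change rather than quoting the actual source:

```python
# Sketch of item 2's fix: remember the newly elected master before flushing,
# so only the first call after a failover disconnects the pool.
def get_master_address(self):
    master_address = self.sentinel_manager.discover_master(self.service_name)
    if self.is_master:
        if self.master_address is None:
            self.master_address = master_address
        elif master_address != self.master_address:
            self.master_address = master_address  # the missing assignment
            self.disconnect()                     # flush connections to the old master
    return master_address
```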


rolette commented May 20, 2020

Hi Andy,

First, a question: Are you calling Sentinel.master_for() each time you need a connection to the master? Or do you call it once when your app starts and then use that object through the life of the application?

I'm just calling redis.hget() (for example), where the Redis instance was instantiated using a SentinelConnectionPool. The SentinelManagedConnection ends up doing a get_master_address() in its connect() method. Just stock redis-py behavior with sentinel.
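For context, this is the stock pattern in question (the host/port, service name, and key names here are placeholders):

```python
from redis.sentinel import Sentinel

# master_for() returns a Redis client backed by a SentinelConnectionPool;
# each command resolves the current master via get_master_address().
sentinel = Sentinel([('localhost', 26379)], socket_timeout=0.5)
master = sentinel.master_for('mymaster', socket_timeout=0.5)
master.hget('some-hash', 'some-field')
```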

  1. If the Sentinels elect a new master, it seems likely that the clients connected to the old master are going to have various connection problems.

Yep. Clients have to deal with various errors here (mostly timeout and connection errors). However, since execute_command() ends up calling get_master_address() each time instead of caching it, the connection automatically disconnects from the old master and connects to the new one. Occasionally you have to deal with a MasterNotFoundError, but in our product code, we have a wrapper that automatically retries the operation in that case.
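The wrapper itself isn't part of redis-py; a sketch of the general idea, with the retry count and delay chosen arbitrarily, might look like this:

```python
import time
from functools import wraps

from redis.sentinel import MasterNotFoundError

def retry_on_failover(retries=3, delay=0.5):
    """Retry a Redis operation while Sentinel is still electing a master."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(retries):
                try:
                    return fn(*args, **kwargs)
                except MasterNotFoundError:
                    if attempt == retries - 1:
                        raise
                    time.sleep(delay)  # give Sentinel time to finish failover
        return wrapper
    return decorator

@retry_on_failover()
def get_field(client, key, field):
    return client.hget(key, field)
```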

Even if they are still connected to the old master (e.g., a network partition occurred where clients can still see the old master but the sentinels cannot), allowing them to write to the old master would likely result in data loss. Given that, it seems like calling SentinelConnectionPool.disconnect() might be the correct thing to do, though perhaps we could do this more gracefully.

I haven't found a good way to safely disconnect in-use connections in the pool other than using a generation id and reaping them when they are released back to the pool. Clients expect to deal with normal network and service outage errors, but they shouldn't have to be dealing with TypeError and AttributeError exceptions on a connection they own.

If it's helpful, I can post my patch that I'm using to fix the disconnect() problem. It's not materially different than the PR I submitted previously other than leaving out the fixes for forked processes, which your fixes seem to handle fine.
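Not the attached patch, but the general generation-id idea described above could be sketched roughly like this (the class and attribute names are illustrative, not redis-py's):

```python
import threading

class GenerationPool:
    """Toy pool: stale in-use connections are reaped when released, not yanked."""

    def __init__(self, connection_factory):
        self._lock = threading.Lock()
        self._generation = 0
        self._available = []        # idle connections
        self._in_use = set()        # connections currently leased out
        self._connection_factory = connection_factory

    def get_connection(self):
        with self._lock:
            conn = self._available.pop() if self._available else self._connection_factory()
            conn.generation = self._generation   # stamp at checkout
            self._in_use.add(conn)
            return conn

    def release(self, conn):
        with self._lock:
            self._in_use.discard(conn)
            if conn.generation != self._generation:
                conn.disconnect()                # stale: reap instead of reusing
            else:
                self._available.append(conn)

    def disconnect(self):
        # Flush idle connections now; in-use ones are reaped in release().
        with self._lock:
            self._generation += 1
            while self._available:
                self._available.pop().disconnect()
```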

  2. However, it seems like the SentinelConnectionPool doesn't work well when a new master is elected...

Ah, I forgot PR #847 hasn't been merged into master yet. I have that merged into my tree. It fixes that problem :)

@andymccurdy

Merged #847 as that clearly needed to be fixed. If you want to post what you have for the generation ID stuff I'll take a look.


rolette commented May 20, 2020

Great. Patch is attached, very simple.
redis_py_1345.patch.txt

@andymccurdy

@rolette I pushed the https://github.com/andymccurdy/redis-py/tree/sentinel-1345 branch that I believe fixes this. It uses a slightly different approach than your solution, but I believe it results in the same basic behavior of throwing a stale connection out when it's returned to the pool. It also fixes a bug where the base ConnectionPool class wasn't decrementing its _created_connections counter when it needs to throw connections away.


rolette commented May 30, 2020

@andymccurdy Reviewed your changes and added my $0.02.

Thanks!

@andymccurdy

@rolette I added an optional inuse_connections=True arg to ConnectionPool.disconnect. By default it works identically to before. Passing inuse_connections=False will disconnect only the idle connections that aren't leased out by the pool.

When the SentinelConnectionPool detects the master address has changed, it will call self.disconnect(inuse_connections=False) to disconnect all the idle connections.
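A rough sketch of that shape, assuming the pool tracks connections in _available_connections and _in_use_connections as redis-py's ConnectionPool does; this mirrors the described behavior rather than quoting the actual diff:

```python
from itertools import chain

def disconnect(self, inuse_connections=True):
    """Disconnect pooled connections; optionally leave leased-out ones alone."""
    if inuse_connections:
        connections = chain(self._available_connections, self._in_use_connections)
    else:
        connections = self._available_connections
    for connection in connections:
        connection.disconnect()
```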

Have I missed anything?


rolette commented Jun 1, 2020

@andymccurdy I'd make inuse_connections default to False since the old behavior would result in the same errors that you are fixing here. If they really want to force the disconnect and deal with the errors, then make that the exception case.

Have I missed anything?

I think the only feedback you haven't addressed is my comment about restoring the self.disconnect() in get_master_address() when a new master is detected. From my experience, it would be better to flush "available" connections in the pool right then rather than having to hit them individually as they are requested down the road.

@andymccurdy

@rolette I made inuse_connections=True the default to preserve backwards compatibility. I'm happy to change that behavior in the next major version.

I am calling self.disconnect(inuse_connections=False) in get_master_address() here: 811cdd0#diff-ea3ce4482d1ecb8a23d1cce2ec6e9b48R113


rolette commented Jun 1, 2020

@andymccurdy Ok, that looks good. Appreciate the fixes!

@andymccurdy

Great, I'll merge this and release 3.5.3.

@andymccurdy

Just released 3.5.3 with this fix.
