KeyError on self._in_use_connections.remove(connection) #1138

Closed
jkamashian opened this issue Feb 20, 2019 · 10 comments · Fixed by #1270

Comments

@jkamashian

Version: 2.10.5
Platform: Python 2.7, Ubuntu
This is a very rare occurrence on a very high-traffic server, but I occasionally receive a KeyError from here:
self._in_use_connections.remove(connection)
https://github.com/andymccurdy/redis-py/blob/2.10.5/redis/connection.py#L913
I haven't been able to reproduce this in testing or root out the cause.

Theories:
self._available_connections is a list, so the same connection can end up appended to it more than once. When that duplicate is handed out again, self._in_use_connections.add(connection)
https://github.com/andymccurdy/redis-py/blob/2.10.5/redis/connection.py#L898
raises no error for the repeated connection, because _in_use_connections is a set and a duplicate add() is a no-op. But on release, the second remove() on that set does raise, which is where the KeyError comes from.
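
A minimal sketch of that list/set asymmetry (plain Python standing in for the pool's internals; the names mirror redis-py's attributes but this is not the library code):

available = []          # stand-in for ConnectionPool._available_connections (a list)
in_use = set()          # stand-in for ConnectionPool._in_use_connections (a set)
conn = object()         # stand-in for a Connection instance

available.append(conn)
available.append(conn)  # a double release: no error, the list now holds it twice

in_use.add(conn)        # first get_connection() with this connection
in_use.add(conn)        # handed out again from the duplicate entry: still no error

in_use.remove(conn)     # first release(): fine
in_use.remove(conn)     # second release(): KeyError, as in the traceback above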

@andymccurdy
Contributor

Can you elaborate on your setup a little more? Are you using redis-py in a forked or multiprocess or threaded environment?

@jkamashian
Author

This is a Flask app behind Nginx, so I think threaded is the correct answer here.

@Yanght115

Yanght115 commented May 29, 2019

I have the same problem with python 2.7 + Django + uwsgi.

import redis
from django.conf import settings  # project settings holding the redis host/port config

class RedisManager(Singleton):
    redis_conf = settings.BASE_PLUGIN_REDIS_CONFIG
    # Class-level pool: created once at import time and shared by every
    # RedisManager instance (and, under uwsgi, by every worker forked later).
    info_connection_pool = redis.ConnectionPool(
        host=redis_conf['info']['host'], port=redis_conf['info']['port'],
        decode_responses=True, db=0)

    @property
    def info_client(self):
        return redis.Redis(connection_pool=self.info_connection_pool)

Sometimes I get the following error:

  File "/media/disk1/fordata/web_server/project/paas-pcs-platform/1.0.796-preonline/bem/base_plugin/util.py", line 62, in __init__
    self._init_redis_connection_pool()
  File "/media/disk1/fordata/web_server/project/paas-pcs-platform/1.0.796-preonline/bem/base_plugin/util.py", line 67, in _init_redis_connection_pool
    assert redis.Redis(connection_pool=self.status_connection_pool).ping(
  File "/data/web_server/project/paas-pcs-platform/venv/lib/python2.7/site-packages/redis/client.py", line 1037, in ping
    return self.execute_command('PING')
  File "/data/web_server/project/paas-pcs-platform/venv/lib/python2.7/site-packages/redis/client.py", line 784, in execute_command
    pool.release(connection)
  File "/data/web_server/project/paas-pcs-platform/venv/lib/python2.7/site-packages/redis/connection.py", line 1032, in release
    self._in_use_connections.remove(connection)
KeyError: Connection<host=xxxxx,port=xxxxx,db=0>

I guess the problem is caused by the shared connection pool being used across the uwsgi workers?
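
For context, a stripped-down sketch of that kind of pre-fork setup (the fork here stands in for uwsgi spawning workers; host/port are placeholders):

import os
import redis

# The pool is created at import time, i.e. in the master process, before any
# worker is forked.
pool = redis.ConnectionPool(host='localhost', port=6379, db=0)

if os.fork() == 0:
    # Child (worker): it inherits the same pool object. The pool remembers the
    # pid it was created under, so on first use _checkpid() sees that
    # os.getpid() has changed and calls reset(). If several threads in the
    # worker hit that reset at once, the race discussed below can appear.
    redis.Redis(connection_pool=pool).ping()
    os._exit(0)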

@kmerenkov
Contributor

We get the same issue. Forked environment, all connections are made after fork.

@andymccurdy
Contributor

@kmerenkov What version of redis-py are you using? 3.2.0+ fixes a lot of the problems with connections in forked processes.

@kmerenkov
Contributor

What version of redis-py are you using?

3.2.1

@Yanght115

    def reset(self):
        self.pid = os.getpid()
        self._created_connections = 0
        self._available_connections = []
        self._in_use_connections = set()
        self._check_lock = threading.Lock()

    def _checkpid(self):
        if self.pid != os.getpid():
            with self._check_lock:
                if self.pid == os.getpid():
                    # another thread already did the work while we waited
                    # on the lock.
                    return
                self.reset()

    def get_connection(self, command_name, *keys, **options):
        "Get a connection from the pool"
        self._checkpid()
        try:
            connection = self._available_connections.pop()
        except IndexError:
            connection = self.make_connection()

I guess there's one case: with threads A and B, A gets the lock and calls reset(). If A has already executed self.pid = os.getpid() but not yet self._in_use_connections = set(), and B enters _checkpid() at that moment, B fails the condition if self.pid != os.getpid(), so B returns and gets a connection from the old _available_connections.
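
A toy reproduction of that interleaving (not redis-py itself; the sleep just widens the window between the pid assignment and the container swap so the demo is reliable enough to run):

import os
import threading
import time

class ToyPool:
    def __init__(self):
        self.pid = -1                                   # pretend we were created before a fork
        self._available_connections = ['stale connection']
        self._check_lock = threading.Lock()

    def reset(self):                                    # same ordering as the code above
        self.pid = os.getpid()
        time.sleep(0.2)                                 # the race window, made visible
        self._available_connections = []

    def _checkpid(self):
        if self.pid != os.getpid():
            with self._check_lock:
                if self.pid == os.getpid():
                    return                              # "another thread already did the work"
                self.reset()

    def get_connection(self):
        self._checkpid()
        try:
            return self._available_connections.pop()
        except IndexError:
            return 'fresh connection'

pool = ToyPool()

a = threading.Thread(target=pool.get_connection)        # thread A: triggers the reset
a.start()
time.sleep(0.05)                                         # let A get past "self.pid = os.getpid()"
print(pool.get_connection())                             # thread B's view: prints 'stale connection'
a.join()

# In the real pool, B would also add that stale connection to the old
# _in_use_connections set, which reset() then throws away; the later
# release() calls remove() on the new set and raises the KeyError.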

@gmbnomis
Contributor

I have the same issue (on 3.3.11).

I have no idea if this is the correct fix, but based on @yht804421715's observation, I moved the update of self.pid in reset() to the end:

    def reset(self):
        self._created_connections = 0
        self._available_connections = []
        self._in_use_connections = set()
        self._check_lock = threading.Lock()
        self.pid = os.getpid()

Since then, I can't reproduce it anymore.

@mdellweg

@gmbnomis Without fully understanding what is happening here, I think you should not move the pid update behind the discarding of the lock. From the comment I read that by setting the pid, the thread communicates that it has done the work. So moving it behind the clearing of the containers is OK, imho.
Anyway, I don't understand why the old lock should be discarded at all. Releasing it should be enough (and safer).
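
A sketch of what that suggestion would look like (just to illustrate the comment, not the actual patch; the lock would then be created once in __init__ rather than in reset()):

    def reset(self):
        # Reuse the existing self._check_lock instead of replacing it, so a
        # thread still blocked on it in _checkpid() is not left waiting on an
        # orphaned lock object; only the bookkeeping and the pid change.
        self._created_connections = 0
        self._available_connections = []
        self._in_use_connections = set()
        self.pid = os.getpid()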

mdellweg pushed a commit to mdellweg/redis-py that referenced this issue Jan 23, 2020
This fixes a race condition where a task can be handed an old connection, instead of waiting, while another task has already started the reset.

Thanks for the analysis done by @yht804421715 and @gmbnomis.

fixes redis#1138
@andymccurdy
Contributor

andymccurdy commented Jan 27, 2020

I'm not a fan of simply moving the .pid assignment below the other statements. It might work better than the released version, but it's still not thread-safe.

I just created a new PR #1270 that implements what I believe is a fully thread-safe pool. Can anyone please give it a spin and see if it fixes your issue? Thanks!
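
For anyone curious what "fully thread-safe" could look like, here is a rough sketch of one approach (an assumption on my part, not necessarily what #1270 implements): re-check the pid while holding the lock, and publish the new pid only after the containers have been swapped, so a thread that skips the locked path can never observe a half-finished reset. Connections already checked out when the reset happens still need separate handling.

    def reset(self):
        self._created_connections = 0
        self._available_connections = []
        self._in_use_connections = set()
        self.pid = os.getpid()               # published last

    def _checkpid(self):
        if self.pid != os.getpid():
            with self._check_lock:
                if self.pid != os.getpid():  # re-check under the lock
                    self.reset()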

gmbnomis added a commit to gmbnomis/redis-py that referenced this issue Jan 28, 2020
BYK added a commit to getsentry/sentry that referenced this issue Mar 4, 2021
Came across this error on the forums: https://forum.sentry.io/t/worker-error-unable-to-incr-internal-metric/13098?u=byk and then found redis/redis-py#1138, which got fixed in version 3.4.0. This patch upgrades it to `3.4.1`, which has a fix for a regression introduced in `3.4.0`. No breaking changes.
BYK added a commit to getsentry/sentry that referenced this issue Mar 10, 2021